Source author record

David Gleich

David Gleich appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Social and Information Networks Data Structures and Algorithms physics.soc-ph Discrete Mathematics Machine Learning math.NA

Catalog footprint

What is connected

5works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Dominant Z-Eigenpairs of Tensor Kronecker Products are Decoupled and Applications to Higher-Order Graph Matching

Tensor Kronecker products, the natural generalization of the matrix Kronecker product, are independently emerging in multiple research communities. Like their matrix counterpart, the tensor generalization gives structure for implicit multiplication and factorization theorems. We present a theorem that decouples the dominant eigenvectors of tensor Kronecker products, which is a rare generalization from matrix theory to tensor eigenvectors. This theorem implies low rank structure ought to be present in the iterates of tensor power methods on Kronecker products. We investigate low rank structure in the network alignment algorithm TAME, a power method heuristic. Using the low rank structure directly or via a new heuristic embedding approach, we produce new algorithms which are faster while improving or maintaining accuracy, and scale to problems that cannot be realistically handled with existing techniques.

preprint2016arXiv

An optimization approach to locally-biased graph algorithms

Locally-biased graph algorithms are algorithms that attempt to find local or small-scale structure in a large data graph. In some cases, this can be accomplished by adding some sort of locality constraint and calling a traditional graph algorithm; but more interesting are locally-biased graph algorithms that compute answers by running a procedure that does not even look at most of the input graph. This corresponds more closely to what practitioners from various data science domains do, but it does not correspond well with the way that algorithmic and statistical theory is typically formulated. Recent work from several research communities has focused on developing locally-biased graph algorithms that come with strong complementary algorithmic and statistical theory and that are useful in practice in downstream data science applications. We provide a review and overview of this work, highlighting commonalities between seemingly-different approaches, and highlighting promising directions for future work.

preprint2015arXiv

Computing active subspaces with Monte Carlo

Active subspaces can effectively reduce the dimension of high-dimensional parameter studies enabling otherwise infeasible experiments with expensive simulations. The key components of active subspace methods are the eigenvectors of a symmetric, positive semidefinite matrix whose elements are the average products of partial derivatives of the simulation's input/output map. We study a Monte Carlo method for approximating the eigenpairs of this matrix. We offer both theoretical results based on recent non-asymptotic random matrix theory and a practical approach based on the bootstrap. We extend the analysis to the case when the gradients are approximated, for example, with finite differences. Our goal is to provide guidance for two questions that arise in active subspaces: (i) How many gradient samples does one need to accurately approximate the eigenvalues and subspaces? (ii) What can be said about the accuracy of the estimated subspace, both theoretically and practically? We test the approach on both simple quadratic functions where the active subspace is known and a parameterized PDE with 100 variables characterizing the coefficients of the differential operator.

preprint2014arXiv

Using Triangles to Improve Community Detection in Directed Networks

In a graph, a community may be loosely defined as a group of nodes that are more closely connected to one another than to the rest of the graph. While there are a variety of metrics that can be used to specify the quality of a given community, one common theme is that flows tend to stay within communities. Hence, we expect cycles to play an important role in community detection. For undirected graphs, the importance of triangles -- an undirected 3-cycle -- has been known for a long time and can be used to improve community detection. In directed graphs, the situation is more nuanced. The smallest cycle is simply two nodes with a reciprocal connection, and using information about reciprocation has proven to improve community detection. Our new idea is based on the four types of directed triangles that contain cycles. To identify communities in directed networks, then, we propose an undirected edge-weighting scheme based on the type of the directed triangles in which edges are involved. We also propose a new metric on quality of the communities that is based on the number of 3-cycles that are split across communities. To demonstrate the impact of our new weighting, we use the standard METIS graph partitioning tool to determine communities and show experimentally that the resulting communities result in fewer 3-cycles being cut. The magnitude of the effect varies between a 10 and 50% reduction, and we also find evidence that this weighting scheme improves a task where plausible ground-truth communities are known.

preprint2011arXiv

Neighborhoods are good communities

The communities of a social network are sets of vertices with more connections inside the set than outside. We theoretically demonstrate that two commonly observed properties of social networks, heavy-tailed degree distributions and large clustering coefficients, imply the existence of vertex neighborhoods (also known as egonets) that are themselves good communities. We evaluate these neighborhood communities on a range of graphs. What we find is that the neighborhood communities often exhibit conductance scores that are as good as the Fiedler cut. Also, the conductance of neighborhood communities shows similar behavior as the network community profile computed with a personalized PageRank community detection method. The latter requires sweeping over a great many starting vertices, which can be expensive. By using a small and easy-to-compute set of neighborhood communities as seeds for these PageRank communities, however, we find communities that precisely capture the behavior of the network community profile when seeded everywhere in the graph, and at a significant reduction in total work.