Source author record

Pan Peng

Pan Peng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Data Structures and Algorithms, hep-th, Machine Learning, math.GT, Social and Information Networks, Computational Complexity, Discrete Mathematics, Logic in Computer Science, math-ph, math.CO, math.MP, math.PR, math.QA, math.RT, Neural and Evolutionary Computing

Catalog footprint

What is connected

15 works

15 topics

4 close collaborators

preprint, 2023, arXiv

A Sublinear-Time Spectral Clustering Oracle with Improved Preprocessing Time

We address the problem of designing a sublinear-time spectral clustering oracle for graphs that exhibit strong clusterability. Such graphs contain $k$ latent clusters, each characterized by a large inner conductance (at least $φ$) and a small outer conductance (at most $\varepsilon$). Our aim is to preprocess the graph to enable clustering membership queries, with the key requirement that both preprocessing and query answering should be performed in sublinear time, and the resulting partition should be consistent with a $k$-partition that is close to the ground-truth clustering. Previous oracles have relied on either a $\textrm{poly}(k)\log n$ gap between inner and outer conductances or exponential (in $k/\varepsilon$) preprocessing time. Our algorithm relaxes these assumptions, albeit at the cost of a slightly higher misclassification ratio. We also show that our clustering oracle is robust against a few random edge deletions. To validate our theoretical bounds, we conducted experiments on synthetic networks.

preprint, 2022, arXiv

Approximately Counting Subgraphs in Data Streams

Estimating the number of subgraphs in data streams is a fundamental problem that has received great attention in the past decade. In this paper, we give improved streaming algorithms for approximately counting the number of occurrences of an arbitrary subgraph $H$, denoted $\# H$, when the input graph $G$ is represented as a stream of $m$ edges. To obtain our algorithms, we provide a generic transformation that converts constant-round sublinear-time graph algorithms in the query access model to constant-pass sublinear-space graph streaming algorithms. Using this transformation, we obtain the following results. 1. We give a $3$-pass turnstile streaming algorithm for $(1\pm ε)$-approximating $\# H$ in $\tilde{O}(\frac{m^{ρ(H)}}{ε^2\cdot \# H})$ space, where $ρ(H)$ is the fractional edge-cover of $H$. This improves upon and generalizes a result of McGregor et al. [PODS 2016], who gave a $3$-pass insertion-only streaming algorithm for $(1\pm ε)$-approximating the number $\# T$ of triangles in $\tilde{O}(\frac{m^{3/2}}{ε^2\cdot \# T})$ space if the algorithm is given additional oracle access to the degrees. 2. We provide a constant-pass streaming algorithm for $(1\pm ε)$-approximating $\# K_r$ in $\tilde{O}(\frac{mλ^{r-2}}{ε^2\cdot \# K_r})$ space for any $r\geq 3$, in a graph $G$ with degeneracy $λ$, where $K_r$ is a clique on $r$ vertices. This resolves a conjecture by Bera and Seshadhri [PODS 2020]. More generally, our reduction relates the adaptivity of a query algorithm to the pass complexity of a corresponding streaming algorithm, and it is applicable to all algorithms in standard sublinear-time graph query models, e.g., the (augmented) general model.
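As a quick consistency check on the first bound, assume the standard fact that the triangle $K_3$ has fractional edge-cover number $3/2$ (weight $1/2$ on each of its three edges covers every vertex with total weight $1$). Substituting $H = K_3$ then recovers the triangle-counting space bound quoted above:

```latex
\[
  ρ(K_3) = \tfrac12 + \tfrac12 + \tfrac12 = \tfrac32
  \quad\Longrightarrow\quad
  \tilde{O}\!\left(\frac{m^{ρ(K_3)}}{ε^2 \cdot \# K_3}\right)
  = \tilde{O}\!\left(\frac{m^{3/2}}{ε^2 \cdot \# T}\right).
\]
```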

preprint, 2022, arXiv

Sublinear-Time Clustering Oracle for Signed Graphs

Social networks are often modeled using signed graphs, where vertices correspond to users and edges have a sign that indicates whether an interaction between users was positive or negative. The arising signed graphs typically contain a clear community structure in the sense that the graph can be partitioned into a small number of polarized communities, each defining a sparse cut and indivisible into smaller polarized sub-communities. We provide a local clustering oracle for signed graphs with such a clear community structure, that can answer membership queries, i.e., "Given a vertex $v$, which community does $v$ belong to?", in sublinear time by reading only a small portion of the graph. Formally, when the graph has bounded maximum degree and the number of communities is at most $O(\log n)$, then with $\tilde{O}(\sqrt{n}\operatorname{poly}(1/\varepsilon))$ preprocessing time, our oracle can answer each membership query in $\tilde{O}(\sqrt{n}\operatorname{poly}(1/\varepsilon))$ time, and it correctly classifies a $(1-\varepsilon)$-fraction of vertices w.r.t. a set of hidden planted ground-truth communities. Our oracle is desirable in applications where the clustering information is needed for only a small number of vertices. Previously, such local clustering oracles were only known for unsigned graphs; our generalization to signed graphs requires a number of new ideas and gives a novel spectral analysis of the behavior of random walks with signs. We evaluate our algorithm for constructing such an oracle and answering membership queries on both synthetic and real-world datasets, validating its performance in practice.

preprint, 2021, arXiv

On Testability of First-Order Properties in Bounded-Degree Graphs

We study property testing of properties that are definable in first-order logic (FO) in the bounded-degree graph and relational structure models. We show that any FO property that is defined by a formula with quantifier prefix $\exists^*\forall^*$ is testable (i.e., testable with constant query complexity), while there exists an FO property that is expressible by a formula with quantifier prefix $\forall^*\exists^*$ that is not testable. In the dense graph model, a similar picture is long known (Alon, Fischer, Krivelevich, Szegedy, Combinatorica 2000), despite the very different nature of the two models. In particular, we obtain our lower bound by a first-order formula that defines a class of bounded-degree expanders, based on zig-zag products of graphs. We expect this to be of independent interest. We then prove testability of some first-order properties that speak about isomorphism types of neighbourhoods, including testability of $1$-neighbourhood-freeness, and $r$-neighbourhood-freeness under a mild assumption on the degrees.

preprint, 2020, arXiv

Augmenting the Algebraic Connectivity of Graphs

For any undirected graph $G=(V,E)$ and a set $E_W$ of candidate edges with $E\cap E_W=\emptyset$, the $(k,γ)$-spectral augmentability problem is to find a set $F$ of $k$ edges from $E_W$ with appropriate weighting, such that the algebraic connectivity of the resulting graph $H=(V,E\cup F)$ is at least $γ$. Because of a tight connection between the algebraic connectivity and many other graph parameters, including the graph's conductance and the mixing time of random walks in a graph, maximising the resulting graph's algebraic connectivity by adding a small number of edges has been studied over the past 15 years. In this work we present an approximate and efficient algorithm for the $(k,γ)$-spectral augmentability problem, and our algorithm runs in almost-linear time under a wide regime of parameters. Our main algorithm is based on the following two novel techniques developed in the paper, which might have applications beyond the $(k,γ)$-spectral augmentability problem. (1) We present a fast algorithm for solving a feasibility version of an SDP for the algebraic connectivity maximisation problem from [GB06]. Our algorithm is based on the classic primal-dual framework for solving SDP, which in turn uses the multiplicative weight update algorithm. We present a novel approach of unifying SDP constraints of different matrix and vector variables and give a good separation oracle accordingly. (2) We present an efficient algorithm for the subgraph sparsification problem, and for a wide range of parameters our algorithm runs in almost-linear time, in contrast to the previously best known algorithm running in at least $Ω(n^2mk)$ time [KMST10]. Our analysis shows how the randomised BSS framework can be generalised in the setting of subgraph sparsification, and how the potential functions can be applied to approximately keep track of different subspaces.

preprint, 2020, arXiv

Average Sensitivity of Spectral Clustering

Spectral clustering is one of the most popular clustering methods for finding clusters in a graph, which has found many applications in data mining. However, the input graph in those applications may have many missing edges due to error in measurement, withholding for a privacy reason, or arbitrariness in data conversion. To make reliable and efficient decisions based on spectral clustering, we assess the stability of spectral clustering against edge perturbations in the input graph using the notion of average sensitivity, which is the expected size of the symmetric difference of the output clusters before and after we randomly remove edges. We first prove that the average sensitivity of spectral clustering is proportional to $λ_2/λ_3^2$, where $λ_i$ is the $i$-th smallest eigenvalue of the (normalized) Laplacian. We also prove an analogous bound for $k$-way spectral clustering, which partitions the graph into $k$ clusters. Then, we empirically confirm our theoretical bounds by conducting experiments on synthetic and real networks. Our results suggest that spectral clustering is stable against edge perturbations when there is a cluster structure in the input graph.

preprint, 2020, arXiv

More Effective Randomized Search Heuristics for Graph Coloring Through Dynamic Optimization

Dynamic optimization problems have gained significant attention in evolutionary computation as evolutionary algorithms (EAs) can easily adapt to changing environments. We show that EAs can solve the graph coloring problem for bipartite graphs more efficiently by using dynamic optimization. In our approach the graph instance is given incrementally such that the EA can reoptimize its coloring when a new edge introduces a conflict. We show that, when edges are inserted in a way that preserves graph connectivity, Randomized Local Search (RLS) efficiently finds a proper 2-coloring for all bipartite graphs. This includes graphs for which RLS and other EAs need exponential expected time in a static optimization scenario. We investigate different ways of building up the graph by popular graph traversals such as breadth-first-search and depth-first-search and analyse the resulting runtime behavior. We further show that offspring populations (e.g., a (1+$λ$) RLS) lead to an exponential speedup in $λ$. Finally, an island model using 3 islands succeeds in an optimal time of $Θ(m)$ on every $m$-edge bipartite graph, outperforming offspring populations. This is the first example where an island model guarantees a speedup that is not bounded in the number of islands.
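The incremental setup described above can be illustrated with a minimal sketch. It is a simplified toy, not the paper's exact algorithm or analysis: the function names, the fitness (number of monochromatic edges), the acceptance rule (keep a flip unless it increases conflicts), and the step cap are all assumptions made for illustration.

```python
import random

def conflicts(edges, color):
    """Number of monochromatic edges under the current 2-coloring."""
    return sum(1 for u, v in edges if color[u] == color[v])

def rls_incremental_2coloring(n, edge_order, seed=0, max_steps=100_000):
    """Insert edges one at a time; after each insertion, run RLS until
    the coloring is proper again (or the hypothetical step cap is hit)."""
    rng = random.Random(seed)
    color = [0] * n          # all vertices start with color 0
    edges = []
    for edge in edge_order:
        edges.append(edge)
        steps = 0
        while conflicts(edges, color) > 0 and steps < max_steps:
            v = rng.randrange(n)           # pick one vertex uniformly at random
            before = conflicts(edges, color)
            color[v] ^= 1                  # flip its color
            if conflicts(edges, color) > before:
                color[v] ^= 1              # reject flips that make things worse
            steps += 1
    return color

# A path on 8 vertices, built edge by edge so the graph stays connected.
path = [(i, i + 1) for i in range(7)]
coloring = rls_incremental_2coloring(8, path)
print(coloring, conflicts(path, coloring))
```

On a path, each newly inserted edge attaches a leaf, so a single flip of that leaf repairs any new conflict; this is the intuition for why connectivity-preserving insertion orders can help.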

preprint, 2016, arXiv

Dynamic Graph Stream Algorithms in $o(n)$ Space

In this paper we study graph problems in the dynamic streaming model, where the input is defined by a sequence of edge insertions and deletions. As many natural problems require $Ω(n)$ space, where $n$ is the number of vertices, existing works mainly focused on designing $\tilde{O}(n)$ space algorithms. Although sublinear in the number of edges for dense graphs, it could still be too large for many applications (e.g. $n$ is huge or the graph is sparse). In this work, we give single-pass algorithms beating this space barrier for two classes of problems. We present $o(n)$ space algorithms for estimating the number of connected components with additive error $\varepsilon n$ and $(1+\varepsilon)$-approximating the weight of the minimum spanning tree, for any small constant $\varepsilon>0$. The latter improves on the previous $\tilde{O}(n)$ space algorithm given by Ahn et al. (SODA 2012) for connected graphs with bounded edge weights. We initiate the study of approximate graph property testing in the dynamic streaming model, where we want to distinguish graphs satisfying the property from graphs that are $\varepsilon$-far from having the property. We consider the problem of testing $k$-edge connectivity, $k$-vertex connectivity, cycle-freeness and bipartiteness (of planar graphs), for which we provide algorithms using roughly $\tilde{O}(n^{1-\varepsilon})$ space, which is $o(n)$ for any constant $\varepsilon$. To complement our algorithms, we present $Ω(n^{1-O(\varepsilon)})$ space lower bounds for these problems, which show that such a dependence on $\varepsilon$ is necessary.

preprint, 2015, arXiv

Congruent skein relations for colored HOMFLY-PT invariants and colored Jones polynomials

The colored HOMFLY-PT invariant, a generalization of the colored Jones polynomial, is one of the most important quantum invariants of links. This paper is devoted to investigating the basic structures of the colored HOMFLY-PT invariants of links. By using the HOMFLY-PT skein theory, firstly, we show that the (reformulated) colored HOMFLY-PT invariants actually lie in the ring $\mathbb{Z}[(q-q^{-1})^2,t^{\pm 1}]$. Secondly, we establish some symmetric formulas for colored HOMFLY-PT invariants of links, which include the rank-level duality as an easy consequence. Thirdly, motivated by the Labastida-Mariño-Ooguri-Vafa conjecture for framed links, we propose congruent skein relations for (reformulated) colored HOMFLY-PT invariants which are generalizations of the skein relation for classical HOMFLY-PT polynomials. Then we study the congruent skein relation for colored Jones polynomials. In fact, we obtain a succinct formula for the case of knots. As an application, we prove a vanishing result for Reshetikhin-Turaev invariants of a family of 3-manifolds. Finally, we study the congruent skein relations for $SU(n)$ quantum invariants.

preprint, 2015, arXiv

Testing Cluster Structure of Graphs

We study the problem of recognizing the cluster structure of a graph in the framework of property testing in the bounded degree model. Given a parameter $\varepsilon$, a $d$-bounded degree graph is defined to be $(k, ϕ)$-clusterable, if it can be partitioned into no more than $k$ parts, such that the (inner) conductance of the induced subgraph on each part is at least $ϕ$ and the (outer) conductance of each part is at most $c_{d,k}\varepsilon^4ϕ^2$, where $c_{d,k}$ depends only on $d,k$. Our main result is a sublinear algorithm with the running time $\widetilde{O}(\sqrt{n}\cdot\mathrm{poly}(ϕ,k,1/\varepsilon))$ that takes as input a graph with maximum degree bounded by $d$, parameters $k$, $ϕ$, $\varepsilon$, and with probability at least $\frac23$, accepts the graph if it is $(k,ϕ)$-clusterable and rejects the graph if it is $\varepsilon$-far from $(k, ϕ^*)$-clusterable for $ϕ^* = c'_{d,k}\frac{ϕ^2 \varepsilon^4}{\log n}$, where $c'_{d,k}$ depends only on $d,k$. By the lower bound of $Ω(\sqrt{n})$ on the number of queries needed for testing graph expansion, which corresponds to $k=1$ in our problem, our algorithm is asymptotically optimal up to polylogarithmic factors.

preprint, 2015, arXiv

Testing Small Set Expansion in General Graphs

We consider the problem of testing small set expansion for general graphs. A graph $G$ is a $(k,ϕ)$-expander if every subset of volume at most $k$ has conductance at least $ϕ$. Small set expansion has recently received significant attention due to its close connection to the unique games conjecture, the local graph partitioning algorithms and locally testable codes. We give testers with two-sided error and one-sided error in the adjacency list model that allows degree and neighbor queries to the oracle of the input graph. The testers take as input an $n$-vertex graph $G$, a volume bound $k$, an expansion bound $ϕ$ and a distance parameter $\varepsilon>0$. For the two-sided error tester, with probability at least $2/3$, it accepts the graph if it is a $(k,ϕ)$-expander and rejects the graph if it is $\varepsilon$-far from any $(k^*,ϕ^*)$-expander, where $k^*=Θ(k\varepsilon)$ and $ϕ^*=Θ(\frac{ϕ^4}{\min\{\log(4m/k),\log n\}\cdot(\ln k)})$. The query complexity and running time of the tester are $\widetilde{O}(\sqrt{m}ϕ^{-4}\varepsilon^{-2})$, where $m$ is the number of edges of the graph. For the one-sided error tester, it accepts every $(k,ϕ)$-expander, and with probability at least $2/3$, rejects every graph that is $\varepsilon$-far from $(k^*,ϕ^*)$-expander, where $k^*=O(k^{1-ξ})$ and $ϕ^*=O(ξϕ^2)$ for any $0<ξ<1$. The query complexity and running time of this tester are $\widetilde{O}(\sqrt{\frac{n}{\varepsilon^3}}+\frac{k}{\varepsilon ϕ^4})$. We also give a two-sided error tester with smaller gap between $ϕ^*$ and $ϕ$ in the rotation map model that allows (neighbor, index) queries and degree queries.
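To make the $(k,ϕ)$-expander definition concrete, here is a brute-force sketch on a toy graph. It assumes the common convention that the conductance of a set $S$ is the number of edges leaving $S$ divided by the volume (sum of degrees) of $S$; the function names and the example graph are hypothetical. It enumerates all subsets, so it is exponential in $n$ and is unrelated to the sublinear testers described above.

```python
from itertools import combinations

def conductance(adj, subset):
    """Edges leaving S divided by vol(S), where vol(S) is the sum of degrees in S."""
    s = set(subset)
    vol = sum(len(adj[v]) for v in s)
    cut = sum(1 for v in s for u in adj[v] if u not in s)
    return cut / vol

def is_small_set_expander(adj, k, phi):
    """Brute force: every S with 0 < vol(S) <= k must have conductance >= phi."""
    vertices = list(adj)
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            vol = sum(len(adj[v]) for v in subset)
            if 0 < vol <= k and conductance(adj, subset) < phi:
                return False
    return True

# Toy graph: triangles {0,1,2} and {3,4,5} joined by the single edge (2,3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}

print(conductance(adj, [0, 1, 2]))          # one cut edge over volume 7
print(is_small_set_expander(adj, 3, 0.5))   # only single vertices qualify
print(is_small_set_expander(adj, 7, 0.2))   # the triangle {0,1,2} violates this
```

With volume bound 3 only single vertices are small enough, and each has conductance 1; raising the bound to 7 admits a whole triangle, whose single outgoing edge gives conductance 1/7.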

preprint, 2013, arXiv

Detecting and Characterizing Small Dense Bipartite-like Subgraphs by the Bipartiteness Ratio Measure

We study the problem of finding and characterizing subgraphs with small bipartiteness ratio. We give a bicriteria approximation algorithm SwpDB such that if there exists a subset $S$ of volume at most $k$ and bipartiteness ratio $θ$, then for any $0<ε<1/2$, it finds a set $S'$ of volume at most $2k^{1+ε}$ and bipartiteness ratio at most $4\sqrt{θ/ε}$. By combining a truncation operation, we give a local algorithm LocDB, which has asymptotically the same approximation guarantee as the algorithm SwpDB on both the volume and bipartiteness ratio of the output set, and runs in time $O(ε^2θ^{-2}k^{1+ε}\ln^3k)$, independent of the size of the graph. Finally, we give a spectral characterization of the small dense bipartite-like subgraphs by using the $k$th largest eigenvalue of the Laplacian of the graph.
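The abstract does not restate the measure itself. As a rough illustration, the sketch below assumes the definition commonly attributed to Trevisan: for disjoint vertex sets $L$ and $R$ with $S = L \cup R$, the ratio is $(2e(L) + 2e(R) + e(S, V \setminus S)) / \mathrm{vol}(S)$, where $e(\cdot)$ counts edges and $\mathrm{vol}$ is the sum of degrees. The function name and toy graph are hypothetical, and this is not the SwpDB or LocDB algorithm.

```python
def bipartiteness_ratio(adj, left, right):
    """(2*e(L) + 2*e(R) + e(S, V minus S)) / vol(S) for S = L union R.

    Assumes `left` and `right` are disjoint vertex sets of an undirected
    graph given as an adjacency-list dict.
    """
    left, right = set(left), set(right)
    s = left | right
    vol = sum(len(adj[v]) for v in s)
    # Each edge inside L (or inside R) is seen from both endpoints, so
    # counting endpoint incidences already yields 2*e(L) + 2*e(R).
    inside = sum(1 for v in left for u in adj[v] if u in left)
    inside += sum(1 for v in right for u in adj[v] if u in right)
    # Edges leaving S are seen once, from their endpoint inside S.
    leaving = sum(1 for v in s for u in adj[v] if u not in s)
    return (inside + leaving) / vol

# Toy graph: the 4-cycle 0-1-2-3-0 with a pendant vertex 4 attached to 3.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2, 4], 4: [3]}

good = bipartiteness_ratio(adj, {0, 2}, {1, 3})   # respects the cycle's bipartition
bad = bipartiteness_ratio(adj, {0, 1}, {2, 3})    # puts adjacent vertices together
print(good, bad)
```

A split that follows the cycle's true bipartition is penalized only for the one edge leaving the set, while a split that groups adjacent vertices pays twice for each edge it traps inside a side.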

preprint, 2011, arXiv

The Small-Community Phenomenon in Networks

We investigate several geometric models of networks which simultaneously have some nice global properties: the small diameter property; the small-community phenomenon, which was defined by the authors (2011) to capture the common experience that (almost) everyone in our society belongs to some meaningful small communities; and, under certain conditions on the parameters, the power law degree distribution, which significantly strengthens the results given by van den Esker (2008) and Jordan (2010). The results above, together with our previous progress in Li and Peng (2011), build a mathematical foundation for the study of communities and the small-community phenomenon in various networks. In the proof of the power law degree distribution, we develop the method of alternating concentration analysis to build concentration inequalities by alternately and iteratively applying both the sub- and super-martingale inequalities, which seems powerful and may have more potential applications.

preprint, 2010, arXiv

New Structure of Knot Invariants

Based on the proof of the Labastida-Mariño-Ooguri-Vafa conjecture, we derive an infinite product formula for Chern-Simons partition functions, the generating function of quantum $\mathfrak{sl}_N$ invariants. Some symmetry properties of the infinite product will also be discussed.

preprint, 2010, arXiv

On a proof of the Labastida-Marino-Ooguri-Vafa conjecture

We outline a proof of a remarkable conjecture of Labastida-Mariño-Ooguri-Vafa about certain new algebraic structures of quantum link invariants and the integrality of an infinite family of new topological invariants. Our method is based on the cut-and-join analysis and a special rational ring characterizing the structure of the Chern-Simons partition function.
