Researcher profile

Hanzhi Wang

Hanzhi Wang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2026arXiv

Approximate Graph Propagation Revisited: Dynamic Parameterized Queries, Tighter Bounds and Dynamic Updates

We revisit Approximate Graph Propagation (AGP), a unified framework which captures various graph propagation tasks, such as PageRank, feature propagation in Graph Neural Networks (GNNs), and graph-based Retrieval-Augmented Generation (RAG). Our work focuses on the settings of dynamic graphs and dynamic parameterized queries, where the underlying graphs evolve over time (updated by edge insertions or deletions) and the input query parameters are specified on the fly to fit application needs. Our first contribution is an interesting observation that the SOTA solution, AGP-Static, can be adapted to support dynamic parameterized queries; however several challenges remain unresolved. Firstly, the query time complexity of AGP-Static is based on an assumption of using an optimal algorithm for subset sampling in its query algorithm. Unfortunately, back to that time, such an algorithm did not exist; without such an optimal algorithm, an extra $O(\log^2 n)$ factor is required in the query complexity, where $n$ is the number of vertices in the graphs. Secondly, AGP-Static performs poorly on dynamic graphs, taking $O(n\log n)$ time to process each update. To address these challenges, we propose a new algorithm, AGP-Static++, which is simpler yet reduces roughly a factor of $O(\log^2 n)$ in the query complexity while preserving the approximation guarantees of AGP-Static. However, AGP-Static++ still requires $O(n)$ time to process each update. To better support dynamic graphs, we further propose AGP-Dynamic, which achieves $O(1)$ amortized time per update, significantly improving the aforementioned $O(n)$ per-update bound, while still preserving the query complexity and approximation guarantees. Last, our comprehensive experiments validate the theoretical improvements: compared to the baselines, our algorithm achieves speedups of up to $177\times$ on update time and $10\times$ on query efficiency.

preprint2026arXiv

PageRank Centrality in Directed Graphs with Bounded In-Degree

We study the computational complexity of locally estimating a node's PageRank centrality in a directed graph $G$. For any node $t$, its PageRank centrality $π(t)$ is defined as the probability that a random walk in $G$, starting from a uniformly chosen node, terminates at $t$, where each step terminates with a constant probability $α\in(0,1)$. To obtain a multiplicative $\big(1\pm O(1)\big)$-approximation of $π(t)$ with probability $Ω(1)$, the previously best upper bound is $O(n^{1/2}\min\{ Δ_{in}^{1/2},Δ_{out}^{1/2},m^{1/4}\})$ from [Wang, Wei, Wen, Yang, STOC '24], where $n$ and $m$ denote the number of nodes and edges in $G$, and $Δ_{in}$ and $Δ_{out}$ upper bound the in-degrees and out-degrees of $G$, respectively. Using a refinement of the proof in the same paper, we establish a lower bound of $Ω(n^{1/2}\min\{Δ_{in}^{1/2}/n^γ,Δ_{out}^{1/2}/n^γ,m^{1/4}\})$, where $γ=\frac{1}{2}(2\max\{\log_{1/(1-α)}Δ_{in},1\}-1)^{-1}$. As $γ$ only depends on $Δ_{in}$ and $n^γ=O(1)$ for $Δ_{in}=Ω\left(n^{Ω(1)}\right)$, the known upper bound is tight if we only parameterize the complexity by $n$, $m$, and $Δ_{out}$. However, there remains a gap of $Ω(n^γ)$ when considering $Δ_{in}$, and this gap is large when $Δ_{in}$ is small. In the extreme case where $Δ_{in}\le1/(1-α)$, we have $γ=1/2$, leading to a gap of $Ω(n^{1/2})$ between the bounds $O(n^{1/2})$ and $Ω(1)$. In this paper, we present a new algorithm that achieves the above lower bound (up to logarithmic factors). The algorithm assumes that $n$ and the bounds $Δ_{in}$ and $Δ_{out}$ are known in advance. Our key technique is a novel randomized backwards propagation process that only propagates selectively based on Monte Carlo estimated PageRank scores.

preprint2022arXiv

Edge-based Local Push for Personalized PageRank

Personalized PageRank (PPR) is a popular node proximity metric in graph mining and network research. Given a graph G=(V,E) and a source node $s \in V$, a single-source PPR (SSPPR) query asks for the PPR value $\vpi(u)$ with respect to s, which represents the relative importance of node u in the context of the source node s. Among existing algorithms for SSPPR queries, LocalPush is a fundamental method which serves as a cornerstone for subsequent algorithms. In LocalPush, a push operation is a crucial primitive operation, which distributes the probability at a node u to ALL u's neighbors via the corresponding edges. Although this push operation works well on unweighted graphs, unfortunately, it can be rather inefficient on weighted graphs. In particular, on unbalanced weighted graphs where only a few of these edges take the majority of the total weight among them, the push operation would have to distribute insignificant probabilities along those edges which just take the minor weights, resulting in expensive overhead. To resolve this issue, we propose the EdgePush algorithm, a novel method for computing SSPPR queries on weighted graphs. EdgePush decomposes the aforementioned push operations in edge-based push, allowing the algorithm to operate at the edge level granularity. Hence, it can flexibly distribute the probabilities according to edge weights. Furthermore, our EdgePush allows a fine-grained termination threshold for each individual edge, leading to a superior complexity over LocalPush. Notably, we prove that EdgePush improves the theoretical query cost of LocalPush by an order of up to O(n) when the graph's weights are unbalanced, both in terms of $\ell_1$-error and normalized additive error. Our experimental results demonstrate that EdgePush significantly outperforms state-of-the-art baselines in terms of query efficiency on large motif-based and real-world weighted graphs.

preprint2022arXiv

Instant Graph Neural Networks for Dynamic Graphs

Graph Neural Networks (GNNs) have been widely used for modeling graph-structured data. With the development of numerous GNN variants, recent years have witnessed groundbreaking results in improving the scalability of GNNs to work on static graphs with millions of nodes. However, how to instantly represent continuous changes of large-scale dynamic graphs with GNNs is still an open problem. Existing dynamic GNNs focus on modeling the periodic evolution of graphs, often on a snapshot basis. Such methods suffer from two drawbacks: first, there is a substantial delay for the changes in the graph to be reflected in the graph representations, resulting in losses on the model's accuracy; second, repeatedly calculating the representation matrix on the entire graph in each snapshot is predominantly time-consuming and severely limits the scalability. In this paper, we propose Instant Graph Neural Network (InstantGNN), an incremental computation approach for the graph representation matrix of dynamic graphs. Set to work with dynamic graphs with the edge-arrival model, our method avoids time-consuming, repetitive computations and allows instant updates on the representation and instant predictions. Graphs with dynamic structures and dynamic attributes are both supported. The upper bounds of time complexity of those updates are also provided. Furthermore, our method provides an adaptive training strategy, which guides the model to retrain at moments when it can make the greatest performance gains. We conduct extensive experiments on several real-world and synthetic datasets. Empirical results demonstrate that our model achieves state-of-the-art accuracy while having orders-of-magnitude higher efficiency than existing methods.

preprint2022arXiv

Multi-objective optimization of actuation waveform for high-precision drop-on-demand inkjet printing

Drop-on-demand (DOD) inkjet printing has been considered as one of promising technologies for the fabrication of advanced functional materials. For a DOD printer, high-precision dispensing techniques for achieving satellite-free smaller droplets, have long been desired for patterning thin-film structures. The present study considers the inlet velocity of a liquid chamber located upstream of a dispensing nozzle as a control variable and aims to optimize its waveform using a sample-efficient Bayesian optimization algorithm. Firstly, the droplet dispensing dynamics are numerically reproduced by using an open-source OpenFOAM solver, interFoam, and the results are passed on to another code based on pyFoam. Then, the parameters characterizing the actuation waveform driving a DOD printer are determined by the Bayesian optimization (BO) algorithm so as to maximize a prescribed multi-objective function expressed as the sum of two factors, i.e., the size of a primary droplet and the presence of satellite droplets. The results show that the present BO algorithm can successfully find high-precision dispensing waveforms within 150 simulations. Specifically, satellite droplets can be effectively eliminated and the droplet diameter can be significantly reduced to 24.9% of the nozzle diameter by applying the optimal waveform.

preprint2020arXiv

Exact Single-Source SimRank Computation on Large Graphs

SimRank is a popular measurement for evaluating the node-to-node similarities based on the graph topology. In recent years, single-source and top-$k$ SimRank queries have received increasing attention due to their applications in web mining, social network analysis, and spam detection. However, a fundamental obstacle in studying SimRank has been the lack of ground truths. The only exact algorithm, Power Method, is computationally infeasible on graphs with more than $10^6$ nodes. Consequently, no existing work has evaluated the actual trade-offs between query time and accuracy on large real-world graphs. In this paper, we present ExactSim, the first algorithm that computes the exact single-source and top-$k$ SimRank results on large graphs. With high probability, this algorithm produces ground truths with a rigorous theoretical guarantee. We conduct extensive experiments on real-world datasets to demonstrate the efficiency of ExactSim. The results show that ExactSim provides the ground truth for any single-source SimRank query with a precision up to 7 decimal places within a reasonable query time.

preprint2020arXiv

Personalized PageRank to a Target Node, Revisited

Personalized PageRank (PPR) is a widely used node proximity measure in graph mining and network analysis. Given a source node $s$ and a target node $t$, the PPR value $π(s,t)$ represents the probability that a random walk from $s$ terminates at $t$, and thus indicates the bidirectional importance between $s$ and $t$. The majority of the existing work focuses on the single-source queries, which asks for the PPR value of a given source node $s$ and every node $t \in V$. However, the single-source query only reflects the importance of each node $t$ with respect to $s$. In this paper, we consider the {\em single-target PPR query}, which measures the opposite direction of importance for PPR. Given a target node $t$, the single-target PPR query asks for the PPR value of every node $s\in V$ to a given target node $t$. We propose RBS, a novel algorithm that answers approximate single-target queries with optimal computational complexity. We show that RBS improves three concrete applications: heavy hitters PPR query, single-source SimRank computation, and scalable graph neural networks. We conduct experiments to demonstrate that RBS outperforms the state-of-the-art algorithms in terms of both efficiency and precision on real-world benchmark datasets.