PageRank Centrality in Directed Graphs with Bounded In-Degree
We study the computational complexity of locally estimating a node's PageRank centrality in a directed graph $G$. For any node $t$, its PageRank centrality $π(t)$ is defined as the probability that a random walk in $G$, starting from a uniformly chosen node, terminates at $t$, where each step terminates with a constant probability $α\in(0,1)$. To obtain a multiplicative $\big(1\pm O(1)\big)$-approximation of $π(t)$ with probability $Ω(1)$, the previously best upper bound is $O(n^{1/2}\min\{ Δ_{in}^{1/2},Δ_{out}^{1/2},m^{1/4}\})$ from [Wang, Wei, Wen, Yang, STOC '24], where $n$ and $m$ denote the number of nodes and edges in $G$, and $Δ_{in}$ and $Δ_{out}$ upper bound the in-degrees and out-degrees of $G$, respectively. Using a refinement of the proof in the same paper, we establish a lower bound of $Ω(n^{1/2}\min\{Δ_{in}^{1/2}/n^γ,Δ_{out}^{1/2}/n^γ,m^{1/4}\})$, where $γ=\frac{1}{2}(2\max\{\log_{1/(1-α)}Δ_{in},1\}-1)^{-1}$. As $γ$ only depends on $Δ_{in}$ and $n^γ=O(1)$ for $Δ_{in}=Ω\left(n^{Ω(1)}\right)$, the known upper bound is tight if we only parameterize the complexity by $n$, $m$, and $Δ_{out}$. However, there remains a gap of $Ω(n^γ)$ when considering $Δ_{in}$, and this gap is large when $Δ_{in}$ is small. In the extreme case where $Δ_{in}\le1/(1-α)$, we have $γ=1/2$, leading to a gap of $Ω(n^{1/2})$ between the bounds $O(n^{1/2})$ and $Ω(1)$. In this paper, we present a new algorithm that achieves the above lower bound (up to logarithmic factors). The algorithm assumes that $n$ and the bounds $Δ_{in}$ and $Δ_{out}$ are known in advance. Our key technique is a novel randomized backwards propagation process that only propagates selectively based on Monte Carlo estimated PageRank scores.