Researcher profile

Svante Janson

Svante Janson contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
22works
0followers
12topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

22 published item(s)

preprint2023arXiv

On Knuth's conjecture for back and forward arcs in Depth First Search in a random digraph with geometric outdegree distribution

Donald Knuth, in a draft of a coming volume of The Art of Computer Programming, has recently conjectured that in Depth-First Search of a random digraph with geometric outdegree distribution, the numbers of back and forward arcs have the same distribution. We show that this conjecture is equivalent to an equality between two generating functions defined by different recursions. Unfortunately, we have not been able so use this to prove the conjecture, which still is open, but we hope that this note will inspire others to succeed with the conjecture.

preprint2022arXiv

Asymptotic normality for $m$-dependent and constrained $U$-statistics, with applications to pattern matching in random strings and permutations

We study (asymmetric) $U$-statistics based on a stationary sequence of $m$-dependent variables; moreover, we consider constrained $U$-statistics, where the defining multiple sum only includes terms satisfying some restrictions on the gaps between indices. Results include a law of large numbers and a central limit theorem. Special attention is paid to degenerate cases where, after the standard normalization, the asymptotic variance vanishes; in these cases non-normal limits occur after a different normalization. The results are motivated by applications to pattern matching in random strings and permutations. We obtain both new results and new proofs of old results.

preprint2022arXiv

Depth-First Search performance in a random digraph with geometric outdegree distribution

We present an analysis of the depth-first search algorithm in a random digraph model with independent outdegrees having a geometric distribution. The results include asymptotic results for the depth profile of vertices, the height (maximum depth) and average depth, the number of trees in the forest, the size of the largest and second-largest trees, and the numbers of arcs of different types in the depth-first jungle. Most results are first order. For the height we show an asymptotic normal distribution. This analysis proposed by Donald Knuth in his next to appear volume of The Art of Computer Programming gives interesting insight in one of the most elegant and efficient algorithm for graph analysis due to Tarjan.

preprint2022arXiv

Edge coherence in multiplex networks

This paper introduces a nonparametric framework for the setting where multiple networks are observed on the same set of nodes, also known as multiplex networks. Our objective is to provide a simple parameterization which explicitly captures linear dependence between the different layers of networks. For non-Euclidean observations, such as shapes and graphs, the notion of "linear" must be defined appropriately. Taking inspiration from the representation of stochastic processes and the analogy of the multivariate spectral representation of a stochastic process with joint exchangeability of Bernoulli arrays, we introduce the notion of edge coherence as a measure of linear dependence in the graph limit space. Edge coherence is defined for pairs of edges from any two network layers and is the key novel parameter. We illustrate the utility of our approach by eliciting simple models such as a correlated stochastic blockmodel and a correlated inhomogeneous graph limit model.

preprint2022arXiv

Fluctuations of Subgraph Counts in Graphon Based Random Graphs

Given a graphon $W$ and a finite simple graph $H$, with vertex set $V(H)$, denote by $X_n(H, W)$ the number of copies of $H$ in a $W$-random graph on $n$ vertices. The asymptotic distribution of $X_n(H, W)$ was recently obtained by Hladký, Pelekis, and Šileikis (2021) in the case where $H$ is a clique. In this paper, we extend this result to any fixed graph $H$. Towards this we introduce a notion of $H$-regularity of graphons and show that if the graphon $W$ is not $H$-regular, then $X_n(H, W)$ has Gaussian fluctuations with scaling $n^{|V(H)|-\frac{1}{2}}$. On the other hand, if $W$ is $H$-regular, then the fluctuations are of order $n^{|V(H)|-1}$ and the limiting distribution of $X_n(H, W)$ can have both Gaussian and non-Gaussian components, where the non-Gaussian component is a (possibly) infinite weighted sum of centered chi-squared random variables with the weights determined by the spectral properties of a graphon derived from $W$. Our proofs use the asymptotic theory of generalized $U$-statistics developed by Janson and Nowicki (1991). We also investigate the structure of $H$-regular graphons for which either the Gaussian or the non-Gaussian component of the limiting distribution (but not both) is degenerate. Interestingly, there are also $H$-regular graphons $W$ for which both the Gaussian or the non-Gaussian components are degenerate, that is, $X_n(H, W)$ has a degenerate limit even under the scaling $n^{|V(H)|-1}$. We give an example of this degeneracy with $H=K_{1, 3}$ (the 3-star) and also establish non-degeneracy in a few examples. This naturally leads to interesting open questions on higher-order degeneracies.

preprint2022arXiv

Quantitative bounds in the central limit theorem for $m$-dependent random variables

For each $n\ge 1$, let $X_{n,1},\ldots,X_{n,N_n}$ be real random variables and $S_n=\sum_{i=1}^{N_n}X_{n,i}$. Let $m_n\ge 1$ be an integer. Suppose $(X_{n,1},\ldots,X_{n,N_n})$ is $m_n$-dependent, $E(X_{ni})=0$, $E(X_{ni}^2)<\infty$ and $σ_n^2:=E(S_n^2)>0$ for all $n$ and $i$. Then, \begin{gather*} d_W\Bigl(\frac{S_n}{σ_n},\,Z\Bigr)\le 30\,\bigl\{c^{1/3}+12\,U_n(c/2)^{1/2}\bigr\}\quad\quad\text{for all }n\ge 1\text{ and }c>0, \end{gather*} where $d_W$ is Wasserstein distance, $Z$ a standard normal random variable and $$U_n(c)=\frac{m_n}{σ_n^2}\,\sum_{i=1}^{N_n}E\Bigl[X_{n,i}^2\,1\bigl\{\abs{X_{n,i}}>c\,σ_n/m_n\bigr\}\Bigr].$$ Among other things, this estimate of $d_W\bigl(S_n/σ_n,\,Z\bigr)$ yields a similar estimate of $d_{TV}\bigl(S_n/σ_n,\,Z\bigr)$ where $d_{TV}$ is total variation distance.

preprint2022arXiv

The number of occurrences of patterns in a random tree or forest permutation

The classes of tree permutations and forest permutations were defined by Acan and Hitczenko (2016). We study random permutations of a given length from these classes, and in particular the number of occurrences of a fixed pattern in one of these random permutations. The main results show that the distributions of these numbers are asymptotically normal. The proof uses representations of random tree and forest permutations that enable us to express the number of occurrences of a pattern by a type of $U$-statistics; we then use general limit theorems for the latter.

preprint2021arXiv

Can smooth graphons in several dimensions be represented by smooth graphons on $[0,1]$?

A graphon that is defined on $[0,1]^d$ and is Hölder$(α)$ continuous for some $d\ge2$ and $α\in(0,1]$ can be represented by a graphon on $[0,1]$ that is Hölder$(α/d)$ continuous. We give examples that show that this reduction in smoothness to $α/d$ is the best possible, for any $d$ and $α$; for $α=1$, the example is a dot product graphon and shows that the reduction is the best possible even for graphons that are polynomials. A motivation for studying the smoothness of graphon functions is that this represents a key assumption in non-parametric statistical network analysis. Our examples show that making a smoothness assumption in a particular dimension is not equivalent to making it in any other latent dimension.

preprint2021arXiv

Short cycles in high genus unicellular maps

We study large uniform random maps with one face whose genus grows linearly with the number of edges, which are a model of discrete hyperbolic geometry. In previous works, several hyperbolic geometric features have been investigated. In the present work, we study the number of short cycles in a uniform unicellular map of high genus, and we show that it converges to a Poisson distribution. As a corollary, we obtain the law of the systole of uniform unicellular maps in high genus. We also obtain the asymptotic distribution of the vertex degrees in such a map.

preprint2020arXiv

Central limit theorems for additive functionals and fringe trees in tries

We give general theorems on asymptotic normality for additive functionals of random tries generated by a sequence of independent strings. These theorems are applied to show asymptotic normality of the distribution of random fringe trees in a random trie. Formulas for asymptotic mean and variance are given. In particular, the proportion of fringe trees of size $k$ (defined as number of keys) is asymptotically, ignoring oscillations, $c/(k(k-1))$ for $k\ge2$, where $c=1/(1+H)$ with $H$ the entropy of the digits. Another application gives asymptotic normality of the number of $k$-protected nodes in a random trie. For symmetric tries, it is shown that the asymptotic proportion of $k$-protected nodes (ignoring oscillations) decreases geometrically as $k\to\infty$.

preprint2020arXiv

Continuous time digital search tree and a border aggregation model

We consider the continuous-time version of the random digital search tree, and construct a coupling with a border aggregation model as studied in Thacker and Volkov (2018), showing a relation between the height of the tree and the time required for aggregation. This relation carries over to the corresponding discrete-time models. As a consequence we find a very precise asymptotic result for the time to aggregation, using recent results by Drmota et al.\ (2020) for the digital search tree.

preprint2020arXiv

Hidden Words Statistics for Large Patterns

We study here the so called subsequence pattern matching also known as hidden pattern matching in which one searches for a given pattern $w$ of length $m$ as a subsequence in a random text of length $n$. The quantity of interest is the number of occurrences of $w$ as a subsequence (i.e., occurring in not necessarily consecutive text locations). This problem finds many applications from intrusion detection, to trace reconstruction, to deletion channel, and to DNA-based storage systems. In all of these applications, the pattern $w$ is of variable length. To the best of our knowledge this problem was only tackled for a fixed length $m=O(1)$ [Flajolet, Szpankowski and Vallée, 2006]. In our main result we prove that for $m=o(n^{1/3})$ the number of subsequence occurrences is normally distributed. In addition, we show that under some constraints on the structure of $w$ the asymptotic normality can be extended to $m=o(\sqrt{n})$. For a special pattern $w$ consisting of the same symbol, we indicate that for $m=o(n)$ the distribution of number of subsequences is either asymptotically normal or asymptotically log normal. We conjecture that this dichotomy is true for all patterns. We use Hoeffding&#39;s projection method for $U$-statistics to prove our findings.

preprint2020arXiv

On the independence number of some random trees

We show that for many models of random trees, the independence number divided by the size converges almost surely to a constant as the size grows to infinity; the trees that we consider include random recursive trees, binary and $m$-ary search trees, preferential attachment trees, and others. The limiting constant is computed, analytically or numerically, for several examples. The method is based on Crump-Mode-Jagers branching processes.

preprint2020arXiv

The space $D$ in several variables: random variables and higher moments

We study the Banach space $D([0,1]^m)$ of functions of several variables that are (in a certain sense) right-continuous with left limits, and extend several results previously known for the standard case $m=1$. We give, for example, a description of the dual space, and we show that a bounded multilinear form always is measurable with respect to the $σ$-field generated by the point evaluations. These results are used to study random functions in the space. (I.e., random elements of the space.) In particular, we give results on existence of moments (in different senses) of such random functions, and we give an application to the Zolotarev distance between two such random functions.

preprint2019arXiv

Mean and variance of balanced Pólya urns

It is well known that in a small Pólya urn, i.e., an urn where second largest real part of an eigenvalue is at most half the largest eigenvalue, the distribution of the numbers of balls of different colours in the urn is asymptotically normal under weak additional conditions. We consider the balanced case, and then give asymptotics of the mean and the covariance matrix, showing that after appropriate normalization, the mean and covariance matrix converge to the mean and variance of the limiting normal distribution.

preprint2019arXiv

Preferential attachment without vertex growth: emergence of the giant component

We study the following preferential attachment variant of the classical Erdos-Renyi random graph process. Starting with an empty graph on n vertices, new edges are added one-by-one, and each time an edge is chosen with probability roughly proportional to the product of the current degrees of its endpoints (note that the vertex set is fixed). We determine the asymptotic size of the giant component in the supercritical phase, confirming a conjecture of Pittel from 2010. Our proof uses a simple method: we condition on the vertex degrees (of a multigraph variant), and use known results for the configuration model.

preprint2018arXiv

Inversions in split trees and conditional Galton--Watson trees

We study $I(T)$, the number of inversions in a tree $T$ with its vertices labeled uniformly at random, which is a generalization of inversions in permutations. We first show that the cumulants of $I(T)$ have explicit formulas involving the $k$-total common ancestors of $T$ (an extension of the total path length). Then we consider $X_n$, the normalized version of $I(T_n)$, for a sequence of trees $T_n$. For fixed $T_{n}$&#39;s, we prove a sufficient condition for $X_n$ to converge in distribution. As an application, we identify the limit of $X_n$ for complete $b$-ary trees. For $T_n$ being split trees, we show that $X_n$ converges to the unique solution of a distributional equation. Finally, when $T_n$&#39;s are conditional Galton--Watson trees, we show that $X_n$ converges to a random variable defined in terms of Brownian excursions. By exploiting the connection between inversions and the total path length, we are able to give results that are stronger and much broader compared to previous work by Panholzer and Seitz.

preprint2018arXiv

Preferential Attachment When Stable

We study an urn process with two urns, initialized with a ball each. Balls are added sequentially, the urn being chosen independently with probability proportional to the $α^{th}$ power $(α>1)$ of the existing number of balls. We study the (rare) event that the urn compositions are balanced after the addition of $2n-2$ new balls. We derive precise asymptotics of the probability of this event by embedding the process in continuous time. Quite surprisingly, a fine control on this probability may be leveraged to derive a lower tail Large Deviation Principle (LDP) for $L = \sum_{i=1}^{n} \frac{S_i^2}{i^2}$, where $\{S_n : n \geq 0\}$ is a simple symmetric random walk started at zero. We provide an alternate proof of the LDP via coupling to Brownian motion, and subsequent derivation of the LDP for a continuous time analogue of $L$. Finally, we turn our attention back to the urn process conditioned to be balanced, and provide a functional limit law describing the trajectory of the urn process.