Researcher profile

Allan Sly

Allan Sly contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
11works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

11 published item(s)

preprint2026arXiv

A Hierarchical Language Model with Predictable Scaling Laws and Provable Benefits of Reasoning

We introduce a family of synthetic languages with hierarchical structure -- generated by a broadcast process on trees -- for which the role of context length and reasoning in autoregressive generation can be analyzed precisely. At the heart of our analytic approach is an \emph{exact $k$-gram ansatz} in place of transformers with context length $k$, a substitution we then validate empirically. Using this ansatz we derive explicit asymptotic predictions for distributional statistics of the sequences produced by a trained model, instantiated in two settings. For the \emph{Ising broadcast process} (a soft-constrained language), we prove that the variance of the generated sum scales log-linearly in the context depth and its kurtosis converges to that of a Gaussian -- both deviating from the true language for any sublinear context. For the \emph{coloring broadcast process} (a hard-constrained language) in the freezing regime, bounded-context autoregression produces sequences that, with high probability, are inconsistent with \emph{any} valid coloring of the underlying tree. Together these results imply an $Ω(n)$ lower bound on the context length required to faithfully sample length-$n$ sequences. In contrast, we prove that an autoregressive \emph{reasoning} model with only $Θ(\log n)$ working memory can sample exactly from the true language -- an exponential improvement. We confirm both the lower-bound predictions and the reasoning-based upper bound empirically with transformers trained on the synthetic language; the trained models track our asymptotic predictions quantitatively across a wide range of context sizes.

preprint2026arXiv

Mixing times for the TASEP on the circle

We study mixing times for the totally asymmetric simple exclusion process (TASEP) on a circle of length $N$ with $k$ particles. We show that the mixing time is of order $N^2 \min(k,N-k)^{-1/2}$, and that the cutoff phenomenon does not occur. This confirms behavior which was separately predicted by Jara, Lacoin and Peres, and it is more broadly believed to hold for integrable models in the KPZ-universalty class. Our arguments rely on a connection to periodic last passage percolation with a detailed analysis of flat geodesics, as well as a novel random extension and time shift argument for last passage percolation.

preprint2022arXiv

Ising model on trees and factors of IID

We study the ferromagnetic Ising model on the infinite $d$-regular tree under the free boundary condition. This model is known to be a factor of IID in the uniqueness regime, when the inverse temperature $β\ge 0$ satisfies $\tanh β\le (d-1)^{-1}$. However, in the reconstruction regime ($\tanh β> (d-1)^{-\frac{1}{2}}$), it is not a factor of IID. We construct a factor of IID for the Ising model beyond the uniqueness regime via a strong solution to an infinite dimensional stochastic differential equation which partially answers a question of Lyons. The solution $\{X_t(v) \}$ of the SDE is distributed as \[ X_t(v) = tτ_v + B_t(v), \] where $\{τ_v \}$ is an Ising sample and $\{B_t(v) \}$ are independent Brownian motions indexed by the vertices in the tree. Our construction holds whenever $\tanh β\le c(d-1)^{-\frac{1}{2}}$, where $c>0$ is an absolute constant.

preprint2022arXiv

The SIR model in a moving population: propagation of infection and herd immunity

In a collection of particles performing independent random walks on $\mathbb Z^d$ we study the spread of an infection with SIR dynamics. Susceptible particles become infected when they meet an infected particle. Infected particles heal and are removed at rate $ν$. We show that when $ν$ is small, with positive probability the infection survives forever and grows linearly. Furthermore, after the infection reaches a region, it quickly passes through and leaves behind a $\textit{herd immunity}$ regime consisting of recovered particles, a small positive density of susceptible particles, and no infected particles. One notable feature of this model is the simultaneously existence of supercritical and subcritical phases on either side of an infection front of $O(1)$ width.

preprint2021arXiv

Nonexistence of Bigeodesics in Integrable Models of Last Passage Percolation

Bi-infinite geodesics are fundamental objects of interest in planar first passage percolation. A longstanding conjecture states that under mild conditions there are almost surely no bigeodesics, however the result has not been proved in any case. For the exactly solvable model of directed last passage percolation on $\mathbb{Z}^2$ with i.i.d. exponential passage times, we study the corresponding question and show that almost surely the only bigeodesics are the trivial ones, i.e., the horizontal and vertical lines. The proof makes use of estimates for last passage time available from the integrable probability literature to study coalescence structure of finite geodesics, thereby making rigorous a heuristic argument due to Newman.

preprint2020arXiv

Consistency Thresholds for the Planted Bisection Model

The planted bisection model is a random graph model in which the nodes are divided into two equal-sized communities and then edges are added randomly in a way that depends on the community membership. We establish necessary and sufficient conditions for the asymptotic recoverability of the planted bisection in this model. When the bisection is asymptotically recoverable, we give an efficient algorithm that successfully recovers it. We also show that the planted bisection is recoverable asymptotically if and only if with high probability every node belongs to the same community as the majority of its neighbors. Our algorithm for finding the planted bisection runs in time almost linear in the number of edges. It has three stages: spectral clustering to compute an initial guess, a "replica" stage to get almost every vertex correct, and then some simple local moves to finish the job. An independent work by Abbe, Bandeira, and Hall establishes similar (slightly weaker) results but only in the case of logarithmic average degree.

preprint2020arXiv

Learning Sparse Graphons and the Generalized Kesten-Stigum Threshold

The problem of learning graphons has attracted considerable attention across several scientific communities, with significant progress over the recent years in sparser regimes. Yet, the current techniques still require diverging degrees in order to succeed with efficient algorithms in the challenging cases where the local structure of the graph is homogeneous. This paper provides an efficient algorithm to learn graphons in the constant expected degree regime. The algorithm is shown to succeed in estimating the rank-$k$ projection of a graphon in the $L_2$ metric if the top $k$ eigenvalues of the graphon satisfy a generalized Kesten-Stigum condition.

preprint2020arXiv

Survival and extinction of epidemics on random graphs with general degrees

In this paper, we establish the necessary and sufficient criterion for the contact process on Galton-Watson trees (resp. random graphs) to exhibit the phase of extinction (resp. short survival). We prove that the survival threshold $λ_1$ for a Galton-Watson tree is strictly positive if and only if its offspring distribution $ξ$ has an exponential tail, i.e., $\mathbb{E} e^{cξ}<\infty$ for some $c>0$, settling a conjecture by Huang and Durrett [12]. On the random graph with degree distribution $μ$, we show that if $μ$ has an exponential tail, then for small enough $λ$ the contact process with the all-infected initial condition survives for $n^{1+o(1)}$-time w.h.p. (short survival), while for large enough $λ$ it runs over $e^{Θ(n)}$-time w.h.p. (long survival). When $μ$ is subexponential, we prove that the contact process w.h.p. displays long survival for any fixed $λ>0$.

preprint2020arXiv

The critical one-dimensional multi-particle DLA

We study one-dimensional multi-particle Diffusion Limited Aggregation (MDLA) at its critical density $λ=1$. Previous works have verified that the size of the aggregate $X_t$ at time $t$ is $t^{1/2}$ in the subcritical regime and linear in the supercritical regime. This paper establishes the conjecture that the growth rate at criticiality is $t^{2/3}$. Moreover, we derive the scaling limit proving that $$\big\{ t^{-2/3}X_{st} \big\}_{s\geq 0} \overset{d}{\rightarrow} \Big\{ \int_0^s Z_u du \Big\}_{s\geq 0}, $$ where the speed process $\{Z_t\}$ is a $(-\frac{1}{3})$-self-similar diffusion given by $Z_t = (3V_t)^{-2/3}$, where $V_t$ is the $\frac{8}{3}$-Bessel process. The proof shows that locally the speed process can be well approximated by a stochastic integral representation which itself can be approximated by a critical branching process with continuous edge lengths. From these representations, we determine its infinitesimal drift and variance to show that the speed asymptotically satisfies the SDE $dZ_t = 2Z_t^{5/2}dB_t$. To make these approximations, regularity properties of the process are established inductively via a multiscale argument.

preprint2019arXiv

Social learning equilibria

We consider a large class of social learning models in which a group of agents face uncertainty regarding a state of the world, share the same utility function, observe private signals, and interact in a general dynamic setting. We introduce Social Learning Equilibria, a static equilibrium concept that abstracts away from the details of the given extensive form, but nevertheless captures the corresponding asymptotic equilibrium behavior. We establish general conditions for agreement, herding, and information aggregation in equilibrium, highlighting a connection between agreement and information aggregation.