Researcher profile

Stephen Ebert

Stephen Ebert contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
4 topics
4 close collaborators

Published work

5 published items

Preprint, 2026, arXiv

ZAYA1-8B Technical Report

We present ZAYA1-8B, a reasoning-focused mixture-of-experts (MoE) model with 700M active and 8B total parameters, built on Zyphra's MoE++ architecture. ZAYA1-8B's core pretraining, midtraining, and supervised fine-tuning (SFT) were performed on a full-stack AMD compute, networking, and software platform. With under 1B active parameters, ZAYA1-8B matches or exceeds DeepSeek-R1-0528 on several challenging mathematics and coding benchmarks, and remains competitive with substantially larger open-weight reasoning models. ZAYA1-8B was trained from scratch for reasoning, with reasoning data included from pretraining onward using an answer-preserving trimming scheme. Post-training uses a four-stage RL cascade: reasoning warmup on math and puzzles; a 400-task RLVE-Gym curriculum; math and code RL with test-time compute traces and synthetic code environments built from competitive-programming references; and behavioral RL for chat and instruction following. We also introduce Markovian RSA, a test-time compute method that recursively aggregates parallel reasoning traces while carrying forward only bounded-length reasoning tails between rounds. In TTC evaluation, Markovian RSA raises ZAYA1-8B to 91.9% on AIME'25 and 89.6% on HMMT'25 while carrying forward only a 4K-token tail, narrowing the gap to much larger reasoning models including Gemini-2.5 Pro, DeepSeek-V3.2, and GPT-5-High.

Preprint, 2022, arXiv

$T \overline{T}$ Deformations of Supersymmetric Quantum Mechanics

We define a manifestly supersymmetric version of the $T \overline{T}$ deformation appropriate for a class of $(0+1)$-dimensional theories with $\mathcal{N} = 1$ or $\mathcal{N} = 2$ supersymmetry, including one presentation of the super-Schwarzian theory which is dual to JT supergravity. These deformations are written in terms of Noether currents associated with translations in superspace, so we refer to them collectively as $f(\mathcal{Q})$ deformations. We provide evidence that the $f(\mathcal{Q})$ deformations of $\mathcal{N} = 1$ and $\mathcal{N} = 2$ theories are on-shell equivalent to the dimensionally reduced supercurrent-squared deformations of $2d$ theories with $\mathcal{N} = (0,1)$ and $\mathcal{N} = (1,1)$ supersymmetry, respectively. In the $\mathcal{N} = 1$ case, we present two forms of the $f(\mathcal{Q})$ deformation which drive the same flow, and clarify their equivalence by studying the analogous equivalent deformations in the non-supersymmetric setting.

Preprint, 2022, arXiv

$T\overline{T}$-deformed free energy of the Airy model

Sharpening the holographic correspondence of Jackiw-Teitelboim (JT) gravity and its dual matrix model description at a finite radial cutoff $\lambda$ through the $T\overline{T}$ deformation is of interest. To proceed, we simplify the problem by considering the Airy model and deform Airy correlators in the same way as in $T\overline{T}$-deformed JT gravity. We use those correlators to compute the annealed and quenched free energies for both $\lambda > 0$ and $\lambda < 0$ from an integral representation of the replica trick. At the leading order in $\lambda$ and low temperatures, we confirm that the genus-zero quenched free energy monotonically decreases as a function of temperature when perturbation theory is valid. We then study the all-genus quenched free energy at low temperatures, where we discover and discuss subtleties due to non-perturbative effects in the Airy model, as well as the contributions from the non-perturbative branch under the $T\overline{T}$ deformation.

Preprint, 2022, arXiv

Field Theory of Interacting Boundary Gravitons

Pure three-dimensional gravity is a renormalizable theory with two free parameters labelled by $G$ and $\Lambda$. As a consequence, correlation functions of the boundary stress tensor in AdS$_3$ are uniquely fixed in terms of one dimensionless parameter, which is the central charge of the Virasoro algebra. The same argument implies that AdS$_3$ gravity at a finite radial cutoff is a renormalizable theory, but now with one additional parameter corresponding to the cutoff location. This theory is conjecturally dual to a $T\overline{T}$-deformed CFT, assuming that such theories actually exist. To elucidate this, we study the quantum theory of boundary gravitons living on a cutoff planar boundary and the associated correlation functions of the boundary stress tensor. We compute stress tensor correlation functions to two-loop order ($G$ being the loop counting parameter), extending existing tree level results. This is made feasible by the fact that the boundary graviton action simplifies greatly upon making a judicious field redefinition, turning into the Nambu-Goto action. After imposing Lorentz invariance, the correlators at this order are found to be unambiguous up to a single undetermined renormalization parameter.

Preprint, 2021, arXiv

Descendants in celestial CFT and emergent multi-collinear factorization

Multi-collinear factorization limits provide a window to study how locality and unitarity of scattering amplitudes can emerge dynamically from celestial CFT, the conjectured holographic dual to gauge and gravitational theories in flat space. To this end, we first use asymptotic symmetries to commence a systematic study of conformal and Kac-Moody descendants in the OPE of celestial gluons. Recursive application of these OPEs then equips us with a novel holographic method of computing the multi-collinear limits of gluon amplitudes. We perform this computation for some of the simplest helicity assignments of the collinear particles. The prediction from the OPE matches with Mellin transforms of the expressions in the literature to all orders in conformal descendants. In a similar vein, we conclude by studying multi-collinear limits of graviton amplitudes in the leading approximation of sequential double-collinear limits, again finding a consistency check against the leading order OPE of celestial gravitons.