Researcher profile

Sungyeon Yang

Sungyeon Yang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified
Verification L1
Unclaimed author
2 works
0 followers
6 topics
4 close collaborators

Published work

2 published items

Preprint, 2026, arXiv

ZAYA1-8B Technical Report

We present ZAYA1-8B, a reasoning-focused mixture-of-experts (MoE) model with 700M active and 8B total parameters, built on Zyphra's MoE++ architecture. ZAYA1-8B's core pretraining, midtraining, and supervised fine-tuning (SFT) were performed on a full-stack AMD compute, networking, and software platform. With under 1B active parameters, ZAYA1-8B matches or exceeds DeepSeek-R1-0528 on several challenging mathematics and coding benchmarks, and remains competitive with substantially larger open-weight reasoning models. ZAYA1-8B was trained from scratch for reasoning, with reasoning data included from pretraining onward using an answer-preserving trimming scheme. Post-training uses a four-stage RL cascade: reasoning warmup on math and puzzles; a 400-task RLVE-Gym curriculum; math and code RL with test-time compute traces and synthetic code environments built from competitive-programming references; and behavioral RL for chat and instruction following. We also introduce Markovian RSA, a test-time compute method that recursively aggregates parallel reasoning traces while carrying forward only bounded-length reasoning tails between rounds. In TTC evaluation, Markovian RSA raises ZAYA1-8B to 91.9% on AIME'25 and 89.6% on HMMT'25 while carrying forward only a 4K-token tail, narrowing the gap to much larger reasoning models including Gemini-2.5 Pro, DeepSeek-V3.2, and GPT-5-High.
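The abstract describes Markovian RSA only at a high level: parallel traces are aggregated over several rounds, and only a bounded-length tail of each trace is carried into the next round. The toy sketch below illustrates that general shape and is not the report's implementation. The function names, the `population`, `subset`, `rounds`, and `tail_tokens` parameters, the random subset selection, and the `toy_generate` stand-in for a model call are all assumptions made for illustration.

```python
import random
from typing import Callable, List

# Hypothetical type for a model call: (prompt, carried context) -> new trace.
Generate = Callable[[str, List[str]], List[str]]


def tail(trace: List[str], max_tokens: int) -> List[str]:
    """Keep only the last max_tokens tokens of a trace."""
    return trace[-max_tokens:] if max_tokens > 0 else []


def markovian_aggregate(
    generate: Generate,
    prompt: str,
    population: int = 4,
    subset: int = 2,
    rounds: int = 3,
    tail_tokens: int = 8,
    seed: int = 0,
) -> List[List[str]]:
    """Toy recursive aggregation with bounded carried state (illustrative only)."""
    rng = random.Random(seed)
    # Round 0: independent parallel traces with no carried context.
    traces = [generate(prompt, []) for _ in range(population)]
    for _ in range(rounds):
        # Only bounded tails survive between rounds, so the carried
        # context never grows with the number of rounds.
        tails = [tail(t, tail_tokens) for t in traces]
        next_traces = []
        for _ in range(population):
            chosen = rng.sample(tails, subset)
            context = [tok for t in chosen for tok in t]
            next_traces.append(generate(prompt, context))
        traces = next_traces
    return traces


def toy_generate(prompt: str, context: List[str]) -> List[str]:
    """Hypothetical stand-in for a model: echoes context, then adds 10 tokens."""
    return context + [f"{prompt}-tok{i}" for i in range(10)]


if __name__ == "__main__":
    final = markovian_aggregate(toy_generate, "q")
    print(len(final), "traces; carried context is at most 2 * 8 tokens per call")
```

In this sketch the context passed to each call is capped at `subset * tail_tokens` tokens regardless of how many rounds run, which is the bounded-state property the abstract attributes to the method.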

Preprint, 2022, arXiv

de Sitter Microstates from $T\bar T+Λ_2$ and the Hawking-Page Transition

We obtain microstates accounting for the Gibbons-Hawking entropy in $dS_3$, along with a subleading logarithmic correction, from the solvable $T\bar T+Λ_2$ deformation of a seed CFT with sparse light spectrum. The microstates arise as the dressed CFT states near dimension $Δ=c/6$, associated with the Hawking-Page transition; they dominate the real spectrum of the deformed theory. We exhibit an analogue of the Hawking-Page transition in de Sitter. Appropriate generalizations of the $T\bar T+Λ_2$ deformation are required to treat model-dependent local bulk physics (subleading at large central charge) and higher dimensions. These results add considerably to the already strong motivation for the continued pursuit of such generalizations along with a more complete characterization of $T\bar T$ type theories, building from existing results in these directions.
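As a rough consistency check on the statement that states near $Δ=c/6$ account for the Gibbons-Hawking entropy, the $dS_3$ horizon entropy can be compared with a Cardy-type count. The abstract does not spell out these steps, so the following is a sketch under stated assumptions: a Brown-Henneaux-type relation $c = 3\ell/2G$ between the central charge and the de Sitter radius $\ell$, equal left and right weights $h=\bar h=Δ/2$, and validity of the Cardy formula down to $Δ=c/6$ for a seed CFT with sparse light spectrum. The dressing of the states by the deformation and the logarithmic correction are not tracked here.

```latex
\begin{align}
S_{\rm GH} &= \frac{2\pi \ell}{4G} = \frac{\pi \ell}{2G} = \frac{\pi c}{3},
  \qquad c = \frac{3\ell}{2G}, \\
S_{\rm Cardy} &= 2\pi\sqrt{\frac{c}{6}\left(h - \frac{c}{24}\right)}
  + 2\pi\sqrt{\frac{c}{6}\left(\bar h - \frac{c}{24}\right)}, \\
S_{\rm Cardy}\Big|_{h=\bar h=c/12} &= 4\pi\sqrt{\frac{c}{6}\cdot\frac{c}{24}}
  = \frac{\pi c}{3}.
\end{align}
```

Under these assumptions the two leading-order entropies agree at $Δ = h+\bar h = c/6$.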