Researcher profile

Jun Dai

Jun Dai contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
7 works
0 followers
7 topics
4 close collaborators

Published work

7 published item(s)

preprint | 2026 | arXiv

Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding

Speculative decoding (SD) accelerates large language model inference by leveraging a draft-then-verify paradigm. To maximize the acceptance rate, recent methods construct expansive draft trees, which unfortunately incur severe VRAM bandwidth and computational overheads that bottleneck end-to-end speedups. While dynamic-depth pruning can reduce this latency by removing marginal branches, it also discards potentially valid candidates, preventing the acceptance rate from reaching the upper bound of dense trees. In this paper, we identify a critical opportunity in resource allocation: the transition from dense to pruned drafting frees up significant computational budget. To break this Pareto tradeoff, we introduce Graft, a compensation framework that couples pruning and retrieval as mutually reinforcing operations. Pruning supplies sufficient budget for retrieval, while retrieval compensates for pruning-induced coverage loss and recovers accepted length. By employing a sequential "prune-then-graft" mechanism, Graft attaches highly predictive retrieved tokens into positions opened by pruning, filling the topological gaps with near-zero overhead. Graft is entirely training-free and lossless. Comprehensive evaluations show that Graft establishes a new Pareto frontier across practical deployment settings, including short-context generation, long-context generation, and large-scale models. On short-context benchmarks, it achieves up to 5.41$\times$ speedup and improves average speedup over EAGLE-3 by up to 21.8% on the large-scale Qwen3-235B. We also provide a preliminary exploration of applying Graft to the DFlash-style block drafting paradigm, offering initial evidence and insights for extending grafting beyond autoregressive draft trees.
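
The draft-then-verify paradigm named in this abstract can be illustrated with a minimal sketch. This is not Graft and not the authors' code: it is a single-chain, greedy toy in which the "models" are plain Python functions, and every name in it (`speculative_step`, `toy_target`, `toy_draft`) is hypothetical. It only shows why verification keeps the output identical to what the target model alone would produce.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]


def speculative_step(prefix: List[Token], draft_next: NextToken,
                     target_next: NextToken, k: int) -> List[Token]:
    """One greedy draft-then-verify step (toy illustration only)."""
    # Draft phase: the cheap model proposes k tokens autoregressively.
    draft: List[Token] = []
    for _ in range(k):
        draft.append(draft_next(prefix + draft))

    # Verify phase: the target checks each drafted position in order.
    # A real system scores all positions in one batched forward pass;
    # this loop is the sequential equivalent.
    accepted: List[Token] = []
    for tok in draft:
        expected = target_next(prefix + accepted)
        if tok != expected:
            # First mismatch: keep the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # Every drafted token matched, so the target adds one extra token.
    accepted.append(target_next(prefix + accepted))
    return accepted


def toy_target(ctx: List[Token]) -> Token:
    # Hypothetical target model: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    # Hypothetical drafter: agrees with the target except after token 4.
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_step([1], toy_draft, toy_target, k=4))
```

In this toy run the drafter's first three proposals are accepted and the fourth is replaced by the target's own token, so one step emits four tokens that match plain greedy decoding with the target.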

preprint | 2026 | arXiv

When Hidden States Drift: Can KV Caches Rescue Long-Range Speculative Decoding?

Speculative decoding accelerates LLM inference, but SOTA hidden-state-based drafters suffer from long-range decay: draft accuracy degrades as the speculative step increases. Existing work attributes this decay to train-inference mismatch and proposes test-time training (TTT) as a remedy, yet we observe that long-range decay persists even in TTT-trained drafters. We revisit long-range decay from the perspective of context information preservation. In hidden-state reuse, we argue the target hidden state acts as a biased context compression: it aggregates historical token information according to the attention query at the current position, yielding a compact representation optimized for immediate next-token prediction. This compression can suppress information less relevant to the current query but important for later speculative steps. In contrast, the target model's KV cache serves as an explicit context, retaining the complete set of token-wise KV representations. We therefore posit the KV-Reuse Hypothesis: allowing the draft model to reuse the target KV cache can provide richer signals for long-horizon drafting. To test this hypothesis, we introduce KVShot, a diagnostic framework that compares three reuse paradigms: hidden-only, KV-only, and hybrid. Extensive evaluations on Qwen3-8B show that KV-Reuse improves long-range acceptance, although end-to-end speedups remain marginal under current training pipelines. Our analysis identifies two key structural bottlenecks: shallow drafters struggle to estimate target queries accurately, and draft-side KV projections receive sparse gradient signals. These findings suggest that realizing the full potential of KV-aware decoding requires moving beyond TTT toward block-wise training paradigms. By exposing these bottlenecks, KVShot provides a foundational diagnostic testbed and a clear roadmap for designing next-generation inference architectures.

preprint | 2022 | arXiv

A partial filament eruption in three steps induced by external magnetic reconnection

We present an investigation of a partial filament eruption on 2012 June 17 in active region NOAA 11504. For the first time, we observed the vertical splitting process during the partial eruption with high-resolution narrow-band images at 10830 Å. The active filament was rooted in a small sunspot of the active region. In particular, it underwent the partial eruption in three steps, i.e., the precursor, the first eruption, and the second eruption, the latter two of which were associated with a C1.0 flare and a C3.9 flare, respectively. During the precursor, slow magnetic reconnection took place between the filament and the adjoining loops that were also rooted in the sunspot. The continuous reconnection not only caused the filament to split vertically into three groups of threads but also formed a new filament, which kept growing while brightening took place around the site. Subsequently, the growing filament erupted together with one group of split threads, resulting in the first eruption. At the beginning of the first eruption, a subsequent magnetic reconnection occurred between the erupting split threads and another ambient magnetic loop. After about three minutes, the second eruption occurred as a result of the eruption of two larger unstable filaments induced by the magnetic reconnection. The high-resolution observation provides direct evidence that magnetic reconnection between a filament and its ambient magnetic fields can induce the vertical splitting of the filament, resulting in a partial eruption.

preprint | 2022 | arXiv

Statistical analysis of circular-ribbon flares

Circular-ribbon flares (CFs) are a special type of solar flare owing to their particular magnetic topology. In this paper, we conducted a comprehensive statistical analysis of 134 CFs from 2011 September to 2017 June, including four B-class, 82 C-class, 40 M-class, and eight X-class flares. The flares were observed by the Atmospheric Imaging Assembly (AIA) on board the Solar Dynamics Observatory (SDO) spacecraft. The physical properties of CFs are derived, including the location, area ($A_{CF}$), equivalent radius ($r_{CF}$) assuming a semi-spherical fan dome, lifetime ($τ_{CF}$), and peak SXR flux in 1$-$8 Å. It is found that all CFs are located in active regions, with latitudes between -30$^\circ$ and 30$^\circ$. The distributions of areas and lifetimes could be fitted with a log-normal function. There is a positive correlation between the lifetime and area. The peak SXR flux in 1$-$8 Å is well in accord with a power-law distribution with an index of $-$1.42. Of the 134 CFs, 57% are accompanied by remote brightenings or ribbons. A positive correlation exists between the total length ($L_{RB}$) and average distance ($D_{RB}$) of remote brightenings. About 47% and 51% of the 134 CFs are related to type III radio bursts and jets, respectively. The association rates are independent of flare energies. About 38% of CFs are related to mini-filament eruptions, and the association rates increase with flare class. Only 28% of CFs are related to CMEs, meaning that a majority of them are confined rather than eruptive events. There is a positive correlation between the CME speed and peak SXR flux in 1$-$8 Å, and faster CMEs tend to be wider.

preprint | 2022 | arXiv

Sunspot shearing and sudden retraction motion associated with the 2013 August 17 M3.3 Flare

In this Letter, we give a detailed analysis of the M3.3 class flare that occurred on August 17, 2013 (SOL2013-08-17T18:16). It presents a clear picture of mutual magnetic interaction, initially from the photosphere to the corona via the abrupt rapid shearing motion of a small sunspot before the flare, and then suddenly from the corona back to the photosphere via the sudden retraction motion of the same sunspot during the flare impulsive phase. About 10 hours before the flare, a small sunspot in the active region NOAA 11818 started to move northeast along a magnetic polarity inversion line (PIL), creating a shearing motion that changed the quasi-static state of the active region. A filament right above the PIL was activated following the movement of the sunspot and then partially erupted. The eruption eventually led to the M3.3 flare. The sunspot was then suddenly pulled back in the opposite direction upon the flare onset. During the backward motion, the Lorentz force underwent a simultaneous impulsive change in both magnitude and direction. Its directional change is found to be consistent with the retraction motion. The observation provides direct evidence for the role of the shearing motion of the sunspot in powering and triggering the flare. It especially confirms that the abrupt motion of a sunspot during a solar flare is the result of a back reaction caused by the reconfiguration of the coronal magnetic field.

preprint | 2020 | arXiv

The Formation and Eruption of A Sigmoidal Filament Driven by Rotating Network Magnetic Fields

We present the formation and eruption of a sigmoidal filament driven by rotating network magnetic fields (RNFs) near the center of the solar disk, which was observed by the one-meter aperture New Vacuum Solar Telescope (NVST) at Fuxian Solar Observatory (FSO) on 2018 July 12. Counterclockwise RNFs twist two small-scale filaments at their northeastern foot-point region, giving a rotation of nearly 200 degrees within about 140 minutes. The motion of the RNFs tends to accelerate at first and then clearly decelerate, as the average rotation speed increased from 10 to 150, and then slowed down to 50. Coalescence then occurs between filaments F1 and F2. Meanwhile, the fine structures in the southwestern region of the filament were involved in another coalescence interaction. The subsequent EUV brightening due to plasma heating is observed in the two interaction regions. These interacting structures, including F1, F2 and the fine structures in the southwestern region, eventually evolve into a larger-scale sigmoidal filament twisted in the same direction as the RNFs. The twist of the sigmoidal filament exceeded 4π and the filament finally erupted. The motion of the sigmoidal filament remains uniform until a nearby jet collides with it, causing the filament to erupt faster. These results provide evidence that RNFs play an important role in the formation and eruption of the sigmoidal filament. The phenomena also suggest that the kink instability is the trigger mechanism for the filament eruption.

preprint | 2012 | arXiv

A First-principles Prediction of Two-Dimensional Superconductivity in Pristine B2C Single layer

Based on first-principles lattice dynamics and electron-phonon coupling calculations, the B2C sheet is predicted to be a two-dimensional (2D) phonon-mediated superconductor with a relatively high transition temperature (Tc). The calculated electron-phonon coupling parameter is 0.92, and it is mainly contributed by low-frequency out-of-plane phonon modes and electronic states with a π character. When the Coulomb pseudopotential is set to 0.10, the estimated Tc is 19.2 K. To the best of our knowledge, B2C is the first pristine 2D superconductor with a Tc higher than the boiling point of liquid helium.
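
To see how a coupling parameter and a Coulomb pseudopotential turn into a Tc estimate, the sketch below uses the Allen-Dynes-modified McMillan formula, Tc = (ω_log / 1.2) · exp[-1.04(1 + λ) / (λ - μ*(1 + 0.62λ))]. The abstract does not say which formula the authors used and does not quote a logarithmic average phonon frequency ω_log, so both the choice of formula and the function names are assumptions made here for illustration. Instead of inventing ω_log, the code backs out the value that would reproduce the quoted 19.2 K.

```python
import math


def allen_dynes_tc(omega_log_k: float, lam: float, mu_star: float) -> float:
    """Allen-Dynes-modified McMillan estimate of Tc in kelvin.

    omega_log_k: logarithmic average phonon frequency in kelvin
                 (not given in the abstract; supplied by the caller).
    lam:         electron-phonon coupling parameter (0.92 in the abstract).
    mu_star:     Coulomb pseudopotential (0.10 in the abstract).
    """
    denominator = lam - mu_star * (1.0 + 0.62 * lam)
    if denominator <= 0:
        raise ValueError("formula is not applicable for these parameters")
    return omega_log_k / 1.2 * math.exp(-1.04 * (1.0 + lam) / denominator)


def implied_omega_log(tc_k: float, lam: float, mu_star: float) -> float:
    """omega_log (kelvin) that would reproduce tc_k under this formula."""
    # Tc is linear in omega_log, so divide by the Tc obtained at 1 K.
    return tc_k / allen_dynes_tc(1.0, lam, mu_star)


if __name__ == "__main__":
    omega = implied_omega_log(19.2, lam=0.92, mu_star=0.10)
    print(f"implied omega_log: {omega:.1f} K")
    print(f"round-trip Tc:     {allen_dynes_tc(omega, 0.92, 0.10):.1f} K")
```

Under these assumptions the implied ω_log comes out a little above 300 K. That figure is a consistency check on the quoted numbers, not a value reported by the paper.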