Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
19 works
0 followers
20 topics
4 close collaborators

Published work

19 published item(s)

preprint, 2026, arXiv

AdaSpec: Adaptive Speculative Decoding for Fast, SLO-Aware Large Language Model Serving

Cloud-based Large Language Model (LLM) services often face challenges in achieving low inference latency and meeting Service Level Objectives (SLOs) under dynamic request patterns. Speculative decoding, which exploits lightweight models for drafting and LLMs for verification, has emerged as a compelling technique to accelerate LLM inference. However, existing speculative decoding solutions often fail to adapt to fluctuating workloads and dynamic system environments, resulting in impaired performance and SLO violations. In this paper, we introduce AdaSpec, an efficient LLM inference system that dynamically adjusts speculative strategies according to real-time request loads and system configurations. AdaSpec proposes a theoretical model to analyze and predict the efficiency of speculative strategies across diverse scenarios. Additionally, it implements intelligent drafting and verification algorithms to maximize performance while ensuring high SLO attainment. Experimental results on real-world LLM service traces demonstrate that AdaSpec consistently meets SLOs and achieves substantial performance improvements, delivering up to 66% speedup compared to state-of-the-art speculative inference systems. The source code is publicly available at https://github.com/cerebellumking/AdaSpec
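The draft-then-verify loop that speculative decoding relies on can be illustrated with a minimal sketch. This is not AdaSpec's algorithm: the draft length `k` is fixed here, whereas the abstract describes adjusting the strategy to load. The toy models and the function name `speculative_step` are hypothetical, and verification is written as sequential calls, while a real system would typically verify all drafted tokens in one batched pass.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function


def speculative_step(prefix: List[Token], draft: NextToken,
                     target: NextToken, k: int) -> List[Token]:
    """One draft-then-verify round; returns the tokens appended to prefix."""
    # Draft phase: the cheap model proposes k tokens autoregressively.
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # Verify phase: keep drafted tokens while the target model agrees.
    accepted: List[Token] = []
    for tok in proposed:
        expected = target(prefix + accepted)
        if tok != expected:
            # First mismatch: emit the target's token instead and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # All k drafts accepted; the target contributes one extra token.
    accepted.append(target(prefix + accepted))
    return accepted


# Toy models (hypothetical): the target counts upward modulo 10;
# the draft agrees except right after a token equal to 3.
def toy_target(seq: List[Token]) -> Token:
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[Token]) -> Token:
    return 0 if seq[-1] == 3 else (seq[-1] + 1) % 10


out = speculative_step([0], toy_draft, toy_target, k=5)
print(out)  # [1, 2, 3, 4]
```

In this toy run three drafted tokens are accepted and the fourth is replaced by the target's choice, so the output always matches what the target alone would have produced.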

preprint, 2026, arXiv

Advanced Global Wildfire Activity Modeling with Hierarchical Graph ODE

Wildfires, as an integral component of the Earth system, are governed by a complex interplay of atmospheric, oceanic, and terrestrial processes spanning a vast range of spatiotemporal scales. Modeling their global activity on large timescales is therefore a critical yet challenging task. While deep learning has recently achieved significant breakthroughs in global weather forecasting, its potential for global wildfire behavior prediction remains underexplored. In this work, we reframe this problem and introduce the Hierarchical Graph ODE (HiGO), a novel framework designed to learn the multi-scale, continuous-time dynamics of wildfires. Specifically, we represent the Earth system as a multi-level graph hierarchy and propose an adaptive filtering message passing mechanism for both intra- and inter-level information flow, enabling more effective feature extraction and fusion. Furthermore, we incorporate GNN-parameterized Neural ODE modules at multiple levels to explicitly learn the continuous dynamics inherent to each scale. Through extensive experiments on the SeasFire Cube dataset, we demonstrate that HiGO significantly outperforms state-of-the-art baselines on long-range wildfire forecasting. Moreover, its continuous-time predictions exhibit strong observational consistency, highlighting its potential for real-world applications.

preprint, 2026, arXiv

Advanced Long-term Earth System Forecasting

Reliable long-term forecasting of Earth system dynamics is fundamentally limited by instabilities in current artificial intelligence (AI) models during extended autoregressive simulations. These failures often originate from inherent spectral bias, leading to inadequate representation of critical high-frequency, small-scale processes and subsequent uncontrolled error amplification. Inspired by the nested grids in numerical models used to resolve small scales, we present TritonCast. At the core of its design is a dedicated latent dynamical core, which ensures the long-term stability of the macro-evolution at a coarse scale. An outer structure then fuses this stable trend with fine-grained local details. This design effectively mitigates the spectral bias caused by cross-scale interactions. In atmospheric science, it achieves state-of-the-art accuracy on the WeatherBench 2 benchmark while demonstrating exceptional long-term stability: executing year-long autoregressive global forecasts and completing multi-year climate simulations that span the entire available $2500$-day test period without drift. In oceanography, it extends skillful eddy forecast to $120$ days and exhibits unprecedented zero-shot cross-resolution generalization. Ablation studies reveal that this performance stems from the synergistic interplay of the architecture's core components. TritonCast thus offers a promising pathway towards a new generation of trustworthy, AI-driven simulations. This significant advance has the potential to accelerate discovery in climate and Earth system science, enabling more reliable long-term forecasting and deeper insights into complex geophysical dynamics.

preprint, 2026, arXiv

Cavity-QED Simulation of a Maser beyond the Mean-Field Approximation

Based on the well-known Tavis-Cummings (TC) model of cavity quantum electrodynamics (QED), we introduce a method for quantum-mechanically simulating the dynamics of experimental masers beyond the mean-field approximation (MFA) that takes into account the spatial variation of the a.c. magnetic field of the maser's amplified microwave mode across its gain medium. The distribution in the coupling between the amplified mode and the medium's very large number (typically $10^{17}$) of spatially distributed quantum emitters can be determined straightforwardly for a given geometry and composition using an electromagnetic-field solver. Upon discretising this distribution as a histogram over a small finite number of bins, we assign -- as an approximation -- the same coupling to all emitters that fall within the same bin, where the value of this coupling equals the center value of the bin's range. With our approximate Hamiltonian arranged as a weighted sum over these bins, we generate expressions for expectation values of operators in the Heisenberg picture to second order in cumulant expansion, using the publicly available QuantumCumulants.jl package in Julia. For ten evenly spaced bins, our model, which can be run on a laptop computer, is used to simulate the recorded output from an experimental maser with a pentacene-doped para-terphenyl gain medium. We find that it replicates the quantum-mechanical features of the measured maser's dynamics, in particular its damped collective Rabi oscillations, more closely than the standard TC model under the MFA can, with an R$^2$ value of 0.774, as opposed to 0.265. Our model should thus aid the quantitative engineering of improved, optimised maser designs.
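The binning step described above (discretise the coupling distribution into evenly spaced bins and give every emitter in a bin the bin's center value) might look roughly like the following. It is a plain-Python sketch under assumptions: the function name `bin_couplings` and the sample values are hypothetical, and the paper itself works with QuantumCumulants.jl in Julia, which is not reproduced here.

```python
from typing import List, Tuple


def bin_couplings(couplings: List[float], n_bins: int) -> List[Tuple[float, int]]:
    """Return (bin-center coupling, emitter count) for evenly spaced bins."""
    lo, hi = min(couplings), max(couplings)
    if hi == lo:
        raise ValueError("all couplings are equal; binning is not needed")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for g in couplings:
        # The maximum value goes into the last bin instead of overflowing.
        idx = min(int((g - lo) / width), n_bins - 1)
        counts[idx] += 1
    # Every emitter in bin i is assigned the center of that bin's range.
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    return list(zip(centers, counts))


# Hypothetical coupling strengths in arbitrary units.
bins = bin_couplings([0.0, 0.1, 0.2, 0.9, 1.0], n_bins=2)
print(bins)  # [(0.25, 3), (0.75, 2)]
```

Each (center, count) pair would then act as one weighted term in the approximate Hamiltonian, which is what keeps the number of distinct couplings small.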

preprint, 2026, arXiv

Clipped Affine Policy: Low-Complexity Near-Optimal Online Power Control for Energy Harvesting Communications over Fading Channels

This paper investigates online power control for point-to-point energy harvesting communications over wireless fading channels. A linear-policy-based approximation is derived for the relative-value function in the Bellman equation of the power control problem. This approximation leads to two fundamental power control policies: optimistic and robust clipped affine policies, both taking the form of a clipped affine function of the battery level and the reciprocal of channel signal-to-noise ratio coefficient. They are essentially battery-limited weighted directional waterfilling policies operating between adjacent time slots. By leveraging the relative-value approximation and derived policies, a domain-knowledge-enhanced reinforcement learning (RL) algorithm is proposed for online power control. The proposed approach is further extended to scenarios with energy and/or channel lookahead. Comprehensive simulation results demonstrate that the proposed methods achieve a good balance between computational complexity and optimality. In particular, the robust clipped affine policy (combined with RL, using at most five parameters) outperforms all existing approaches across various scenarios, with less than 2% performance loss relative to the optimal policy.
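A clipped affine function of the battery level and the reciprocal of the channel SNR can be sketched as follows. The coefficients `a`, `c`, `d` and the clipping range (zero up to the available battery energy) are assumptions made for illustration; the abstract does not give the paper's parameter values or its exact clipping bounds.

```python
def clipped_affine_power(battery: float, snr: float,
                         a: float, c: float, d: float) -> float:
    """Transmit power as an affine function of battery and 1/snr, then clipped."""
    raw = a * battery + c / snr + d
    # Assumed clipping: power is non-negative and cannot exceed the battery.
    return min(max(raw, 0.0), battery)


# Hypothetical coefficients; a negative c mimics waterfilling, where the
# allocated power falls as the channel gets worse (1/snr grows).
print(clipped_affine_power(4.0, 2.0, a=0.5, c=-1.0, d=0.0))  # 1.5
print(clipped_affine_power(4.0, 0.1, a=0.5, c=-1.0, d=0.0))  # 0.0
print(clipped_affine_power(4.0, 2.0, a=2.0, c=-1.0, d=0.0))  # 4.0
```

The three calls show the unclipped case, clipping at zero on a poor channel, and clipping at the battery level.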

preprint, 2026, arXiv

Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models

In recent years, diffusion models trained on equilibrium molecular distributions have proven effective for sampling biomolecules. Beyond direct sampling, the score of such a model can also be used to derive the forces that act on molecular systems. However, while classical diffusion sampling usually recovers the training distribution, the corresponding energy-based interpretation of the learned score is often inconsistent with this distribution, even for low-dimensional toy systems. We trace this inconsistency to inaccuracies of the learned score at very small diffusion timesteps, where the model must capture the correct evolution of the data distribution. In this regime, diffusion models fail to satisfy the Fokker-Planck equation, which governs the evolution of the score. We interpret this deviation as one source of the observed inconsistencies and propose an energy-based diffusion model with a Fokker-Planck-derived regularization term to enforce consistency. We demonstrate our approach by sampling and simulating multiple biomolecular systems, including fast-folding proteins, and by introducing a state-of-the-art transferable Boltzmann emulator for dipeptides that supports simulation and achieves improved consistency and efficient sampling. Our code, model weights, and self-contained JAX and PyTorch notebooks are available at https://github.com/noegroup/ScoreMD.

preprint, 2026, arXiv

ElasticDiT: Efficient Diffusion Transformers via Elastic Architecture and Sparse Attention for High-Resolution Image Generation on Mobile Devices

The Diffusion Transformer (DiT) architecture is the state-of-the-art paradigm for high-fidelity image generation, underpinning models like Stable Diffusion-3 and FLUX.1. However, deploying these models on resource-constrained mobile devices entails prohibitive computational and memory overhead. While efficiency-driven approaches like Linear-DiT and static pruning alleviate bottlenecks, they often incur quality degradation. Unlike cloud environments, mobile constraints require a single-model paradigm that dynamically balances fidelity and latency. We introduce ElasticDiT, which achieves this dynamic trade-off by adjusting spatial compression ratios and DiT block depths. By integrating Shift Sparse Block Attention (SSBA) and a Tiny DWT-Distilled VAE (T-DVAE), ElasticDiT reduces inference latency and memory footprint while maintaining image quality. Experiments confirm that ElasticDiT effectively covers a wide range of fidelity-latency trade-offs within a single set of parameters. By jointly adjusting compression and depth, a single ElasticDiT model can be reconfigured on-the-fly to outperform task-specific baselines. Specifically, our flex lite variant achieves an HPS of 32.87, surpassing the Flux model, while maintaining competitive quality at 84.16 percent average sparsity through SSBA. Furthermore, the plug-and-play T-DVAE provides SD3-level reconstruction with only 1/8x the computational cost of standard VAEs, and Flow-GRPO boosts semantic alignment (GenEval: 66.93 to 73.62). These results demonstrate that ElasticDiT offers a versatile, hardware-adaptive solution that eliminates the need for multiple specialized models, providing a promising path for future high-resolution image generation on mobile devices.

preprint, 2026, arXiv

FaST: Efficient and Effective Long-Horizon Forecasting for Large-Scale Spatial-Temporal Graphs via Mixture-of-Experts

Spatial-Temporal Graph (STG) forecasting on large-scale networks has garnered significant attention. However, existing models predominantly focus on short-horizon predictions and suffer from notorious computational costs and memory consumption when scaling to long-horizon predictions and large graphs. Targeting the above challenges, we present FaST, an effective and efficient framework based on heterogeneity-aware Mixture-of-Experts (MoEs) for long-horizon and large-scale STG forecasting, which unlocks one-week-ahead (672 steps at a 15-minute granularity) prediction with thousands of nodes. FaST is underpinned by two key innovations. First, an adaptive graph agent attention mechanism is proposed to alleviate the computational burden inherent in conventional graph convolution and self-attention modules when applied to large-scale graphs. Second, we propose a new parallel MoE module that replaces traditional feed-forward networks with Gated Linear Units (GLUs), enabling an efficient and scalable parallel structure. Extensive experiments on real-world datasets demonstrate that FaST not only delivers superior long-horizon predictive accuracy but also achieves remarkable computational efficiency compared to state-of-the-art baselines. Our source code is available at: https://github.com/yijizhao/FaST.

preprint, 2026, arXiv

Harmonic-Recycling Rectification Based on Novel Compact Dual-Band Resonator

Harmonic generation during radio frequency (RF)-dc conversion causes performance degradation of a microwave rectifying circuit. To suppress and recycle the harmonic power, this letter proposes a novel compact dual-band resonator (DBR) based on a microstrip coupled transmission line. It presents open-circuits at the second and third harmonic frequencies, which effectively block the higher order harmonic for power recycling. The conventional input cascading filters for harmonic rejection can be eliminated, simplifying the circuit topology and reducing loss. Theoretical analyses were carried out and corresponding equations were formulated for the proposed DBR. For validation, two rectifying circuits with/without the DBR operating at 2.2 GHz were fabricated and tested. Using the proposed DBR at 10 dBm RF power, the suppression of the second and third harmonic powers is enhanced by 18.4 and 7.6 dB, respectively. Besides, an improvement of RF-dc power conversion efficiency (PCE) was observed; specifically, PCE reached 73.2% at 10 dBm compared to 71.6% obtained from an equivalent rectifier.

preprint, 2026, arXiv

Long-time behavior of a nonlocal Cahn-Hilliard equation with nonlocal dynamic boundary condition and singular potentials

We investigate the long-time behavior of a nonlocal Cahn-Hilliard equation in a bounded domain $Ω\subset\mathbb{R}^d$ $(d\in\{2,3\})$, subject to a kinetic rate-dependent nonlocal dynamic boundary condition. The kinetic rate $1/L$, with $L\in[0,+\infty)$, distinguishes different types of bulk-surface interactions. For general singular potentials, including the physically relevant logarithmic potential, we establish the existence of a global attractor $\mathcal{A}_m^L$ in a suitable complete metric space for any $L\in[0,+\infty)$. Moreover, we verify that the global attractor $\mathcal{A}_m^0$ is stable with respect to perturbations $\mathcal{A}_m^L$ for small $L>0$. When $L\in(0,+\infty)$, based on the strict separation property of global weak solutions, we further prove the existence of exponential attractors via a short-trajectory type technique, which also implies that the global attractor has finite fractal dimension. Finally, for this case, we show that every global weak solution converges to a single equilibrium in $\mathcal{L}^\infty$ as time goes to infinity, using a generalized Łojasiewicz-Simon inequality and an Alikakos-Moser type iteration.

preprint, 2026, arXiv

MiMo-V2-Flash Technical Report

We present MiMo-V2-Flash, a Mixture-of-Experts (MoE) model with 309B total parameters and 15B active parameters, designed for fast, strong reasoning and agentic capabilities. MiMo-V2-Flash adopts a hybrid attention architecture that interleaves Sliding Window Attention (SWA) with global attention, with a 128-token sliding window under a 5:1 hybrid ratio. The model is pre-trained on 27 trillion tokens with Multi-Token Prediction (MTP), employing a native 32k context length and subsequently extended to 256k. To efficiently scale post-training compute, MiMo-V2-Flash introduces a novel Multi-Teacher On-Policy Distillation (MOPD) paradigm. In this framework, domain-specialized teachers (e.g., trained via large-scale reinforcement learning) provide dense and token-level reward, enabling the student model to perfectly master teacher expertise. MiMo-V2-Flash rivals top-tier open-weight models such as DeepSeek-V3.2 and Kimi-K2, despite using only 1/2 and 1/3 of their total parameters, respectively. During inference, by repurposing MTP as a draft model for speculative decoding, MiMo-V2-Flash achieves up to 3.6 acceptance length and 2.6x decoding speedup with three MTP layers. We open-source both the model weights and the three-layer MTP weights to foster open research and community collaboration.
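To make the hybrid attention layout concrete, here is a small sketch of a 5:1 layer schedule and a causal sliding-window visibility rule. The ordering of layers (five sliding-window layers followed by one global layer, repeated) and the convention that the 128-token window includes the query position itself are assumptions; the abstract states only the ratio and the window size. The function names are hypothetical.

```python
from typing import List


def layer_schedule(n_layers: int, swa_per_global: int = 5) -> List[str]:
    """Assumed interleaving: every (swa_per_global + 1)-th layer is global."""
    period = swa_per_global + 1
    return ["global" if (i + 1) % period == 0 else "swa" for i in range(n_layers)]


def can_attend(query_pos: int, key_pos: int, kind: str, window: int = 128) -> bool:
    """Causal visibility of key_pos from query_pos for one layer kind."""
    if key_pos > query_pos:
        return False  # causal: never look at future tokens
    if kind == "global":
        return True
    # Sliding window: only the most recent `window` positions are visible.
    return query_pos - key_pos < window


schedule = layer_schedule(12)
print(schedule.count("swa"), schedule.count("global"))  # 10 2
print(can_attend(200, 73, "swa"), can_attend(200, 72, "swa"))  # True False
```

Under these assumptions a sliding-window layer only needs to keep the last 128 keys and values per sequence, while the less frequent global layers retain the full context.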

preprint, 2026, arXiv

NeuralOM: Neural Ocean Model for Subseasonal-to-Seasonal Simulation

Long-term, high-fidelity simulation of slow-changing physical systems, such as the ocean and climate, presents a fundamental challenge in scientific computing. Traditional autoregressive machine learning models often fail in these tasks as minor errors accumulate and lead to rapid forecast degradation. To address this problem, we propose NeuralOM, a general neural operator framework designed for simulating complex, slow-changing dynamics. NeuralOM's core consists of two key innovations: (1) a Progressive Residual Correction Framework that decomposes the forecasting task into a series of fine-grained refinement steps, effectively suppressing long-term error accumulation; and (2) a Physics-Guided Graph Network whose built-in adaptive messaging mechanism explicitly models multi-scale physical interactions, such as gradient-driven flows and multiplicative couplings, thereby enhancing physical consistency while maintaining computational efficiency. We validate NeuralOM on the challenging task of global Subseasonal-to-Seasonal (S2S) ocean simulation. Extensive experiments demonstrate that NeuralOM not only surpasses state-of-the-art models in forecast accuracy and long-term stability, but also excels in simulating extreme events. For instance, at a 60-day lead time, NeuralOM achieves a 13.3% lower RMSE compared to the best-performing baseline, offering a stable, efficient, and physically-aware paradigm for data-driven scientific computing. Code link: https://github.com/YuanGao-YG/NeuralOM.

preprint, 2026, arXiv

Optimal Control of a Navier-Stokes-Cahn-Hilliard System for Membrane-fluid Interaction

We consider an optimal control problem for a two-dimensional Navier-Stokes-Cahn-Hilliard system arising in the modeling of fluid-membrane interaction. The fluid dynamics is governed by the incompressible Navier-Stokes equations, which are nonlinearly coupled with a sixth-order Cahn-Hilliard type equation representing the deformation of a flexible membrane through a phase-field variable. Building on the previously established existence and uniqueness of global strong solutions for the coupled system, we introduce an external forcing term acting on the fluid as the control variable. Then we seek to minimize a tracking-type cost functional, demonstrating the existence of an optimal control and deriving the associated first-order necessary optimality conditions. A key issue is to establish sufficient regularity for solutions of the adjoint system, which is crucial for the rigorous derivation of optimality conditions in the fluid dynamic setting.

preprint, 2026, arXiv

Quantifier Elimination Meets Treewidth

In this paper, we address the complexity barrier inherent in Fourier-Motzkin elimination (FME) and cylindrical algebraic decomposition (CAD) when eliminating a block of (existential) quantifiers. To mitigate this, we propose exploiting structural sparsity in the variable dependency graph of quantified formulas. Utilizing tools from parameterized algorithms, we investigate the role of treewidth, a parameter that measures the graph's tree-likeness, in the process of quantifier elimination. A novel dynamic programming framework, structured over a tree decomposition of the dependency graph, is developed for applying FME and CAD, and is also extensible to general quantifier elimination procedures. Crucially, we prove that when the treewidth is a constant, the framework achieves a significant exponential complexity improvement for both FME and CAD, reducing the worst-case complexity bound from doubly exponential to single exponential. Preliminary experiments on sparse linear real arithmetic (LRA) and nonlinear real arithmetic (NRA) benchmarks confirm that our algorithm outperforms the existing popular heuristic-based approaches on instances exhibiting low treewidth.
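For readers unfamiliar with Fourier-Motzkin elimination, a single elimination step on linear constraints of the form coeffs . x <= bound can be sketched as below. This is the textbook step only, not the tree-decomposition-based dynamic programming framework the abstract proposes; the function name `eliminate` and the example system are hypothetical.

```python
from fractions import Fraction
from typing import List, Tuple

# A constraint (coeffs, bound) encodes sum(coeffs[i] * x[i]) <= bound.
Constraint = Tuple[List[Fraction], Fraction]


def eliminate(constraints: List[Constraint], j: int) -> List[Constraint]:
    """Project out variable j; the result has one fewer coefficient column."""
    def drop(coeffs: List[Fraction]) -> List[Fraction]:
        return coeffs[:j] + coeffs[j + 1:]

    pos, neg, out = [], [], []
    for coeffs, bound in constraints:
        if coeffs[j] > 0:
            pos.append((coeffs, bound))   # upper bounds on x[j]
        elif coeffs[j] < 0:
            neg.append((coeffs, bound))   # lower bounds on x[j]
        else:
            out.append((drop(coeffs), bound))

    # Pair every upper bound with every lower bound; the positive
    # multipliers sp and sn cancel the coefficient of x[j].
    for pc, pb in pos:
        for nc, nb in neg:
            sp, sn = -nc[j], pc[j]
            combined = [sp * a + sn * b for a, b in zip(pc, nc)]
            out.append((drop(combined), sp * pb + sn * nb))
    return out


F = Fraction
# Variables (x, y): x - y <= 0, -x <= -1, y <= 3.
system = [([F(1), F(-1)], F(0)), ([F(-1), F(0)], F(-1)), ([F(0), F(1)], F(3))]
projected = eliminate(system, 0)  # remaining constraints on y alone
print(projected)  # encodes y <= 3 and -y <= -1
```

Because every upper bound is paired with every lower bound, one step can turn m constraints into roughly m²/4, and repeating this across a block of variables is the source of the worst-case blow-up that the abstract aims to tame.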

preprint, 2026, arXiv

Speak While Watching: Unleashing TRUE Real-Time Video Understanding Capability of Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) have achieved strong performance across many tasks, yet most systems remain limited to offline inference, requiring complete inputs before generating outputs. Recent streaming methods reduce latency by interleaving perception and generation, but still enforce a sequential perception-generation cycle, limiting real-time interaction. In this work, we target a fundamental bottleneck that arises when extending MLLMs to real-time video understanding: the global positional continuity constraint imposed by standard positional encoding schemes. While natural in offline inference, this constraint tightly couples perception and generation, preventing effective input-output parallelism. To address this limitation, we propose a parallel streaming framework that relaxes positional continuity through three designs: Overlapped, Group-Decoupled, and Gap-Isolated. These designs enable simultaneous perception and generation, allowing the model to process incoming inputs while producing responses in real time. Extensive experiments reveal that Group-Decoupled achieves the best efficiency-performance balance, maintaining high fluency and accuracy while significantly reducing latency. We further show that the proposed framework yields up to 2x acceleration under balanced perception-generation workloads, establishing a principled pathway toward speak-while-watching real-time systems. We make all our code publicly available: https://github.com/EIT-NLP/Speak-While-Watching.

preprint, 2025, arXiv

Emergence of quantum spin liquid and spin-flop phase in Kitaev antiferromagnets in a [111] magnetic field

Kitaev magnets have emerged as pivotal systems for investigating frustrated magnetism, providing a unique platform to explore quantum phases governed by the interplay between bond-dependent anisotropy and external magnetic fields. However, the quantum phase diagrams, particularly near the dominant antiferromagnetic Kitaev regime, remain puzzling despite extensive studies. In this work, we perform unbiased exact diagonalization calculations of the Kitaev-$Γ$ model in a [111] magnetic field on a $C_{6}$-symmetric 24-site cluster. By calculating the $\mathbb{Z}_2$ flux density and the topological entanglement entropy, we reveal multiple phase transitions and identify signatures of both scalar and vector chiral orders in the intermediate-field regime between the Kitaev spin liquid and the polarized phase. As the negative $Γ$ interaction increases, we discover a proximate quantum spin liquid featured by a three-peak specific heat and a spin-flop phase at a moderate magnetic field. Our findings provide insight into the field-induced intermediate phases in the antiferromagnetic Kitaev model and pave the way for the hunt for emergent phases in real materials.

preprint, 2025, arXiv

Global Weak Solutions of a Thermodynamically Consistent Diffuse Interface Model for Nonhomogeneous Incompressible Two-phase Flows with a Soluble Surfactant

We study a thermodynamically consistent diffuse interface model that describes the motion of a two-phase flow of two viscous incompressible Newtonian fluids with unmatched densities and a soluble surfactant in a bounded domain of two or three dimensions. The resulting hydrodynamic system consists of a nonhomogeneous Navier-Stokes system for the (volume averaged) velocity $\mathbf{u}$ and a coupled Cahn-Hilliard system for the phase-field variables $ϕ$ and $ψ$ that represent the difference in volume fractions of the binary fluids and the surfactant concentration, respectively. For the initial boundary value problem with physically relevant singular potentials subject to a no-slip boundary condition for the fluid velocity and homogeneous Neumann boundary conditions for the phase-field variables and the chemical potentials, we first establish the existence of global weak solutions in the case of non-degenerate mobilities based on a suitable semi-implicit time discretization. Next, we prove the existence of global weak solutions for a class of general degenerate mobilities, with the aid of a new type of approximations for both the mobilities and the singular parts of the potential densities.

preprint, 2024, arXiv

Faster Differentially Private Top-$k$ Selection: A Joint Exponential Mechanism with Pruning

We study the differentially private top-$k$ selection problem, aiming to identify a sequence of $k$ items with approximately the highest scores from $d$ items. Recent work by Gillenwater et al. (ICML '22) employs a direct sampling approach from the vast collection of $d^{\,Θ(k)}$ possible length-$k$ sequences, showing superior empirical accuracy compared to previous pure or approximate differentially private methods. Their algorithm has a time and space complexity of $\tilde{O}(dk)$. In this paper, we present an improved algorithm with time and space complexity $O(d + k^2 / ε\cdot \ln d)$, where $ε$ denotes the privacy parameter. Experimental results show that our algorithm runs orders of magnitude faster than their approach, while achieving similar empirical accuracy.
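As background for the top-$k$ selection problem, the sketch below shows a simpler, commonly described baseline: adding Gumbel noise to each score and returning the $k$ largest noisy scores, which corresponds to applying the exponential mechanism $k$ times without replacement. It is not the joint exponential mechanism of Gillenwater et al., nor the pruned algorithm this paper presents. The noise scale $2k\Delta/ε$ assumes basic composition with budget $ε/k$ per pick and score sensitivity $\Delta$; treat that accounting as an assumption to verify before any real use.

```python
import math
import random
from typing import List, Optional


def gumbel(rng: random.Random, scale: float) -> float:
    """Sample Gumbel(0, scale) noise by inverse transform."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() never returns 1.0
        u = rng.random()
    return -scale * math.log(-math.log(u))


def noisy_top_k(scores: List[float], k: int, epsilon: float,
                sensitivity: float = 1.0, seed: Optional[int] = None) -> List[int]:
    """Return indices of the k largest Gumbel-perturbed scores, best first."""
    rng = random.Random(seed)
    # Assumed accounting: budget epsilon / k per selected item.
    scale = 2.0 * k * sensitivity / epsilon
    noisy = [(s + gumbel(rng, scale), i) for i, s in enumerate(scores)]
    noisy.sort(reverse=True)
    return [i for _, i in noisy[:k]]


# With a very large epsilon the noise is negligible and the true top-2 appear.
print(noisy_top_k([1.0, 9.0, 3.0, 7.0], k=2, epsilon=1e9, seed=0))  # [1, 3]
```

This baseline costs one noise draw per item plus a sort, so it is useful mainly as a reference point for the accuracy and running-time comparisons the abstract mentions.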

preprint, 2024, arXiv

Global Weak Solutions to a Navier-Stokes-Cahn-Hilliard System with Chemotaxis and Mass Transport: Cross Diffusion versus Logistic Degradation

We analyze a diffuse interface model that describes the dynamics of incompressible two-phase flows influenced by interactions with a soluble chemical substance, encompassing the chemotaxis effect, mass transport, and reactions. In the resulting coupled evolutionary system, the macroscopic fluid velocity field $\boldsymbol{v}$ satisfies a Navier-Stokes system driven by a capillary force, the phase field variable $φ$ is governed by a convective Cahn-Hilliard equation incorporating a mass source and a singular potential (e.g., the Flory-Huggins type), and the chemical concentration $σ$ obeys an advection-reaction-diffusion equation with logistic degradation, exhibiting a cross-diffusion structure akin to the Keller-Segel model for chemotaxis. Under general structural assumptions, we establish the existence of global weak solutions to the initial boundary value problem within a bounded smooth domain $Ω\subset \mathbb{R}^d$, $d=2,3$. The proof hinges on a novel semi-Galerkin scheme for a suitably regularized system, featuring a non-standard approximation of the singular potential. Moreover, with more restrictive assumptions on coefficients and data, we establish regularity properties and uniqueness of global weak solutions in the two-dimensional case. Our analysis contributes to a further understanding of phase separation processes under the interplay of fluid dynamics and chemotaxis, in particular, the influence of cross diffusion and logistic degradation.