Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
15 works
0 followers
17 topics
4 close collaborators

Published work

15 published items

preprint, 2026, arXiv

Classical solution of the FeMo-cofactor model to chemical accuracy and its implications

The main source of reduced nitrogen for living things comes from nitrogenase, which converts N2 to NH3 at the FeMo-cofactor (FeMo-co). Because of its role in supporting life, the uncertainty surrounding the catalytic cycle, and its compositional richness with eight transition metal ions, FeMo-co has fascinated scientists for decades. After much effort, the complete atomic structure was resolved. However, its electronic structure, central to reactivity, remains under intense debate. FeMo-co's complexity, arising from many unpaired electrons, has led to suggestions that it lies beyond the reach of classical computing. Consequently, there has been much interest in the potential of quantum algorithms to compute its electronic structure. Estimating the cost to compute the ground-state to chemical accuracy (~1 kcal/mol) within one or more FeMo-co models is a common benchmark of quantum algorithms in quantum chemistry, with numerous resource estimates in the literature. Here we address how to perform the same task using classical computation. We use a 76 orbital/152 qubit resting state model, the subject of most quantum resource estimates. Based on insight into the multiple configuration nature of the states, we devise classical protocols that yield rigorous or empirical upper bounds to the ground-state energy. Extrapolating these we predict the ground-state energy with an estimated uncertainty on the order of chemical accuracy. Having performed this long-discussed computational task, we next consider implications beyond the model. We distill a simpler computational procedure which we apply to reveal the electronic landscape in realistic representations of the cofactor. We thus illustrate a path to a precise computational understanding of FeMo-co electronic structure.

preprint, 2026, arXiv

Library Drift: Diagnosing and Fixing a Silent Failure Mode in Self-Evolving LLM Skill Libraries

Self-evolving skill libraries face a silent failure mode we term library drift: unbounded skill accumulation without outcome-driven lifecycle management causes retrieval degradation, false-positive injections, and performance stagnation. Recent evaluation confirms the symptom: LLM-authored skills deliver +0.0pp gain while human-curated ones deliver +16.2pp (SkillsBench). Yet the underlying mechanism has not been isolated. We provide (1) a reproducible trigger: ablations that isolate drift, where one disables skill injection (flat floor, +0.002) and one imposes premature retirement (active harm, $-$0.019); (2) trace-level diagnostics: an append-only evidence log with per-skill contribution scores, attribution verdicts, and router engagement metrics that make the failure visible before it reaches end-task scores; and (3) a verified fix: a minimal governance recipe (outcome-driven retirement + bounded active-cap + meta-skill authoring prior) that lifts held-out pass@1 from a 0.258 baseline to a late-window mean of 0.584 (rolling gain $+$0.328) on MBPP+ hard-100 over 100 rounds. Eight ablations decompose which governance mechanisms are load-bearing and which are subsumed, providing a concrete playbook for diagnosing library drift in any self-evolving agent.

preprint, 2026, arXiv

MiMo-V2-Flash Technical Report

We present MiMo-V2-Flash, a Mixture-of-Experts (MoE) model with 309B total parameters and 15B active parameters, designed for fast, strong reasoning and agentic capabilities. MiMo-V2-Flash adopts a hybrid attention architecture that interleaves Sliding Window Attention (SWA) with global attention, with a 128-token sliding window under a 5:1 hybrid ratio. The model is pre-trained on 27 trillion tokens with Multi-Token Prediction (MTP), employing a native 32k context length and subsequently extended to 256k. To efficiently scale post-training compute, MiMo-V2-Flash introduces a novel Multi-Teacher On-Policy Distillation (MOPD) paradigm. In this framework, domain-specialized teachers (e.g., trained via large-scale reinforcement learning) provide dense and token-level reward, enabling the student model to perfectly master teacher expertise. MiMo-V2-Flash rivals top-tier open-weight models such as DeepSeek-V3.2 and Kimi-K2, despite using only 1/2 and 1/3 of their total parameters, respectively. During inference, by repurposing MTP as a draft model for speculative decoding, MiMo-V2-Flash achieves up to 3.6 acceptance length and 2.6x decoding speedup with three MTP layers. We open-source both the model weights and the three-layer MTP weights to foster open research and community collaboration.

preprint, 2026, arXiv

Tailored Prompts, Targeted Protection: Vulnerability-Specific LLM Analysis for Smart Contracts

Smart contracts on blockchains are prone to diverse security vulnerabilities that can lead to significant financial losses due to their immutable nature. Existing detection approaches often lack flexibility across vulnerability types and rely heavily on manually crafted expert rules. In this paper, we present an LLM-based framework for practical smart contract vulnerability detection. We construct and release a large-scale dataset comprising 31,165 professionally annotated vulnerability instances collected from over 3,200 real-world projects across 15 major blockchain platforms. Our approach leverages precise AST-based context extraction and vulnerability-specific prompt design to instantiate customized detectors for 13 prevalent vulnerability categories. Experimental results demonstrate strong effectiveness, achieving an average positive recall of 0.92 and an average negative recall of 0.85, highlighting the potential of carefully engineered contextual prompting for scalable and high-precision smart contract security analysis.
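
The two figures reported in this abstract can be read as per-class recall. The sketch below shows the standard definitions, assuming "positive recall" means TP / (TP + FN) and "negative recall" means TN / (TN + FP). The abstract does not define either term, so this reading, the function name `class_recalls`, and the sample data are assumptions made for illustration only.

```python
def class_recalls(labels, predictions):
    """Return (positive_recall, negative_recall) for binary labels.

    Assumed definitions (not taken from the paper):
      positive recall = TP / (TP + FN)
      negative recall = TN / (TN + FP)
    """
    tp = fn = tn = fp = 0
    for truth, guess in zip(labels, predictions):
        if truth and guess:
            tp += 1          # vulnerable, flagged
        elif truth and not guess:
            fn += 1          # vulnerable, missed
        elif not truth and not guess:
            tn += 1          # safe, not flagged
        else:
            fp += 1          # safe, flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in labels")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up labels for five hypothetical contracts (1 = vulnerable).
    labels = [1, 1, 1, 0, 0]
    predictions = [1, 1, 0, 0, 1]
    pos, neg = class_recalls(labels, predictions)
    print(f"positive recall={pos:.2f}, negative recall={neg:.2f}")
```

Reporting both values separately shows how a detector behaves on vulnerable and on safe samples, which a single accuracy figure can hide when the classes are imbalanced.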

preprint, 2025, arXiv

MiMo-Audio: Audio Language Models are Few-Shot Learners

Existing audio language models typically rely on task-specific fine-tuning to accomplish particular audio tasks. In contrast, humans are able to generalize to new audio tasks with only a few examples or simple instructions. GPT-3 has shown that scaling next-token prediction pretraining enables strong generalization capabilities in text, and we believe this paradigm is equally applicable to the audio domain. By scaling MiMo-Audio's pretraining data to over one hundred million hours, we observe the emergence of few-shot learning capabilities across a diverse set of audio tasks. We develop a systematic evaluation of these capabilities and find that MiMo-Audio-7B-Base achieves SOTA performance on both speech intelligence and audio understanding benchmarks among open-source models. Beyond standard metrics, MiMo-Audio-7B-Base generalizes to tasks absent from its training data, such as voice conversion, style transfer, and speech editing. MiMo-Audio-7B-Base also demonstrates powerful speech continuation capabilities and can generate highly realistic talk shows, recitations, livestreams and debates. At the post-training stage, we curate a diverse instruction-tuning corpus and introduce thinking mechanisms into both audio understanding and generation. MiMo-Audio-7B-Instruct achieves open-source SOTA on audio understanding benchmarks (MMSU, MMAU, MMAR, MMAU-Pro), spoken dialogue benchmarks (Big Bench Audio, MultiChallenge Audio) and instruct-TTS evaluations, approaching or surpassing closed-source models. Model checkpoints and the full evaluation suite are available at https://github.com/XiaomiMiMo/MiMo-Audio.

preprint, 2025, arXiv

STED and Consistency Scoring: A Framework for Evaluating LLM Structured Output Reliability

Large Language Models (LLMs) are increasingly deployed for structured data generation, yet output consistency remains critical for production applications. We introduce a comprehensive framework for evaluating and improving consistency in LLM-generated structured outputs. Our approach combines: (1) STED (Semantic Tree Edit Distance), a novel similarity metric balancing semantic flexibility with structural strictness when comparing JSON outputs, and (2) a consistency scoring framework aggregating multiple STED measurements across repeated generations to quantify reliability. Through systematic experiments on synthetic datasets with controlled schema, expression, and semantic variations, we demonstrate STED achieves superior performance ($0.86-0.90$ similarity for semantic equivalents, $0.0$ for structural breaks) compared to existing metrics including TED, BERTScore, and DeepDiff. Applying our framework to benchmark six LLMs reveals significant variations: Claude-3.7-Sonnet demonstrates exceptional consistency, maintaining near-perfect structural reliability even at high temperatures ($T=0.9$), while models like Claude-3-Haiku and Nova-Pro exhibit substantial degradation requiring careful tuning. Our framework enables practical applications including targeted model selection for structured tasks, iterative prompt refinement for reproducible results, and diagnostic analysis to identify inconsistency root causes. This work provides theoretical foundations and practical tools for ensuring reliable structured output generation in LLM-based production systems.
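
As a rough illustration of the consistency-scoring idea described in this abstract (aggregating a similarity measure across repeated generations), the sketch below averages pairwise similarity over several JSON outputs. The `path_similarity` function is a hypothetical stand-in (Jaccard overlap of flattened paths and leaf values), not STED itself, and every name here is an assumption made for this example rather than the authors' implementation.

```python
import json
from itertools import combinations


def flatten(value, prefix=""):
    """Flatten nested JSON into a set of (path, serialized leaf) pairs."""
    if isinstance(value, dict):
        items = set()
        for key, child in value.items():
            items |= flatten(child, f"{prefix}/{key}")
        return items
    if isinstance(value, list):
        items = set()
        for index, child in enumerate(value):
            items |= flatten(child, f"{prefix}[{index}]")
        return items
    return {(prefix, json.dumps(value))}


def path_similarity(a, b):
    """Stand-in similarity (NOT STED): Jaccard overlap of flattened leaves."""
    fa, fb = flatten(a), flatten(b)
    union = fa | fb
    if not union:
        return 1.0  # two empty structures are treated as identical
    return len(fa & fb) / len(union)


def consistency_score(outputs, similarity=path_similarity):
    """Mean pairwise similarity across repeated generations (JSON strings)."""
    parsed = [json.loads(text) for text in outputs]
    pairs = list(combinations(parsed, 2))
    if not pairs:
        return 1.0  # a single output is trivially consistent with itself
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up generations for the same prompt; key order does not matter.
    runs = ['{"a": 1, "b": [1, 2]}', '{"b": [1, 2], "a": 1}', '{"a": 2, "b": [1, 2]}']
    print(consistency_score(runs))
```

Passing the similarity function as a parameter keeps the aggregation step independent of the metric, so a semantic tree edit distance could be swapped in without changing `consistency_score`.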

preprint, 2022, arXiv

BiTAT: Neural Network Binarization with Task-dependent Aggregated Transformation

Neural network quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation, while preserving the performance of the original model. However, extreme quantization (1-bit weight/1-bit activations) of compactly-designed backbone architectures (e.g., MobileNets) often used for edge-device deployments results in severe performance degeneration. This paper proposes a novel Quantization-Aware Training (QAT) method that can effectively alleviate performance degeneration even with extreme quantization by focusing on the inter-weight dependencies, between the weights within each layer and across consecutive layers. To minimize the quantization impact of each weight on others, we perform an orthonormal transformation of the weights at each layer by training an input-dependent correlation matrix and importance vector, such that each weight is disentangled from the others. Then, we quantize the weights based on their importance to minimize the loss of the information from the original weights/activations. We further perform progressive layer-wise quantization from the bottom layer to the top, so that quantization at each layer reflects the quantized distributions of weights and activations at previous layers. We validate the effectiveness of our method on various benchmark datasets against strong neural quantization baselines, demonstrating that it alleviates the performance degeneration on ImageNet and successfully preserves the full-precision model performance on CIFAR-100 with compact backbone networks.

preprint, 2022, arXiv

Fast all-optical random number generator

We propose a simple, all-optical method for fast random number generation based on laser mode hopping. By periodically restarting a two-mode laser operating in the bistable state, a random number stream can be generated from the spontaneous emission noise. To validate the feasibility of this method, we perform a theoretical simulation using a common vertical-cavity surface-emitting laser (VCSEL) with two polarization modes. Numerical results demonstrate that fast 2.5 Gb/s random number streams can be continuously obtained with verified randomness. Owing to its simple, all-optical structure, this scheme provides a fully monolithic solution for random number generation.

preprint, 2022, arXiv

On Effective Scheduling of Model-based Reinforcement Learning

Model-based reinforcement learning has attracted wide attention due to its superior sample efficiency. Despite its impressive success so far, it is still unclear how to appropriately schedule the important hyperparameters, such as the real data ratio for policy optimization in Dyna-style model-based algorithms, to achieve adequate performance. In this paper, we first theoretically analyze the role of real data in policy training, which suggests that gradually increasing the ratio of real data yields better performance. Inspired by the analysis, we propose a framework named AutoMBPO to automatically schedule the real data ratio as well as other hyperparameters in training the model-based policy optimization (MBPO) algorithm, a representative running case of model-based methods. On several continuous control tasks, the MBPO instance trained with hyperparameters scheduled by AutoMBPO can significantly surpass the original one, and the real data ratio schedule found by AutoMBPO is consistent with our theoretical analysis.

preprint, 2022, arXiv

Tests of gravitational scalar polarization and constraints of chameleon $f(R)$ gravity from comprehensive analysis of binary pulsars

Chameleon $f(R)$ gravity is equivalent to a class of scalar-tensor theories of gravity with a chameleon screening mechanism that allows the theory to satisfy local tests of gravity. Within the framework of chameleon $f(R)$, we study the impact of the chameleon mechanism on the orbital evolution of binary pulsars, and calculate in detail the post-Keplerian (PK) effects (periastron advance, Einstein delay, Shapiro delay, orbital period decay and eccentricity decay) of the binary orbit. The differences in PK effects between general relativity (GR) and chameleon $f(R)$ are quantified by a combination of the star's compactness and the theory parameter. We use the mass-radius relation to break the degeneracy between these two parameters, thus allowing us to constrain the theory. We simulate the temporal evolution of the orbital period and eccentricity of neutron star (NS) - white dwarf (WD) binaries, and the results indicate that the orbital evolution is typically faster than in GR due to the emission of dipole radiation in chameleon $f(R)$. We use the observables of PK parameters from the three NS-WD binary pulsars to place constraints on chameleon $f(R)$ and possible deviations from GR by performing Monte-Carlo simulations. We find that PSR J1738$+$0333 is the most constraining test of chameleon $f(R)$ in these systems. Our results show no solid evidence of the existence of helicity-0 or helicity-1 polarization states inducing dipole radiation, exclude significant strong-field deviations and confirm that GR is still valid for strong-field asymmetric systems.

preprint, 2021, arXiv

Dynamics of anisotropic oxygen-ion migration in strained cobaltites

Orientation control of the oxygen vacancy channel (OVC) is highly desirable for tailoring oxygen diffusion, as the OVC serves as a fast transport channel in ion conductors, which are widely exploited in solid-state fuel cells, catalysts, and ion batteries. Direct observation of oxygen ions hopping towards preferential vacant sites is key to clarifying migration pathways. Here we report anisotropic oxygen-ion migration mediated by strain in ultrathin cobaltites via in-situ thermal activation in an atomic-resolution transmission electron microscope. Oxygen migration pathways are constructed on the basis of the atomic structure during OVC switching, which manifests as vertical-to-horizontal OVC switching under tensile strain but horizontal-to-diagonal switching under compression. We evaluate the topotactic structural changes to the OVC, determine the crucial role of the tolerance factor for OVC stability and establish the strain-dependent phase diagram. Our work provides a practical guide for engineering OVC orientation that is applicable to ionic-oxide electronics.

preprint, 2021, arXiv

Hierarchical Neural Architecture Search via Operator Clustering

Recently, the efficiency of automatic neural architecture design has been significantly improved by gradient-based search methods such as DARTS. However, recent literature has cast doubt on the generalization ability of DARTS, arguing that DARTS performs poorly when the search space is changed, i.e., when a different set of candidate operators is used. Regularization techniques such as early stopping have been proposed to partially solve this problem. In this paper, we tackle this problem from a different perspective by identifying two contributing factors to the collapse of DARTS when the search space changes: (1) the correlation of similar operators incurs unfavorable competition among them and makes their relative importance scores unreliable and (2) the optimization complexity gap between the proxy search stage and the final training. Based on these findings, we propose a new hierarchical search algorithm. With its operator clustering and optimization complexity match, the algorithm can consistently find high-performance architectures across various search spaces. For all five variants of the popular cell-based search spaces, the proposed algorithm always obtains a state-of-the-art architecture with the best accuracy on CIFAR-10, CIFAR-100 and ImageNet over other well-established DARTS-like algorithms. Code is available at https://github.com/susan0199/StacNAS.

preprint, 2020, arXiv

Constraining Screened Modified Gravity by Space-borne Gravitational-wave Detectors

Screened modified gravity (SMG) is a unified theoretical framework that describes scalar-tensor gravity with a screening mechanism. Based on the gravitational-wave (GW) waveform derived in our previous work \citep{liu2018waveforms}, in this article we investigate the potential constraints on SMG theory through GW observations by future space-borne GW detectors, including LISA, TianQin and Taiji. We find that, for EMRIs consisting of a massive black hole and a neutron star, if the EMRIs are in the Virgo cluster, the GW signals can be detected at a quite high significance level, and the screened parameter $ε_{\rm NS}$ can be constrained at about $\mathcal{O}(10^{-5})$, which is more than one order of magnitude tighter than the potential constraint given by the ground-based Einstein Telescope. However, EMRIs consisting of a massive black hole and a white dwarf are more difficult to detect than the previous case. For the specific SMG models, including chameleon, symmetron and dilaton, we find these constraints are complementary to those from the Cassini experiment, but weaker than those from lunar laser ranging observations and binary pulsars, due to the strong gravitational potentials on the surface of neutron stars. By analyzing the deviation of the GW waveform in SMG from that in general relativity, we find, as anticipated, that the dominant contribution to the SMG constraints comes from the correction terms in the GW phases, rather than the extra polarization modes or the correction terms in the GW amplitudes.

preprint, 2020, arXiv

Constraints of general screened modified gravities from comprehensive analysis of binary pulsars

Testing gravity with binary pulsars has become a key issue. Screened modified gravity is a kind of scalar-tensor theory with a screening mechanism that allows it to satisfy the tight Solar System tests. In this paper, we investigate how the screening mechanism affects the orbital dynamics of binary pulsars, and calculate in detail the five post-Keplerian (PK) parameters in this theory. These parameters differ from those of general relativity (GR), and the differences are quantified by the scalar charges, which lead to dipole radiation in this theory. We combine the observables of PK parameters for each of the ten binary pulsars to place constraints on the scalar charges and possible deviations from GR. The dipole radiation in the neutron star (NS) - white dwarf (WD) binaries leads to more stringent constraints on deviations from GR. The most constraining systems for the scalar charges of NS and WD are PSR B1913$+$16 and PSR J1738$+$0333, respectively. The results of all tests exclude significant strong-field deviations and show good agreement with GR.

preprint, 2020, arXiv

Phonon Magic Angle in Two-Dimensional Puckered Homostructures

The emergence of twistronics provides an unprecedented platform to modulate the band structure, resulting in exotic electronic phenomena ranging from ferromagnetism to superconductivity. However, such a concept for phonon engineering is still lacking. Here, we extend 'twistnonics' to 2D puckered materials with a 'phonon magic angle' discovered by molecular dynamics simulation. The phonon magic angle, at which the TP-1 and TP-2 directions overlap, maintains a high level of, or even enhances, phonon transport capability due to van der Waals confinement. This novel phenomenon originates from the confined vdW interaction and the ordered atomic vibration caused by the perfect lattice arrangement, in which the atoms of the top layer fit into the spaces of the bottom layer. Moreover, we find that both the in-plane and out-of-plane thermal transport properties can be effectively regulated by applying the twist. Through phononic and electronic analysis, the deterioration of phonon transport capability at other twist angles is attributed to the suppression of acoustic phonon modes, the reduction of phonon lifetimes and mismatched lattice vibration between layers. Our findings shed light on the twistnonics of low-dimensional asymmetrical materials and can be further extended to electronic and photonic devices.