Researcher profile

Chun Chen

Chun Chen contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
8 works
0 followers
14 topics
4 close collaborators


Published work

8 published items

Preprint | 2026 | arXiv

Direct Detection of Type II-P Supernova Progenitors with the Euclid and CSST Surveys

A central goal in supernova (SN) research is to identify and characterize their progenitors. However, this is very difficult due to the limited archival images with sufficient depth and spatial resolution required for direct progenitor detection and due to the circumstellar dust which often biases the estimate of their intrinsic parameters. This field will be revolutionized by Euclid and the upcoming Chinese Space Station Survey Telescope (CSST), which conduct deep, wide-field, high-resolution and multi-band imaging surveys. We analyze their detection capability by comparing the model magnitudes of red supergiant (RSG) progenitors with the detection limits under different conditions, and we estimate the annual detection rates with Monte-Carlo simulations. We explore how to recover the intrinsic properties of SN progenitors with the help of radiation transfer calculations in circumstellar dust. We find the optical and near-infrared filters of Euclid and CSST are highly effective for detecting RSG progenitors. We predict that archival images from the two completed surveys will enable $\lesssim 13$ (or 24) progenitor detections per year within the mass range of 8--16 (or 8--25) $M_\odot$, an order of magnitude higher than the current detection rate of $\sim 1$ detection per year. In the presence of circumstellar dust, the emerging spectral energy distribution (SED) of the progenitor is mainly affected by the optical depth and is almost independent of dust temperature in the Euclid and CSST filters. Our mock tests demonstrate that one can derive the progenitor mass and dust optical depth simultaneously by fitting the observed SED over the 11 filters of the two surveys while fixing the dust temperature to a typical value. Euclid and CSST will significantly enlarge the sample of direct progenitor detections with accurate mass measurements, which is crucial to resolve the long-standing RSG problem.

Preprint | 2026 | arXiv

LoopTrap: Termination Poisoning Attacks on LLM Agents

Modern LLM agents solve complex tasks by operating in iterative execution loops, where they repeatedly reason, act, and self-evaluate progress to determine when a task is complete. In this work, we show that while this self-directed loop facilitates autonomy, it also introduces a critical risk: by injecting malicious prompts into the agent's context, an adversary can distort the agent's termination judgment, making it believe the task remains incomplete and leading to unbounded computation. To understand this threat, we define and systematically characterize it as Termination Poisoning and design 10 representative attack strategies. Through an empirical study spanning 8 LLM agents and 60 tasks, we demonstrate that different LLM agents exhibit distinct behavioral signatures that determine which strategies succeed. These transferable patterns can serve as principled guidance for crafting effective attacks against previously unseen agents and tasks, enabling scalable red-teaming beyond manually designed templates. Building on these insights, we introduce LoopTrap, an automated red-teaming framework that synthesizes target-specific malicious prompts by exploiting agent behavioral tendencies. LoopTrap first constructs a behavioral profile of the target agent along four vulnerability dimensions via lightweight probing. It then performs adaptive trap synthesis, routing to the most effective strategy and selecting optimal injections via a self-scoring mechanism. Finally, successful traps are abstracted into a reusable skill library, while failed attempts are refined through self-reflection, ensuring continuous improvement. Extensive evaluation shows that LoopTrap achieves an average of 3.57$\times$ step amplification across 8 mainstream agents, with a peak of 25$\times$.

Preprint | 2026 | arXiv

MiMo-V2-Flash Technical Report

We present MiMo-V2-Flash, a Mixture-of-Experts (MoE) model with 309B total parameters and 15B active parameters, designed for fast, strong reasoning and agentic capabilities. MiMo-V2-Flash adopts a hybrid attention architecture that interleaves Sliding Window Attention (SWA) with global attention, with a 128-token sliding window under a 5:1 hybrid ratio. The model is pre-trained on 27 trillion tokens with Multi-Token Prediction (MTP), employing a native 32k context length and subsequently extended to 256k. To efficiently scale post-training compute, MiMo-V2-Flash introduces a novel Multi-Teacher On-Policy Distillation (MOPD) paradigm. In this framework, domain-specialized teachers (e.g., trained via large-scale reinforcement learning) provide dense and token-level reward, enabling the student model to perfectly master teacher expertise. MiMo-V2-Flash rivals top-tier open-weight models such as DeepSeek-V3.2 and Kimi-K2, despite using only 1/2 and 1/3 of their total parameters, respectively. During inference, by repurposing MTP as a draft model for speculative decoding, MiMo-V2-Flash achieves up to 3.6 acceptance length and 2.6x decoding speedup with three MTP layers. We open-source both the model weights and the three-layer MTP weights to foster open research and community collaboration.

Preprint | 2025 | arXiv

MiMo-Audio: Audio Language Models are Few-Shot Learners

Existing audio language models typically rely on task-specific fine-tuning to accomplish particular audio tasks. In contrast, humans are able to generalize to new audio tasks with only a few examples or simple instructions. GPT-3 has shown that scaling next-token prediction pretraining enables strong generalization capabilities in text, and we believe this paradigm is equally applicable to the audio domain. By scaling MiMo-Audio's pretraining data to over one hundred million hours, we observe the emergence of few-shot learning capabilities across a diverse set of audio tasks. We develop a systematic evaluation of these capabilities and find that MiMo-Audio-7B-Base achieves SOTA performance on both speech intelligence and audio understanding benchmarks among open-source models. Beyond standard metrics, MiMo-Audio-7B-Base generalizes to tasks absent from its training data, such as voice conversion, style transfer, and speech editing. MiMo-Audio-7B-Base also demonstrates powerful speech continuation capabilities and can generate highly realistic talk shows, recitations, livestreaming and debates. At the post-training stage, we curate a diverse instruction-tuning corpus and introduce thinking mechanisms into both audio understanding and generation. MiMo-Audio-7B-Instruct achieves open-source SOTA on audio understanding benchmarks (MMSU, MMAU, MMAR, MMAU-Pro), spoken dialogue benchmarks (Big Bench Audio, MultiChallenge Audio) and instruct-TTS evaluations, approaching or surpassing closed-source models. Model checkpoints and the full evaluation suite are available at https://github.com/XiaomiMiMo/MiMo-Audio.

Preprint | 2022 | arXiv

High-contrast, speckle-free, true 3D holography via binary CGH optimization

Holography is a promising approach to three-dimensional (3D) projection beyond present two-dimensional technology. True 3D holography requires arbitrary 3D volume projection with high axial resolution and independent control of all 3D voxels. However, it has been challenging to implement true 3D holography with high reconstruction quality because of speckle. Here, we propose a practical solution for speckle-free, high-contrast, true 3D holography that combines random phase, temporal multiplexing, binary holography, and binary optimization. We adopt the random phase for the true 3D implementation to achieve the maximum axial resolution with fully independent control of the 3D voxels. We develop a high-performance binary hologram optimization framework to minimize the binary quantization noise, which provides accurate and high-contrast reconstructions for 2D as well as 3D cases. Utilizing the fast operation of binary modulation, full-color, high-framerate holographic video projection is realized, while the speckle noise of the random phase is overcome by temporal multiplexing. Our high-quality true 3D holography is experimentally verified by projecting multiple arbitrary dense images simultaneously. The proposed method can be adopted in various applications of holography; as an additional demonstration, we show realistic true 3D holograms in VR and AR near-eye displays. This realization will open a new path towards the next generation of holography.

Preprint | 2022 | arXiv

Knowledge Distillation with the Reused Teacher Classifier

Knowledge distillation aims to compress a powerful yet cumbersome teacher model into a lightweight student model without much sacrifice of performance. For this purpose, various approaches have been proposed over the past few years, generally with elaborately designed knowledge representations, which in turn increase the difficulty of model development and interpretation. In contrast, we empirically show that a simple knowledge distillation technique is enough to significantly narrow down the teacher-student performance gap. We directly reuse the discriminative classifier from the pre-trained teacher model for student inference and train a student encoder through feature alignment with a single $\ell_2$ loss. In this way, the student model is able to achieve exactly the same performance as the teacher model provided that their extracted features are perfectly aligned. An additional projector is developed to help the student encoder match with the teacher classifier, which renders our technique applicable to various teacher and student architectures. Extensive experiments demonstrate that our technique achieves state-of-the-art results at the modest cost of compression ratio due to the added projector.
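The mechanism described in this abstract (project the student features, align them with the teacher features under a single $\ell_2$ loss, then classify with the frozen teacher classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the linear projector, and the toy values are all hypothetical, and no training loop is shown.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(weights: Matrix, x: Vector) -> Vector:
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def alignment_loss(projected_student: Vector, teacher_feat: Vector) -> float:
    """Squared L2 distance between projected student and teacher features."""
    return sum((s - t) ** 2 for s, t in zip(projected_student, teacher_feat))


def argmax(values: Vector) -> int:
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


def student_predict(student_feat: Vector, projector: Matrix,
                    teacher_classifier: Matrix) -> int:
    """Student inference: project features, then reuse the teacher classifier."""
    projected = matvec(projector, student_feat)
    return argmax(matvec(teacher_classifier, projected))


# Toy, hand-picked values (hypothetical, not taken from the paper).
teacher_feat = [1.0, 2.0]                       # teacher encoder output
student_feat = [0.5]                            # smaller student encoder output
projector = [[2.0], [4.0]]                      # assumed linear projector, 1 -> 2 dims
teacher_classifier = [[1.0, 0.0], [0.0, 1.0]]   # frozen teacher classifier weights

projected = matvec(projector, student_feat)
loss = alignment_loss(projected, teacher_feat)
teacher_label = argmax(matvec(teacher_classifier, teacher_feat))
student_label = student_predict(student_feat, projector, teacher_classifier)
```

In this toy case the projected student features coincide with the teacher features, so the alignment loss is zero and both models produce the same label, which mirrors the abstract's claim about perfectly aligned features.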

Preprint | 2020 | arXiv

Fast Adaptively Weighted Matrix Factorization for Recommendation with Implicit Feedback

Recommendation from implicit feedback is a highly challenging task due to the lack of reliable observed negative data. A popular and effective approach for implicit recommendation is to treat unobserved data as negative but downweight their confidence. Naturally, how to assign confidence weights and how to handle the large number of unobserved data are two key problems for implicit recommendation models. However, existing methods either pursue fast learning by manually assigning simple confidence weights, which lacks flexibility and may create empirical bias in evaluating users' preferences, or adaptively infer personalized confidence weights but suffer from low efficiency. To achieve both adaptive weight assignment and efficient model learning, we propose a fast adaptively weighted matrix factorization (FAWMF) based on a variational auto-encoder. The personalized data confidence weights are adaptively assigned with a parameterized neural network (function), and the network can be inferred from the data. Further, to support fast and stable learning of FAWMF, a new batch-based learning algorithm, fBGD, has been developed, which trains on all feedback data while its complexity is linear in the number of observed data. Extensive experiments on real-world datasets demonstrate the superiority of the proposed FAWMF and its learning algorithm fBGD.
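The baseline idea mentioned in this abstract, treating unobserved entries as negatives with a reduced confidence weight, can be sketched as a weighted squared-error objective. This sketch uses one fixed, manually chosen weight for all unobserved entries, which is the simple scheme the abstract contrasts against; it is not FAWMF and not fBGD. The function name, the `negative_weight` parameter, and the toy factors are hypothetical.

```python
from typing import List, Set, Tuple


def weighted_implicit_loss(observed: Set[Tuple[int, int]],
                           user_factors: List[List[float]],
                           item_factors: List[List[float]],
                           negative_weight: float = 0.1) -> float:
    """Weighted squared error over ALL user-item pairs.

    Observed pairs have target 1 and weight 1; unobserved pairs have
    target 0 and the smaller, uniform weight `negative_weight` (assumed).
    """
    loss = 0.0
    for u, p in enumerate(user_factors):
        for i, q in enumerate(item_factors):
            score = sum(a * b for a, b in zip(p, q))
            if (u, i) in observed:
                loss += (1.0 - score) ** 2
            else:
                loss += negative_weight * score ** 2
    return loss


# Toy example: one user, three items, only item 0 was interacted with.
user_factors = [[1.0, 0.0]]
item_factors = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
observed = {(0, 0)}

downweighted = weighted_implicit_loss(observed, user_factors, item_factors, 0.1)
unweighted = weighted_implicit_loss(observed, user_factors, item_factors, 1.0)
```

Note that this naive double loop visits every user-item pair, so its cost grows with the full matrix size; the abstract states that fBGD avoids this by achieving complexity linear in the number of observed data.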

Preprint | 2020 | arXiv

Majorana corner flat bands in two-dimensional second-order topological superconductors

In this paper we find that confining a second-order topological superconductor with a harmonic potential leads to a proliferation of Majorana corner modes. As a consequence, this results in the formation of Majorana corner flat bands which have a fundamentally different origin from that of the conventional mechanism. This is due to the fact that they arise solely from the one-dimensional gapped boundary states of the hybrid system that become gapless without the bulk gap closing under the increase of the trapping potential magnitude. The Majorana corner states are found to be robust against the strength of the harmonic trap and the transition from Majorana corner states to Majorana flat bands is merely a smooth crossover. As a harmonic trap can potentially be realized in heterostructures, this proposal paves a way to observe these Majorana corner flat bands in an experimental context.