Researcher profile

Xin Cheng

Xin Cheng contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
14 works
0 followers
9 topics
4 close collaborators

Published work

14 published items

Preprint, 2026, arXiv

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

While Mixture-of-Experts (MoE) scales capacity via conditional computation, Transformers lack a native primitive for knowledge lookup, forcing them to inefficiently simulate retrieval through computation. To address this, we introduce conditional memory as a complementary sparsity axis, instantiated via Engram, a module that modernizes classic $N$-gram embedding for O(1) lookup. By formulating the Sparsity Allocation problem, we uncover a U-shaped scaling law that optimizes the trade-off between neural computation (MoE) and static memory (Engram). Guided by this law, we scale Engram to 27B parameters, achieving superior performance over a strictly iso-parameter and iso-FLOPs MoE baseline. Most notably, while the memory module is expected to aid knowledge retrieval (e.g., MMLU +3.4; CMMLU +4.0), we observe even larger gains in general reasoning (e.g., BBH +5.0; ARC-Challenge +3.7) and code/math domains (HumanEval +3.0; MATH +2.4). Mechanistic analyses reveal that Engram relieves the backbone's early layers from static reconstruction, effectively deepening the network for complex reasoning. Furthermore, by delegating local dependencies to lookups, it frees up attention capacity for global context, substantially boosting long-context retrieval (e.g., Multi-Query NIAH: 84.2 to 97.0). Finally, Engram establishes infrastructure-aware efficiency: its deterministic addressing enables runtime prefetching from host memory, incurring negligible overhead. We envision conditional memory as an indispensable modeling primitive for next-generation sparse models.
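To make the idea of a constant-time $N$-gram lookup concrete, here is a minimal sketch of a hashed $N$-gram embedding table. It is not the Engram module: the abstract does not describe the addressing scheme, so the rolling hash, the table size, and the function names `ngram_slot`, `build_table`, and `lookup` are all assumptions chosen for illustration.

```python
import random


def ngram_slot(ngram, table_size):
    """Map a tuple of token ids to a table slot with a polynomial rolling hash.

    The hash is an illustrative assumption, not the scheme used in the paper.
    """
    h = 0
    for token_id in ngram:
        h = (h * 1000003 + token_id + 1) % table_size
    return h


def build_table(table_size, dim, seed=0):
    """Create a static table of small random vectors (stand-in for learned embeddings)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(table_size)]


def lookup(table, token_ids, position, n=2):
    """Return the memory vector for the n-gram ending at `position`.

    The cost is independent of sequence length and table size: one hash of
    at most n token ids, then one list index.
    """
    start = max(0, position - n + 1)
    ngram = tuple(token_ids[start:position + 1])
    return table[ngram_slot(ngram, len(table))]


table = build_table(table_size=4096, dim=8)
tokens = [5, 7, 9, 5, 7]
# The bigram (5, 7) occurs at positions 1 and 4 and resolves to the same row.
same = lookup(table, tokens, 1) is lookup(table, tokens, 4)
```

Because the slot depends only on the token ids, the address of every lookup is known as soon as the input is tokenized, which is the property the abstract credits for enabling prefetching from host memory.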

Preprint, 2026, arXiv

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

General reasoning represents a long-standing and formidable challenge in artificial intelligence. Recent breakthroughs, exemplified by large language models (LLMs) and chain-of-thought prompting, have achieved considerable success on foundational reasoning tasks. However, this success is heavily contingent upon extensive human-annotated demonstrations, and models' capabilities are still insufficient for more complex problems. Here we show that the reasoning abilities of LLMs can be incentivized through pure reinforcement learning (RL), obviating the need for human-labeled reasoning trajectories. The proposed RL framework facilitates the emergent development of advanced reasoning patterns, such as self-reflection, verification, and dynamic strategy adaptation. Consequently, the trained model achieves superior performance on verifiable tasks such as mathematics, coding competitions, and STEM fields, surpassing its counterparts trained via conventional supervised learning on human demonstrations. Moreover, the emergent reasoning patterns exhibited by these large-scale models can be systematically harnessed to guide and enhance the reasoning capabilities of smaller models.

Preprint, 2026, arXiv

SyncDPO: Enhancing Temporal Synchronization in Video-Audio Joint Generation via Preference Learning

Recent advancements in video-audio joint generation have achieved remarkable success in semantic correspondence. However, achieving precise temporal synchronization, which requires fine-grained alignment between audio events and their visual triggers, remains a challenging problem. Post-training for joint generation is largely dominated by Supervised Fine-Tuning, but the commonly used Mean Squared Error loss provides insufficient penalties for subtle temporal misalignments. Direct Preference Optimization (DPO) offers an alternative by introducing explicit misaligned counterparts to improve temporal sensitivity. In this paper we propose a post-training framework, SyncDPO, which leverages DPO to improve the temporal sensitivity of video-audio (V-A) joint generation. Conventional DPO pipelines typically depend on costly sampling-and-ranking procedures to construct preference pairs, resulting in substantial computational cost. To improve efficiency, we introduce a suite of on-the-fly rule-based negative construction strategies that distort temporal structures without incurring additional annotation or sampling. We demonstrate that the temporal alignment capability can be effectively reinforced by providing explicit negative supervision through temporally distorted V-A pairs. Accordingly, we implement a curriculum learning strategy that progressively increases the difficulty of negative samples, transitioning from coarse misalignment to subtle inconsistencies. Extensive objective and subjective experiments across four diverse benchmarks, ranging from ambient sound videos to human speech videos, demonstrate that SyncDPO significantly outperforms other methods in improving the model's temporal alignment capability. It also demonstrates superior generalization on an out-of-distribution benchmark by capturing intrinsic motion-sound dynamics. Demo and code are available at https://syncdpo.github.io/syncdpo/.

Preprint, 2022, arXiv

Can we detect coronal mass ejections through asymmetries of Sun-as-a-star extreme-ultraviolet spectral line profiles?

Coronal mass ejections (CMEs) are the largest-scale eruptive phenomena in the solar system. Associated with enormous plasma ejections and energy release, CMEs have an important impact on the solar-terrestrial environment. Accurate predictions of the arrival times of CMEs at the Earth depend on precise measurements of their three-dimensional velocities, which can be achieved using simultaneous line-of-sight (LOS) and plane-of-sky (POS) observations. Besides the POS information from routine coronagraph and extreme ultraviolet (EUV) imaging observations, spectroscopic observations could unveil the physical properties of CMEs including their LOS velocities. We propose that spectral line asymmetries measured by Sun-as-a-star spectrographs can be used for routine detections of CMEs and estimations of their LOS velocities during their early propagation phases. Such observations can also provide important clues for the detection of CMEs on other solar-like stars. However, few studies have concentrated on whether we can detect CME signals and accurately diagnose CME properties through Sun-as-a-star spectral observations. In this work, we constructed a geometric CME model and derived the analytical expressions for full-disk integrated EUV line profiles during CMEs. For different CME properties and instrumental configurations, full-disk integrated line profiles were synthesized. We further evaluated the detectability and diagnostic potential of CMEs from the synthetic line profiles. Our investigations provide important constraints on the future design of Sun-as-a-star spectrographs for CME detections through EUV line asymmetries.

Preprint, 2022, arXiv

Federated Learning-Based Localization with Heterogeneous Fingerprint Database

Fingerprint-based localization plays an important role in indoor location-based services, where the position information is usually collected by distributed clients and gathered at a centralized server. However, the heavy transmission load and the potential risk of divulging private information burden the application. Because it can address these challenges, federated learning (FL)-based fingerprinting localization has attracted attention; it aims to train a global model while keeping raw data local. However, in distributed machine learning (ML) scenarios, the unavoidable database heterogeneity usually degrades the performance of the existing FL-based localization algorithm (FedLoc). In this paper, we first characterize the database heterogeneity with a computable metric, i.e., the area of the convex hull, and verify it by experimental results. Then, a novel heterogeneous FL-based localization algorithm with area of convex hull-based aggregation (FedLoc-AC) is proposed. Extensive experiments, including real-world cases, are conducted. We conclude that the proposed FedLoc-AC achieves an obvious prediction gain compared to FedLoc in heterogeneous scenarios and has almost the same prediction error as FedLoc in homogeneous scenarios. Moreover, an extension of FedLoc-AC to multi-floor cases is proposed and verified.
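The heterogeneity metric named above, the area of the convex hull, can be computed as in the following sketch. It assumes each client's database is summarized by 2D reference positions given as `(x, y)` tuples. That input format and the function names are illustrative assumptions. The abstract does not say how the area enters the FedLoc-AC aggregation weights, so the sketch stops at the area itself.

```python
def cross(o, a, b):
    """z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def hull_area(points):
    """Area of the convex hull via the shoelace formula (0.0 for degenerate sets)."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical client database: four corners of a 2 x 2 room plus one interior point.
client_positions = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
area = hull_area(client_positions)
```

A client whose fingerprints cover only a small corner of the floor yields a small hull area, which is one way to see why the area can serve as a computable proxy for how representative a local database is.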

Preprint, 2022, arXiv

Influence of magnetic reconnection on the eruptive catastrophes of coronal magnetic flux ropes

Large-scale solar eruptive activities have a close relationship with coronal magnetic flux ropes. Previous numerical studies have found that the equilibrium of a coronal flux rope system could be disrupted if the axial magnetic flux of the rope exceeds a critical value, so that a catastrophe occurs, causing the flux rope to erupt. Further studies discovered that the catastrophe does not necessarily exist: a flux rope system with certain photospheric flux distributions could be non-catastrophic. It is noteworthy that most previous numerical studies were carried out under the ideal magnetohydrodynamic (MHD) condition, so it remains unclear whether the catastrophe associated with the critical axial flux exists if magnetic reconnection is included in the flux rope system. In this paper, we carried out numerical simulations to investigate the evolution of coronal magnetic rope systems under the ideal MHD and the resistive conditions. Under the ideal MHD condition, our simulation results demonstrate that flux rope systems with either too compact or too weak photospheric magnetic source regions are non-catastrophic with respect to varying axial flux of the rope, and thus no eruption could be initiated; if there is magnetic reconnection in the rope system, however, those flux rope systems could become capable of erupting via the catastrophe associated with increasing axial flux. Therefore, magnetic reconnection could significantly influence the catastrophic behaviors of flux rope systems. Both the magnetic topology and the local physical parameters related to magnetic reconnection should determine whether the increasing axial flux is able to cause flux rope eruptions.

Preprint, 2022, arXiv

Optimal Measurement of Drone Swarm in RSS-based Passive Localization with Region Constraints

Passive geolocation by multiple unmanned aerial vehicles (UAVs) covers a wide range of military and civilian applications including rescue, wildlife tracking and electronic warfare. The sensor-target geometry is known to significantly affect the localization precision. The existing sensor placement strategies mainly address cases without any constraints on the sensor locations. However, UAVs cannot simply fly or hover in arbitrary regions due to realistic constraints, such as geographical limitations, security issues, and the maximum flying speed. In this paper, optimal geometrical configurations of UAVs in received signal strength (RSS)-based localization under region constraints are investigated. Employing the D-optimal criterion, i.e., maximizing the determinant of the Fisher information matrix (FIM), the optimization problem is formulated. Based on rigorous algebraic and geometrical derivations, optimal closed-form configurations of UAVs under different flying states are proposed. Finally, the effectiveness and practicality of the proposed configurations are demonstrated by simulation examples.

Preprint, 2022, arXiv

Providing Location Information at Edge Networks: A Federated Learning-Based Approach

Recently, the development of mobile edge computing has enabled edge artificial intelligence (AI) with fast response and low communication cost. The location information of edge devices is essential to support edge AI in many scenarios, such as smart homes, intelligent transportation systems and integrated health care. Taking advantage of deep learning, the centralized machine learning (ML)-based positioning technique has received considerable attention from both academia and industry. However, some potential issues, such as location information leakage and huge data traffic, limit its application. Fortunately, a newly emerging privacy-preserving distributed ML mechanism, named federated learning (FL), is expected to alleviate these concerns. In this article, we illustrate a framework for an FL-based localization system as well as the entities involved at edge networks. Moreover, the advantages of such a system are elaborated. For its practical implementation, we investigate the field-specific issues associated with system-level solutions, which are further demonstrated on a real-world database. Finally, challenging open problems in this field are outlined.

Preprint, 2022, arXiv

Rapid Phase Ambiguity Elimination Methods for DOA Estimator via Hybrid Massive MIMO Receive Array

For a sub-connected hybrid multiple-input multiple-output (MIMO) receiver with $K$ subarrays and $N$ antennas, there exists a challenging problem of how to rapidly remove phase ambiguity within a single time slot. First, a DOA estimator that maximizes received power (Max-RP) is proposed to find the maximum of the $K$ subarray output powers, where each subarray is in charge of one sector, and the center angle of the sector corresponding to the maximum output is taken as the estimated true DOA. To improve precision, a Max-RP plus quadratic interpolation (Max-RP-QI) method is designed. In the proposed Max-RP-QI, a quadratic interpolation scheme is adopted to interpolate the three DOA values corresponding to the three largest receive powers of Max-RP. Finally, to achieve the CRLB, a Root-MUSIC plus Max-RP-QI scheme is developed. Simulation results show that the three proposed methods eliminate the phase ambiguity within one time slot and have low computational complexity. In particular, the proposed Root-MUSIC plus Max-RP-QI scheme can reach the CRLB, while the proposed Max-RP and Max-RP-QI still show performance losses of about 2 dB to 4 dB compared to the CRLB.
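The two power-based steps can be sketched as follows. The sketch treats "quadratic interpolation" as fitting a parabola through three (angle, power) points and taking its vertex; that reading, the function names, and the input format (one center angle and one measured power per sector) are assumptions, since the abstract gives no formulas.

```python
def max_rp(sector_centers, powers):
    """Max-RP: return the center angle of the sector with the largest output power."""
    best = max(range(len(powers)), key=lambda i: powers[i])
    return sector_centers[best]


def max_rp_qi(sector_centers, powers):
    """Max-RP-QI sketch: refine the estimate with the vertex of a parabola fitted
    through the three (angle, power) points with the largest powers."""
    top3 = sorted(range(len(powers)), key=lambda i: powers[i], reverse=True)[:3]
    (a, fa), (b, fb), (c, fc) = sorted((sector_centers[i], powers[i]) for i in top3)
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if den == 0:
        # The three points are collinear, so there is no vertex; fall back to Max-RP.
        return max_rp(sector_centers, powers)
    return b - 0.5 * num / den


# Hypothetical example: powers sampled from a parabola peaking at 12 degrees.
centers = [0.0, 10.0, 20.0, 30.0]
powers = [10.0 - (x - 12.0) ** 2 for x in centers]
coarse = max_rp(centers, powers)      # 10.0, the nearest sector center
refined = max_rp_qi(centers, powers)  # 12.0, the parabola's vertex
```

The coarse estimate is quantized to the sector grid, whereas the interpolated one can fall between sector centers, which is the precision gain the abstract attributes to Max-RP-QI.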

Preprint, 2022, arXiv

Two Low-complexity DOA Estimators for Massive/Ultra-massive MIMO Receive Array

Eigen-decomposition-based direction finding methods using large-scale/ultra-large-scale fully-digital receive antenna arrays lead to a high or ultra-high complexity. To address the complexity dilemma, in this paper, three low-complexity estimators are proposed: partitioned subarray auto-correlation combining (PSAC), partitioned subarray cross-correlation combining (PSCC) and power iteration max correlation successive convex approximation (PI-Max-CSCA). In contrast to conventional non-partitioned direction finding methods such as root multiple signal classification (Root-MUSIC), in the PSAC method the total set of antennas is equally partitioned into subsets of antennas, called subarrays; each subarray performs independent DOA estimation, and all DOA estimates are coherently combined to give the final estimate. For better performance, the cross-correlation among subarrays is further exploited in the PSCC method to achieve near-Cramer-Rao lower bound (CRLB) performance with the help of auto-correlation. To further reduce the complexity, in the PI-Max-CSCA method, a fraction of all subarrays is used to make an initial coarse direction measurement (ICDM), the power iteration method is adopted to compute a more precise steering vector (SV) by exploiting the total array, and a more accurate DOA value is found using the ICDM and SV through the maximum correlation method solved by successive convex approximation. Simulation results show that as the number of antennas goes to large scale, the three proposed methods can achieve a dramatic complexity reduction over conventional Root-MUSIC. In particular, the PSCC and PI-Max-CSCA can reach the CRLB while the PSAC shows a substantial performance loss.

Preprint, 2021, arXiv

Annihilation of Magnetic Islands at the Top of Solar Flare Loops

The dynamics of magnetic reconnection in the solar current sheet (CS) is studied by high-resolution 2.5-dimensional MHD simulation. With the onset of magnetic reconnection, a number of magnetic islands are formed intermittently and move quickly upward and downward along the CS. When colliding with the semi-closed flux of flare loops, the downflow islands cause a second reconnection with a rate even comparable with that in the main CS. Though the time-integrated magnetic energy release is still dominated by the reconnection in the main CS, the second reconnection can release substantial magnetic energy, annihilating the main islands and generating secondary islands with various scales at the flare loop top. The distribution function of the flux of the secondary islands is found to follow a power law varying from $f(\psi)\sim\psi^{-1}$ (small scale) to $\psi^{-2}$ (large scale), which seems to be independent of the background plasma $\beta$ and of whether thermal conduction is included. However, the spatial scale and the strength of the termination shocks driven by main reconnection outflows or islands decrease if $\beta$ increases or thermal conduction is included. We suggest that the annihilation of magnetic islands at the flare loop top, which is not included in the standard flare model, plays a non-negligible role in releasing magnetic energy to heat flare plasma and accelerate particles.

Preprint, 2021, arXiv

Comparison of Helium Abundance between ICMEs and Solar Wind near 1 AU

The helium abundance, defined as $A_{He}=n_{He}/n_{H}\times 100$, is $\sim 8.5$ in the photosphere and seldom exceeds 5 in fast solar wind. Previous statistics have demonstrated that $A_{He}$ in slow solar wind correlates tightly with sunspot number. However, less attention has been paid to the solar cycle dependence of $A_{He}$ within interplanetary coronal mass ejections (ICMEs) and to comparing the $A_{He}$ characteristics of ICMEs and solar wind. In this paper we conduct a statistical comparison of helium abundance between ICMEs and solar wind near 1 AU with observations from the Advanced Composition Explorer from 1998 to 2019, and find that the ICME $A_{He}$ also exhibits an obvious solar cycle dependence. Meanwhile, we find that $A_{He}$ is obviously higher within ICMEs than in solar wind, and the means within 37% and 12% of ICMEs exceed 5 and 8.5, respectively. It is interesting to ask where and how the high helium abundance originates. Our statistics demonstrate that 21% (3%) of ICME (slow wind) $A_{He}$ data points exceed 8.5 around solar maximum, a fraction that decreases dramatically near minimum, while no such high $A_{He}$ values appear in the fast wind throughout the whole solar cycle. This indicates that the high $A_{He}$ (e.g., $>8.5$) emanates from active regions, as more ICMEs and slow wind originate from active regions around maximum, and supports the view that both active regions and quiet-Sun regions are sources of slow wind. We suggest that the high $A_{He}$ from active regions could be explained by the magnetic loop confinement model and/or the photoionization effect.

Preprint, 2021, arXiv

Spectral compression by phase doubling in second harmonic generation

In second harmonic generation, the phase of the optical field is doubled, which has important implications. Here the phase doubling effect is utilized to solve a long-standing challenge in power scaling of single-frequency lasers. When a (-π/2, π/2) binary phase modulation is applied to a single-frequency seed laser to broaden the spectrum and suppress stimulated Brillouin scattering in a high-power fiber amplifier, the second harmonic of the phase-modulated laser returns to single frequency, because the (-π/2, π/2) modulation is doubled to (-π, π) for the second harmonic. A compression rate as high as 95% is demonstrated in the experiment, limited by the electronic bandwidth of the setup, which can be improved with optimized devices.
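The phase-doubling argument can be written out in a few lines. The derivation below is a sketch under idealized assumptions that the abstract does not state: a constant amplitude $A$, an instantaneous quadratic response, and perfectly sharp switching of the binary phase $\varphi(t)$.

```latex
% Fundamental field with binary phase modulation (A, \omega and \varphi are notation introduced here)
E_1(t) = A\,e^{i\left(\omega t + \varphi(t)\right)}, \qquad \varphi(t) \in \left\{-\tfrac{\pi}{2},\, +\tfrac{\pi}{2}\right\}

% The second harmonic is proportional to the square of the fundamental, so the phase doubles
E_2(t) \propto E_1(t)^2 = A^2\,e^{i\left(2\omega t + 2\varphi(t)\right)}, \qquad 2\varphi(t) \in \{-\pi,\, +\pi\}

% Both doubled phase values give the same factor, so the modulation drops out
e^{-i\pi} = e^{+i\pi} = -1 \quad\Longrightarrow\quad E_2(t) \propto -A^2\,e^{i\,2\omega t}
```

Under these assumptions the second harmonic carries no residual phase modulation. In practice the phase transitions take a finite time, which is consistent with the abstract's statement that the measured compression is limited by the electronic bandwidth of the setup.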

Preprint, 2020, arXiv

Real-time Human Activity Recognition Using Conditionally Parametrized Convolutions on Mobile and Wearable Devices

Recently, deep learning has represented an important research trend in human activity recognition (HAR). In particular, deep convolutional neural networks (CNNs) have achieved state-of-the-art performance on various HAR datasets. For deep learning, improvements in performance rely heavily on increasing model size or capacity to scale to larger and larger datasets, which inevitably leads to an increase in operations. A high number of operations in deep learning increases computational cost and is not suitable for real-time HAR using mobile and wearable sensors. Though shallow learning techniques are often lightweight, they cannot achieve good performance. Therefore, deep learning methods that can balance the trade-off between accuracy and computation cost are highly needed, a topic which to our knowledge has seldom been researched. In this paper, we for the first time propose a computation-efficient CNN using conditionally parametrized convolution for real-time HAR on mobile and wearable devices. We evaluate the proposed method on four public benchmark HAR datasets, namely the WISDM, PAMAP2, UNIMIB-SHAR, and OPPORTUNITY datasets, achieving state-of-the-art accuracy without compromising computation cost. Various ablation experiments are performed to show how such a network with large capacity is clearly preferable to the baseline while requiring a similar amount of operations. The method can be used as a drop-in replacement for existing deep HAR architectures and easily deployed onto mobile and wearable devices for real-time HAR applications.