Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
16 works
0 followers
19 topics
4 close collaborators

Published work

16 published items

preprint, 2026, arXiv

Beyond Continuity: Simulation-free Reconstruction of Discrete Branching Dynamics from Single-cell Snapshots

Inferring cellular trajectories from destructive snapshots is complicated by the challenges of stochasticity and non-conservative mass dynamics such as cell proliferation and apoptosis. Existing unbalanced Optimal Transport (OT) methods treat mass as a continuous fluid, performing inference at the population level. However, this macroscopic view often fails to capture the discrete, jump-like nature of birth-death events at single-cell resolution, which is essential for understanding lineage branching and fate decisions. We present Unbalanced Schrödinger Bridge (USB), a simulation-free framework for learning underlying dynamics that integrates both stochastic and unbalanced effects and also models the discrete, jump-like birth-death dynamics at single-cell resolution. Theoretically, USB provides a tractable solution to the Branching Schrödinger Bridge (BSB) problem, offering a rigorous microscopic interpretation where individual cells undergo both Brownian motion and discrete birth-death jumps. Technically, the method implements an efficient solver by introducing a simulation-free training objective that scales to high-dimensional omics data. Empirically, we demonstrate on both simulated and real-world datasets that USB not only achieves trajectory reconstruction performance better than or comparable to deterministic baselines but also uniquely enables realistic discrete simulation of birth-death dynamics at single-cell resolution.

preprint, 2026, arXiv

Bridging Values and Behavior: A Hierarchical Framework for Proactive Embodied Agents

Current embodied agents are often limited to passive instruction-following or reactive need-satisfaction, lacking a stable, high-order value framework essential for long-term, self-directed behavior and resolving motivational conflicts. We introduce ValuePlanner, a hierarchical cognitive architecture that decouples high-level value scheduling from low-level action execution. ValuePlanner employs an LLM-based cognitive module to generate symbolic subgoals by reasoning through abstract value trade-offs, which are then translated into executable action plans by a classical PDDL planner. This process is refined via a closed-loop feedback mechanism. Evaluating such autonomy requires methods beyond task-success rates, and we therefore propose a value-centric evaluation suite measuring cumulative value gain, preference alignment, and behavioral diversity. Experiments in the TongSim household environment demonstrate that ValuePlanner arbitrates competing values to generate coherent, long-horizon, self-directed behavior absent from instruction-following and needs-driven baselines. Our work offers a structured approach to bridging intrinsic values and grounded behavior for autonomous agents.

preprint, 2026, arXiv

Byzantine-Robust Distributed Sparse Learning Revisited

We revisit Byzantine-robust distributed estimation for high-dimensional sparse linear models. By combining local $\ell_1$-regularized robust estimation with robust aggregation at the server, the framework applies to pseudo-Huber regression, quantile regression, and sparse SVM. We show that the resulting estimators yield non-asymptotic guarantees and attain near-optimal statistical rates under mild conditions, while remaining communication-efficient. Simulations confirm strong robustness in estimation, support recovery, and classification accuracy under various Byzantine attacks.
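As a rough illustration of what robust aggregation at the server can look like, the sketch below takes the coordinate-wise median of per-worker coefficient vectors. The abstract does not say which aggregator the paper uses, so the coordinate-wise median, the function name `coordinate_wise_median`, and the toy data are all assumptions made for illustration only.

```python
from statistics import median


def coordinate_wise_median(local_estimates):
    """Aggregate per-worker coefficient vectors by the median of each coordinate.

    Assumption: this is one common robust aggregator, not necessarily
    the one used in the paper.
    """
    if not local_estimates:
        raise ValueError("need at least one local estimate")
    dim = len(local_estimates[0])
    if any(len(v) != dim for v in local_estimates):
        raise ValueError("all local estimates must have the same length")
    return [median(v[j] for v in local_estimates) for j in range(dim)]


# Hypothetical data: four honest workers and one Byzantine worker
# that reports arbitrarily corrupted values.
honest = [
    [1.0, 0.0, -2.0],
    [1.1, 0.0, -1.9],
    [0.9, 0.0, -2.1],
    [1.0, 0.0, -2.0],
]
byzantine = [[1e6, -1e6, 1e6]]

aggregated = coordinate_wise_median(honest + byzantine)
print(aggregated)
```

With a minority of corrupted workers, each coordinate's median stays within the range of the honest values, which is the property a plain average lacks.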

preprint, 2026, arXiv

How vehicles change lanes after encountering crashes: Empirical analysis and modeling

When a traffic crash occurs, following vehicles need to change lanes to bypass the obstruction. We define these maneuvers as post-crash lane changes (LCs). In such scenarios, vehicles in the target lane may refuse to yield even after the lane change has already begun, increasing the complexity and crash risk of post-crash LCs. However, the behavioral characteristics and motion patterns of post-crash LCs remain unknown. To address this gap, we construct a post-crash LC dataset by extracting vehicle trajectories from drone videos captured after crashes. Our empirical analysis reveals that, compared to mandatory LCs (MLCs) and discretionary LCs (DLCs), post-crash LCs exhibit longer durations, lower insertion speeds, and higher crash risks. Notably, 79.4% of post-crash LCs involve at least one instance of non-yielding behavior from the new follower, compared to 21.7% for DLCs and 28.6% for MLCs. Building on these findings, we develop a novel trajectory prediction framework for post-crash LCs. At its core is a graph-based attention module that explicitly models yielding behavior as an auxiliary interaction-aware task. This module is designed to guide both a conditional variational autoencoder and a Transformer-based decoder to predict the lane changer's trajectory. By incorporating the interaction-aware module, our model outperforms existing baselines in trajectory prediction performance by more than 10% in both average displacement error and final displacement error across different prediction horizons. Moreover, our model provides more reliable crash risk analysis by reducing false crash rates and improving conflict prediction accuracy. Finally, we validate the model's transferability using additional post-crash LC datasets collected from different sites.
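For readers unfamiliar with the two metrics named in this abstract, the sketch below computes average displacement error (ADE) and final displacement error (FDE) in their standard form for a single predicted trajectory. The function name `displacement_errors` and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math


def displacement_errors(predicted, actual):
    """Return (ADE, FDE) for one trajectory.

    ADE is the mean Euclidean distance over all predicted time steps;
    FDE is the Euclidean distance at the last time step.
    """
    if not predicted or len(predicted) != len(actual):
        raise ValueError("trajectories must be non-empty and equally long")
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    ade = sum(distances) / len(distances)
    fde = distances[-1]
    return ade, fde


# Hypothetical (x, y) positions in meters at three time steps.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 3.0), (2.0, 4.0)]

ade, fde = displacement_errors(pred, truth)
print(ade, fde)
```

In practice both values are usually averaged over many trajectories and reported per prediction horizon.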

preprint, 2026, arXiv

The AI Hippocampus: How Far are We From Human Memory?

Memory plays a foundational role in augmenting the reasoning, adaptability, and contextual fidelity of modern Large Language Models and Multi-Modal LLMs. As these models transition from static predictors to interactive systems capable of continual learning and personalized inference, the incorporation of memory mechanisms has emerged as a central theme in their architectural and functional evolution. This survey presents a comprehensive and structured synthesis of memory in LLMs and MLLMs, organizing the literature into a cohesive taxonomy comprising implicit, explicit, and agentic memory paradigms. Specifically, the survey delineates three primary memory frameworks. Implicit memory refers to the knowledge embedded within the internal parameters of pre-trained transformers, encompassing their capacity for memorization, associative retrieval, and contextual reasoning. Recent work has explored methods to interpret, manipulate, and reconfigure this latent memory. Explicit memory involves external storage and retrieval components designed to augment model outputs with dynamic, queryable knowledge representations, such as textual corpora, dense vectors, and graph-based structures, thereby enabling scalable and updatable interaction with information sources. Agentic memory introduces persistent, temporally extended memory structures within autonomous agents, facilitating long-term planning, self-consistency, and collaborative behavior in multi-agent systems, with relevance to embodied and interactive AI. Extending beyond text, the survey examines the integration of memory within multi-modal settings, where coherence across vision, language, audio, and action modalities is essential. Key architectural advances, benchmark tasks, and open challenges are discussed, including issues related to memory capacity, alignment, factual consistency, and cross-system interoperability.

preprint, 2026, arXiv

Variable Basis Mapping for Real-Time Volumetric Visualization

Real-time visualization of large-scale volumetric data remains challenging, as direct volume rendering and voxel-based methods suffer from prohibitively high computational cost. We propose Variable Basis Mapping (VBM), a framework that transforms volumetric fields into 3D Gaussian Splatting (3DGS) representations through wavelet-domain analysis. First, we precompute a compact Wavelet-to-Gaussian Transition Bank that provides optimal Gaussian surrogates for canonical wavelet atoms across multiple scales. Second, we perform analytical Gaussian construction that maps discrete wavelet coefficients directly to 3DGS parameters using a closed-form, mathematically principled rule. Finally, a lightweight image-space fine-tuning stage further refines the representation to improve rendering fidelity. Experiments on diverse datasets demonstrate that VBM significantly accelerates convergence and enhances rendering quality, enabling real-time volumetric visualization.

preprint, 2025, arXiv

BEDA: Belief Estimation as Probabilistic Constraints for Performing Strategic Dialogue Acts

Strategic dialogue requires agents to execute distinct dialogue acts, for which belief estimation is essential. While prior work often estimates beliefs accurately, it lacks a principled mechanism to use those beliefs during generation. We bridge this gap by first formalizing two core acts, Adversarial and Alignment, and by operationalizing them via probabilistic constraints on what an agent may generate. We instantiate this idea in BEDA, a framework that consists of the world set, the belief estimator for belief estimation, and the conditional generator that selects acts and realizes utterances consistent with the inferred beliefs. Across three settings, Conditional Keeper Burglar (CKBG, adversarial), Mutual Friends (MF, cooperative), and CaSiNo (negotiation), BEDA consistently outperforms strong baselines: on CKBG it improves success rate by at least 5.0 points across backbones and by 20.6 points with GPT-4.1-nano; on Mutual Friends it achieves an average improvement of 9.3 points; and on CaSiNo it achieves the optimal deal relative to all baselines. These results indicate that casting belief estimation as constraints provides a simple, general mechanism for reliable strategic dialogue.

preprint, 2025, arXiv

Vestigial $d$-wave charge-$4e$ Superconductivity from Bidirectional Pair Density Waves

We analyze the leading vestigial instability due to the melting of a bidirectional pair-density-wave state in two dimensions. In a previous work by one of the authors, it was found that the interplay between pair-density-wave fluctuations with ordering momenta along the $x$ and $y$ directions can provide a strong attractive interaction for charge-$4e$ superconductivity in the $d$-wave channel. In this work, we go beyond the artificial large-$M$ mean-field theory previously adopted and compute the phase diagram by incorporating phase fluctuations of the pair-density-wave order parameters. By investigating the relevance of various topological defects, we show that the interaction in the $d$-wave channel, together with the strong anisotropy of phase fluctuations around the pair-density-wave ordering momenta, favors a vestigial charge-$4e$ superconducting order at intermediate temperatures. By contrast, a competing charge-density-wave vestigial order does not develop, due to the suppression of its stiffness.

preprint, 2024, arXiv

Many-body higher-order topological invariant for $C_n$-symmetric insulators

Higher-order topological insulators in two spatial dimensions display fractional corner charges. While fractional charges in one dimension are known to be captured by a many-body bulk invariant, computed by the Resta formula, a many-body bulk invariant for higher-order topology and the corresponding fractional corner charges remains elusive despite several attempts. Inspired by recent work by Tada and Oshikawa, we propose a well-defined many-body bulk invariant for $C_n$ symmetric higher-order topological insulators, which is valid for both non-interacting and interacting systems. Instead of relating them to the bulk quadrupole moment as was previously done, we show that in the presence of $C_n$ rotational symmetry, this bulk invariant can be directly identified with quantized fractional corner charges. In particular, we prove that the corner charge is quantized as $e/n$ with $C_n$ symmetry, leading to a $\mathbb{Z}_n$ classification for higher-order topological insulators in two dimensions.

preprint, 2024, arXiv

STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering

Recently we have witnessed the rapid development of video question answering models. However, most models can only handle simple videos in terms of temporal reasoning, and their performance tends to drop when answering temporal-reasoning questions on long and informative videos. To tackle this problem we propose STAIR, a Spatial-Temporal Reasoning model with Auditable Intermediate Results for video question answering. STAIR is a neural module network, which contains a program generator to decompose a given question into a hierarchical combination of several sub-tasks, and a set of lightweight neural modules to complete each of these sub-tasks. Though neural module networks are already widely studied on image-text tasks, applying them to videos is a non-trivial task, as reasoning on videos requires different abilities. In this paper, we define a set of basic video-text sub-tasks for video question answering and design a set of lightweight modules to complete them. Different from most prior works, modules of STAIR return intermediate outputs specific to their intentions instead of always returning attention maps, which makes it easier to interpret and collaborate with pre-trained models. We also introduce intermediate supervision to make these intermediate outputs more accurate. We conduct extensive experiments on several video question answering datasets under various settings to show STAIR's performance, explainability, compatibility with pre-trained models, and applicability when program annotations are not available. Code: https://github.com/yellow-binary-tree/STAIR

preprint, 2022, arXiv

An Efficient Algorithm for the Partitioning Min-Max Weighted Matching Problem

The Partitioning Min-Max Weighted Matching (PMMWM) problem is an NP-hard problem that combines the problem of partitioning a group of vertices of a bipartite graph into disjoint subsets with limited size and the classical Min-Max Weighted Matching (MMWM) problem. Kress et al. proposed this problem in 2015 and also provided several algorithms, among which MP$_{\text{LS}}$ is the state of the art. In this work, we observe that there is a time bottleneck in the matching phase of MP$_{\text{LS}}$. Hence, we optimize the redundant operations during the matching iterations and propose an efficient algorithm called MP$_{\text{KM-M}}$ that greatly speeds up MP$_{\text{LS}}$. The bottleneck time complexity is reduced from $O(n^3)$ to $O(n^2)$. We also prove the correctness of MP$_{\text{KM-M}}$ by the primal-dual method. To test the performance on diverse instances, we generate benchmarks of various types and sizes and carry out an extensive computational study on the performance of MP$_{\text{KM-M}}$ and MP$_{\text{LS}}$. The evaluation results show that MP$_{\text{KM-M}}$ greatly shortens the runtime compared with MP$_{\text{LS}}$ while yielding the same solution quality.

preprint, 2022, arXiv

Memory Augmented Lookup Dictionary based Language Modeling for Automatic Speech Recognition

Recent studies have shown that using an external Language Model (LM) benefits end-to-end Automatic Speech Recognition (ASR). However, predicting tokens that appear less frequently in the training set is still quite challenging. Long-tail prediction problems have been widely studied in many applications, but have only been addressed by a few studies for ASR and LMs. In this paper, we propose a new memory-augmented, lookup-dictionary-based Transformer architecture for LMs. The newly introduced lookup dictionary incorporates rich contextual information from the training set, which is vital to correctly predicting long-tail tokens. In intensive experiments on Chinese and English data sets, our proposed method is shown to outperform the baseline Transformer LM by a large margin on both word/character error rate and tail-token error rate. This is achieved without impact on decoding efficiency. Overall, we demonstrate the effectiveness of our proposed method in boosting ASR decoding performance, especially for long-tail tokens.

preprint, 2022, arXiv

Mixed QCD-EW corrections for Higgs leptonic decay via $HW^+W^-$ vertex

We consider the two-loop corrections to the $HW^+W^-$ vertex at order $αα_s$. We construct a canonical basis for the two-loop integrals using the Baikov representation and the intersection theory. By solving the $ε$-form differential equations, we obtain fully analytic expressions for the master integrals in terms of multiple polylogarithms, which allow fast and accurate numeric evaluation for arbitrary configurations of external momenta. We apply our analytic results to the decay process $H \to ν_e e W$, and study both the integrated and differential decay rates. Our results can also be applied to the Higgs production process via $W$ boson fusion.

preprint, 2022, arXiv

The dynamical exponent of a quantum critical itinerant ferromagnet: a Monte Carlo study

We consider the effect of the coupling between 2D quantum rotors near an XY ferromagnetic quantum critical point and spins of itinerant fermions. We analyze how this coupling affects the dynamics of rotors and the self-energy of fermions. A common belief is that near a $q=0$ ferromagnetic transition, fermions induce an $Ω/q$ Landau damping of rotors (i.e., the dynamical critical exponent is $z=3$) and Landau overdamped rotors give rise to non-Fermi liquid fermionic self-energy $Σ\propto ω^{2/3}$. This behavior has been confirmed in previous quantum Monte Carlo (QMC) studies. Here we show that for the XY case the behavior is different. We report the results of large-scale quantum Monte Carlo simulations, which show that at small frequencies $z=2$ and $Σ\propto ω^{1/2}$. We argue that the new behavior is associated with the fact that a fermionic spin is by itself not a conserved quantity due to spin-spin coupling to rotors, and a combination of self-energy and vertex corrections replaces $1/q$ in the Landau damping by a constant. We discuss the implications of these results for experiments.

preprint, 2022, arXiv

The USTC-Ximalaya system for the ICASSP 2022 multi-channel multi-party meeting transcription (M2MeT) challenge

We propose two improvements to target-speaker voice activity detection (TS-VAD), the core component in our proposed speaker diarization system that was submitted to the 2022 Multi-Channel Multi-Party Meeting Transcription (M2MeT) challenge. These techniques are designed to handle multi-speaker conversations in real-world meeting scenarios with high speaker-overlap ratios and under heavily reverberant and noisy conditions. First, for data preparation and augmentation in training TS-VAD models, speech data containing both real meetings and simulated indoor conversations are used. Second, in refining results obtained after TS-VAD-based decoding, we perform a series of post-processing steps to improve the VAD results needed to reduce diarization error rates (DERs). Tested on the ALIMEETING corpus, the newly released Mandarin meeting dataset used in M2MeT, we demonstrate that our proposed system can decrease the DER by up to 66.55%/60.59% relative to classical clustering-based diarization on the Eval/Test sets.
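The DER figures in this abstract are relative reductions, not absolute percentage-point drops. The sketch below shows the arithmetic under that reading; the function name `relative_reduction` and the example DER values are hypothetical and are not results from the paper.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# Hypothetical DERs in percent: a baseline of 20.0 and an improved system at 8.0.
example = relative_reduction(20.0, 8.0)
print(example)  # 12 points absolute, which is a 60% relative reduction
```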

preprint, 2020, arXiv

Chiral Dirac Superconductors: Second-order and Boundary-obstructed Topology

We analyze the topological properties of a chiral ${p}+i{p}$ superconductor for a two-dimensional metal/semimetal with four Dirac points. Such a system has been proposed to realize second-order topological superconductivity and host corner Majorana modes. We show that with an additional $\mathsf{C}_4$ rotational symmetry, the system is in an intrinsic higher-order topological superconductor phase, and with a lower and more natural $\mathsf{C}_2$ symmetry, it is in a boundary-obstructed topological superconductor phase. The boundary topological obstruction is protected by a bulk Wannier gap. However, we show that the well-known nested Wilson loop is in general unquantized despite the particle-hole symmetry, and thus fails as a topological invariant. Instead, we show that the higher-order topology and boundary-obstructed topology can be characterized using an alternative defect classification approach, in which the corners of a finite sample are treated as defects of a space-filling Hamiltonian. We establish "Dirac+$({p}+i{p})$" as a sufficient condition for second-order topological superconductivity.