Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
21 works
0 followers
17 topics
4 close collaborators

Published work

21 published items

preprint · 2026 · arXiv

CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games

Recent advances in Vision-Language-Action models (VLAs) have expanded the capabilities of embodied intelligence. However, significant challenges remain in real-time decision-making in complex 3D environments, which demand second-level responses, high-resolution perception, and tactical reasoning under dynamic conditions. To advance the field, we introduce CombatVLA, an efficient VLA model optimized for combat tasks in 3D action role-playing games (ARPGs). Specifically, our CombatVLA is a 3B model trained on video-action pairs collected by an action tracker, where the data is formatted as action-of-thought (AoT) sequences. Thereafter, CombatVLA seamlessly integrates into an action execution framework, allowing efficient inference through our truncated AoT strategy. Experimental results demonstrate that CombatVLA not only outperforms all existing models on the combat understanding benchmark but also achieves a 50-fold acceleration in game combat. Moreover, it has a higher task success rate than human players. We will open-source all resources, including the action tracker, dataset, benchmark, model weights, training code, and the implementation of the framework at https://combatvla.github.io/.

preprint · 2026 · arXiv

Crisis-Bench: Benchmarking Strategic Ambiguity and Reputation Management in Large Language Models

Standard safety alignment optimizes Large Language Models (LLMs) for universal helpfulness and honesty, effectively instilling a rigid "Boy Scout" morality. While robust for general-purpose assistants, this one-size-fits-all ethical framework imposes a "transparency tax" on professional domains requiring strategic ambiguity and information withholding, such as public relations, negotiation, and crisis management. To measure this gap between general safety and professional utility, we introduce Crisis-Bench, a multi-agent Partially Observable Markov Decision Process (POMDP) that evaluates LLMs in high-stakes corporate crises. Spanning 80 diverse storylines across 8 industries, Crisis-Bench tasks an LLM-based Public Relations (PR) Agent with navigating a dynamic 7-day corporate crisis simulation while managing strictly separated Private and Public narrative states to enforce rigorous information asymmetry. Unlike traditional benchmarks that rely on static ground truths, we introduce the Adjudicator-Market Loop: a novel evaluation metric where public sentiment is adjudicated and translated into a simulated stock price, creating a realistic economic incentive structure. Our results expose a critical dichotomy: while some models capitulate to ethical concerns, others demonstrate the capacity for Machiavellian, legitimate strategic withholding in order to stabilize the simulated stock price. Crisis-Bench provides the first quantitative framework for assessing "Reputation Management" capabilities, arguing for a shift from rigid moral absolutism to context-aware professional alignment.

preprint · 2026 · arXiv

DDA-Thinker: Decoupled Dual-Atomic Reinforcement Learning for Reasoning-Driven Image Editing

Recent image editing models have achieved strong visual fidelity but often struggle with tasks requiring complex reasoning. To investigate and enhance the reasoning-grounded planning for image editing, we propose DDA-Thinker, a Thinker-centric framework designed for the independent optimization of a planning module (Thinker) over a fixed generative model (Editor). This decoupled Thinker-centric paradigm facilitates a controlled analysis of the planning module and makes its contribution under a fixed Editor easier to assess. To effectively guide this Thinker, we introduce a dual-atomic reinforcement learning framework. This framework decomposes feedback into two distinct atomic rewards implemented through verifiable checklists: a cognitive-atomic reward to directly assess the quality of the Thinker's executable plan, which serves as the actionable outcome of the Thinker's reasoning, and a visual-atomic reward to assess the final image quality. To improve checklist quality, our checklist synthesis is grounded not only in the source image and user instruction but also in a rational reference description of the ideal post-edit scene. To support this training, we further develop a two-stage data curation pipeline that first synthesizes a diverse and reasoning-focused dataset, then applies difficulty-aware refinement to curate an effective training curriculum for reinforcement learning. Extensive experiments on reasoning-driven image editing benchmarks, including RISE-Bench and KRIS-Bench, demonstrate that our approach substantially improves overall performance. Our method enables a community model to achieve results competitive with strong proprietary models, highlighting the practical potential of Thinker-centric optimization under a fixed-editor setting.

preprint · 2026 · arXiv

Deep Pre-Alignment for VLMs

Most Vision Language Models (VLMs) directly map outputs from ViT encoders to the LLM via a lightweight projector. While effective, recent analysis suggests this architecture suffers from an alignment challenge: visual features remain distant from the text space in the initial layers of the LLM, forcing the model to waste critical depth (Zhang et al., 2024; Artzy and Schwartz, 2024) on superficial modality alignment rather than deep understanding and complex reasoning. In this work, we propose Deep Pre-Alignment (DPA), a novel architecture that replaces the standard ViT encoder with a small VLM as perceiver, ensuring visual features are deeply aligned with the text space of the target large language model. Comprehensive experiments demonstrate the effectiveness of DPA. On the 4B parameter scale, DPA outperforms baselines by 1.9 points across 8 multimodal benchmarks, with gains widening to 3.0 points at the 32B scale. Moreover, by offloading alignment to the perceiver, DPA achieves a 32.9% reduction in language capability forgetting over 3 text benchmarks. We further demonstrate that these gains are consistent across different LLM families including Qwen3 and LLaMA 3.2, highlighting the generality of our approach. Beyond performance, DPA also offers a seamless upgrade path for current VLM development, requiring only a modular replacement for the visual encoder with marginal computation overhead.

preprint · 2026 · arXiv

Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models

Large vision-language models (LVLMs) excel at visual understanding, but face efficiency challenges due to quadratic complexity in processing long multi-modal contexts. While token compression can reduce computational costs, existing approaches are designed for single-view LVLMs and fail to consider the unique multi-view characteristics of high-resolution LVLMs with dynamic cropping. Existing methods treat all tokens uniformly, but our analysis reveals that global thumbnails can naturally guide the compression of local crops by providing holistic context for informativeness evaluation. In this paper, we first analyze the dynamic cropping strategy, revealing both the complementary nature between thumbnails and crops, and the distinctive characteristics across different crops. Based on our observations, we propose "Global Compression Commander" (i.e., GlobalCom$^2$), a novel plug-and-play token compression framework for HR-LVLMs. GlobalCom$^2$ leverages the thumbnail as the "commander" to guide the compression of local crops, adaptively preserving informative details while eliminating redundancy. Extensive experiments show that GlobalCom$^2$ maintains over 90% performance while compressing 90% of visual tokens, reducing FLOPs and peak memory to 9.1% and 60%, respectively.

preprint · 2026 · arXiv

Reinforcement Learning of Large Language Models for Interpretable Credit Card Fraud Detection

E-commerce platforms and payment solution providers face increasingly sophisticated fraud schemes, ranging from identity theft and account takeovers to complex money laundering operations that exploit the speed and anonymity of digital transactions. However, despite their theoretical promise, the application of Large Language Models (LLMs) to fraud detection in real-world financial contexts remains largely unexploited, and their practical effectiveness in handling domain-specific e-commerce transaction data has yet to be empirically validated. To bridge this gap between conventional machine learning limitations and the untapped potential of LLMs in fraud detection, this paper proposes a novel approach that employs Reinforcement Learning (RL) to post-train lightweight language models specifically for fraud detection tasks using only raw transaction data. We utilize the Group Sequence Policy Optimization (GSPO) algorithm combined with a rule-based reward system to fine-tune language models of various sizes on a real-life transaction dataset provided by a Chinese global payment solution company. Through this reinforcement learning framework, the language models are encouraged to explore diverse trust and risk signals embedded within the textual transaction data, including patterns in customer information, shipping details, product descriptions, and order history. Our experimental results demonstrate the effectiveness of this approach, with post-trained language models achieving substantial F1-score improvements on held-out test data. Our findings demonstrate that the observed performance improvements are primarily attributable to the exploration mechanism inherent in reinforcement learning, which allows models to discover novel fraud indicators beyond those captured by traditional engineered features.

preprint · 2022 · arXiv

Collision centrality and energy dependence of strange hadron production in Au + Au collisions at $\sqrt{s_{NN}}=$ 7.7-54.4 GeV

We apply an equal-velocity quark combination model to systematically study the transverse momentum ($p_{T}$) spectra of strange hadrons $K_{S}^{0}$, $ϕ$, $Λ$, $Ξ^{-}$, $Ω^{-}$, $\bar{Λ}$, $\bar{Ξ}^{+}$ and $\bar{Ω}^{+}$ at mid-rapidity in Au+Au collisions at $\sqrt{s_{NN}}=$ 7.7, 11.5, 19.6, 27, 39, 54.4 GeV. Relative deviation between the model calculation and experimental data of these eight hadrons is generally about 2-3% at $\sqrt{s_{NN}}=$ 27, 39, 54.4 GeV and in central collisions at 7.7, 11.5, 19.6 GeV. The deviation slightly increases up to about 4% in the semi-central and peripheral collision at $\sqrt{s_{NN}}=$ 7.7, 11.5, 19.6 GeV. We systematically explain the dependence of two baryon-to-meson ratios $\bar{Λ}/K_{S}^{0}$ and $Ω/ϕ$ on $p_{T}$, collision centrality and collision energy by the property of quark $p_{T}$ spectra at hadronization. We derive the analytic relations between $R_{CP}$ of hadrons and those of quarks, and we use them to naturally explain the species and $p_{T}$ dependence of $R_{CP}$ of those strange hadrons.

preprint · 2022 · arXiv

Production characteristics of light (anti-)nuclei from (anti-)nucleon coalescence in heavy ion collisions at energies employed at the RHIC beam energy scan

With the kinetic freeze-out nucleons and antinucleons obtained from the quark combination model, we study the production of light nuclei and antinuclei in the (anti-)nucleon coalescence mechanism in relativistic heavy ion collisions. We derive analytic formulas of the momentum distributions of different light nuclei and apply them to compute transverse momentum ($p_T$) spectra of (anti-)deuterons ($d$, $\bar d$) and (anti-)tritons ($t$, $\bar t$) in Au-Au collisions at $\sqrt{s_{NN}}=$7.7, 11.5, 19.6, 27, 39, 54.4 GeV. We find that the experimental data available for these $p_T$ spectra can be well reproduced. We further study the yields and yield ratios of different light (anti-)nuclei and naturally explain their interesting behaviors as a function of the collision energy. We especially point out that the multi-particle yield ratio $tp/d^2$ should be carefully corrected from hyperon weak decays for protons to probe the production characteristics of light nuclei. All of our results show that the coalescence mechanism for (anti-)nucleons plays a dominant role for the production of light nuclei and antinuclei at the RHIC beam energy scan energies.

preprint · 2022 · arXiv

Production of single-charm hadrons by quark combination mechanism in $p$-Pb collisions at $\sqrt{s_{NN}}=5.02$ TeV

If a QGP-like medium is created in $p$-Pb collisions at extremely high collision energies, charm quarks that move in the medium can hadronize by capturing the co-moving light quark(s) or anti-quark(s) to form the charm hadrons. Using light quark $p_{T}$ spectra extracted from the experimental data of light-flavor hadrons and a charm quark $p_{T}$ spectrum that is consistent with perturbative QCD calculations, the central-rapidity data of $p_{T}$ spectra and the spectrum ratios for $D$ mesons in the low $p_{T}$ range ($p_{T}\lesssim7$ GeV/$c$) in minimum-bias $p$-Pb collisions at $\sqrt{s_{NN}}=5.02$ TeV are well described by the quark combination mechanism in the equal-velocity combination approximation. The $Λ_{c}^{+}/D^{0}$ ratio in the quark combination mechanism exhibits the typical increase-peak-decrease behavior as a function of $p_{T}$, and the shape of the ratio for $p_{T}\gtrsim3$ GeV/$c$ is in agreement with the data of the ALICE collaboration in the central rapidity region $-0.96<y<0.04$ and the preliminary data of the LHCb collaboration in the forward rapidity region $1.5<y<4.0$. The global production of single-charm baryons is quantified using the data and the possible enhancement (relative to light flavor baryons) is discussed. The $p_{T}$ spectra of $Ξ_{c}^{0}$, $Ω_{c}^{0}$ in minimum-bias events and those of single-charm hadrons in high-multiplicity event classes are predicted, which serve as a further test of the possible change of the hadronization characteristic for low $p_{T}$ charm quarks in the small system created in $p$-Pb collisions at LHC energies.

preprint · 2021 · arXiv

Elliptic flow of hadrons in equal-velocity quark combination mechanism in relativistic heavy-ion collisions

We apply a quark combination model with the equal-velocity combination (EVC) approximation to study the elliptic flow ($v_{2}$) of hadrons in heavy-ion collisions in a wide collision energy range ($\sqrt{s_{NN}}=$ 27 - 5020 GeV). Utilizing the simple relationship between $v_{2}$ of hadrons and those of quarks under EVC, we find that $v_{2}$ of up/down quarks obtained by experimental data of proton is consistent with that obtained by data of $Λ$ and $Ξ$. $v_{2}$ of strange quarks obtained by data of $Ω$ is consistent with that obtained by data of $Λ$ and $Ξ$, and at RHIC energies it is also consistent with that obtained by data of $ϕ$. This means that $v_{2}$ of these hadrons has a common quark-level source. Using data of $D^0$, we obtain $v_{2}$ of charm quarks with $p_T\lesssim 6$ GeV/c. We find that under EVC the charm quark dominates $v_{2}$ of $D$ mesons at low $p_{T}$ but light-flavor quarks significantly contribute to $v_{2}$ of $D$ mesons in the range $3\lesssim p_{T}\lesssim8$ GeV/c. We predict $v_{2}$ of charmed baryons $Λ_{c}^{+}$ and $Ξ_{c}^{0}$ which show a significant enhancement at intermediate $p_{T}$ due to the double contribution of light-flavor quarks. The properties of the obtained quark $v_{2}$ under EVC are studied and a regularity for $v_{2}$ of quarks as a function of $p_{T}/m$ is found.

preprint · 2021 · arXiv

Multivariate functional group sparse regression: functional predictor selection

In this paper, we propose methods for simultaneous functional predictor selection and estimation of smooth functional coefficients in a scalar-on-function regression problem under a high-dimensional multivariate functional data setting. In particular, we develop two methods for functional group-sparse regression under a generic Hilbert space of infinite dimension. We show the convergence of the algorithms and the consistency of the estimation and the selection (oracle property) under infinite-dimensional Hilbert spaces. Simulation studies show the effectiveness of the methods in both the selection and the estimation of functional coefficients. Applications to functional magnetic resonance imaging (fMRI) reveal the regions of the human brain related to ADHD and IQ.

preprint · 2021 · arXiv

Signals of quark combination at hadronization in $pp$ collisions at $\sqrt{s}=200$ GeV

We find signals of quark combination at hadronization from the experimental data of $p_{T}$ spectra of hadrons at mid-rapidity in $pp$ collisions at $\sqrt{s}=200$ GeV. The first is the constituent quark number scaling property for $p_{T}$ spectra of $Ω^{-}$ and $ϕ$ and that for $p_{T}$ spectra of $p$ and $ρ^{0}$. The second is that $p_{T}$ spectra of $Λ$, $Ξ^{-}$, and $K^{*0}$ can be self-consistently described using the spectrum of strange quarks from $ϕ$ data and that of up/down quarks from $p$ data in the equal-velocity combination mechanism. The third is that experimental data for the $p_{T}$ spectrum of $D^{*+}$ are also well described using the spectrum of up/down quarks from $p$ data and that of charm quarks from perturbative QCD calculations. These results indicate a similarity between hadron production in $pp$ collisions at $\sqrt{s}=200$ GeV and that at LHC energies. We predict $p_{T}$ spectra of single-charm hadrons and their spectrum ratios. We suggest systematic measurements in $pp$ collisions at $\sqrt{s}=200$ GeV in the future so as to better understand the properties of the small parton system created in $pp$ collisions at different collision energies.

preprint · 2020 · arXiv

Accurate prediction of nanovoid structures and energetics in bcc metals

Knowledge of the structures and energetics of nanovoids is fundamental to understanding defect evolution in metals. Yet there remain no reliable methods able to determine essential structural details or to provide an accurate assessment of energetics for general nanovoids. Here, we performed systematic first-principles investigations to examine stable structures and energetics of nanovoids in bcc metals, explicitly demonstrated that the stable structures can be precisely determined by minimizing their Wigner-Seitz area, and revealed a linear relationship between formation energy and Wigner-Seitz area of nanovoids. We further developed a new physics-based model to accurately predict stable structures and energetics for arbitrary-sized nanovoids. This model was well validated by first-principles calculations and recent nanovoid annealing experiments, and showed distinct advantages over the widely used spherical approximation. The present work offers mechanistic insights that are crucial for understanding nanovoid formation and evolution, being a critical step towards predictive control and prevention of nanovoid-related damage processes in structural metals.

preprint · 2020 · arXiv

Deep-MAPS: Machine Learning based Mobile Air Pollution Sensing

Mobile and ubiquitous sensing of urban air quality has received increased attention as an economically and operationally viable means to survey the atmospheric environment with high spatial-temporal resolution. This paper proposes a machine learning based mobile air pollution sensing framework, called Deep-MAPS, and demonstrates its scientific and financial value in the following aspects. (1) Based on a network of fixed and mobile air quality sensors, we perform spatial inference of PM2.5 concentrations in Beijing (3,025 km$^2$, 19 Jun-16 Jul 2018) for a spatial-temporal resolution of 1 km by 1 km and 1 hour, with over 85% accuracy. (2) We leverage urban big data to generate insights regarding the potential cause of pollution, which facilitates evidence-based sustainable urban management. (3) To achieve such spatial-temporal coverage and accuracy, Deep-MAPS can save up to 90% of hardware investment, compared with ubiquitous sensing that relies primarily on fixed sensors.

preprint · 2020 · arXiv

Hydrogen clustering in bcc metals: atomic origin and strong stress anisotropy

Hydrogen (H) induced damage in metals has been a long-standing woe for many industrial applications. One form of such damage is linked to H clustering, for which the atomic origin remains contended, particularly for non-hydride forming metals. In this work, we systematically studied H clustering behavior in bcc metals represented by W, Fe, Mo, and Cr, combining first-principles calculations with atomistic and Monte Carlo simulations. H clustering has been shown to be energetically favorable, and can be strongly facilitated by an anisotropic stress field, dominated by the tensile component along one of the <001> crystalline directions. We showed that the stress effect can be well predicted by the continuum model based on the H formation volume tensor, and that H clustering is thermodynamically possible at edge dislocations, evidenced by nanohydride formation at rather low levels of H concentration. Moreover, anisotropy in the stress effect is well reflected in nanohydride morphology around dislocations, with nanohydride growth occurring in the form of thin platelet structures that maximize one <001> tension. In particular, the <001> type edge dislocation, with the <001> tensile component maximized, has been shown to be highly effective in facilitating H aggregation, and is thus expected to play an important role in H clustering in bcc metals, in close agreement with recent experimental observations. This work explicitly and quantitatively clarifies the anisotropic nature of the stress effect on H energetics and H clustering behaviors, offering mechanistic insights critical towards understanding H-induced damage in metals.

preprint · 2020 · arXiv

Optimistic Distributionally Robust Policy Optimization

Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO), as widely employed policy-based reinforcement learning (RL) methods, are prone to converge to a sub-optimal solution as they limit the policy representation to a particular parametric distribution class. To address this issue, we develop an innovative Optimistic Distributionally Robust Policy Optimization (ODRPO) algorithm, which effectively utilizes the Optimistic Distributionally Robust Optimization (DRO) approach to solve the trust region constrained optimization problem without parameterizing the policies. Our algorithm improves on TRPO and PPO with a higher sample efficiency and a better performance of the final policy while attaining learning stability. Moreover, it achieves a globally optimal policy update that is not promised in the prevailing policy-based RL algorithms. Experiments across tabular domains and robotic locomotion tasks demonstrate the effectiveness of our approach.

preprint · 2020 · arXiv

Quark number scaling of $p_{T}$ spectra for $Ω$ and $ϕ$ in relativistic heavy-ion collisions

We show that the experimental data of transverse momentum ($p_{T}$) spectra of the $Ω$ baryon and $ϕ$ meson at mid-rapidity in heavy-ion collisions exhibit the constituent quark number scaling in a wide energy range from RHIC to LHC. Such a scaling behavior is a direct consequence of the quark combination mechanism via equal velocity combination and provides a very convenient way to extract the $p_{T}$ spectrum of strange quarks at hadronization. We present the results of strange quarks obtained from the available data and study their properties, in particular the energy dependence of the averaged transverse momentum $\langle p_{T}\rangle$ and the transverse radial flow velocity $\langle β\rangle$, with a hydrodynamics-motivated blast-wave model.

preprint · 2019 · arXiv

Charmed hadron production via equal-velocity quark combination in ultra-relativistic heavy ion collisions

Recent data on the production of $D$ mesons and $Λ_c^+$ baryons in heavy ion collisions at the Relativistic Heavy Ion Collider and the Large Hadron Collider exhibit a number of striking characteristics such as enhanced yield ratios $D_s^+/D^0$, $Λ_c^+/D^0$ and their transverse momentum dependences. In this paper, we derive the momentum dependence of open charm mesons and singly charmed baryons produced in ultra-relativistic heavy ion collisions via the equal-velocity quark combination. We present analytic expressions and numerical results of yield ratios and compare them with the experimental data available. We make predictions for other charmed hadrons.

preprint · 2019 · arXiv

Photonic hooks from Janus microcylinders

Recently, a type of curved light beam, the photonic hook (PH), was theoretically predicted and experimentally observed. The production of a PH is due to the breaking of structural symmetry of a plane-wave illuminated microparticle. Herein, we present and implement a new approach, which utilizes symmetry breaking in the material composition of the microparticles, for the generation of PHs from Janus microcylinders. Numerical simulation based on the finite element method and theoretical analysis based on energy flow diagrams were used to investigate the field distribution characteristics and formation mechanism of the PHs. The full width at half-maximum (FWHM) of the PH (~0.29$λ$) is smaller than the FWHM of the photonic nanojet (~0.35$λ$) formed from a circular microcylinder with the same geometric radius. By changing the refractive index contrasts between upper and lower half-cylinders, or rotating the Janus microcylinder relative to the central axis, the shape profiles of the PHs can be efficiently modulated. The tunability of the PHs through simple stretching or compression operations, for a Janus microcylinder composed of one solid inorganic half-cylinder and one flexible polymer half-cylinder, was studied and discussed as well.

preprint · 2019 · arXiv

Statistical method in quark combination model

We present a new method of solving the probability distribution for baryons, antibaryons and mesons at the hadronization of a constituent quark and antiquark system. The hadronization is governed by the quark combination rule in the quark combination model developed by the Shandong Group. We use the method of the generating function to derive the outcome of the quark combination rule, which is much simpler and easier to generalize than the original method. Furthermore, we use the formula of the quark combination rule and its generalization to study the property of the multiplicity distribution of net-protons. Taking a naive case of quark number fluctuations and correlations at hadronization, we calculate ratios of multiplicity cumulants of final-state net-protons and discuss the potential applicability of the quark combination model in studying hadronic multiplicity fluctuations and the underlying phase transition property in relativistic heavy-ion collisions.

preprint · 2018 · arXiv

New feature of low $p_{T}$ charm quark hadronization in $pp$ collisions at $\sqrt{s}=7$ TeV

Treating the light-flavor constituent quarks and antiquarks that can well describe the data of light-flavor hadrons in $pp$ collisions at $\sqrt{s}=7$ TeV as the underlying source of chromatically neutralizing the charm quarks of low transverse momenta ($p_{T}$), we show that the experimental data of $p_{T}$ spectra of single-charm hadrons $D^{0,+}$, $D^{*+}$, $D_{s}^{+}$, $Λ_{c}^{+}$ and $Ξ_{c}^{0}$ at mid-rapidity in the low $p_{T}$ range ($2\lesssim p_{T}\lesssim7$ GeV/$c$) in $pp$ collisions at $\sqrt{s}=7$ TeV can be well understood by the equal-velocity combination of perturbatively-created charm quarks and those light-flavor constituent quarks and antiquarks. This suggests a possible new scenario of low $p_{T}$ charm quark hadronization, in contrast to the traditional fragmentation mechanism, in $pp$ collisions at LHC energies. This is also another support for the exhibition of the effective constituent quark degrees of freedom for the small parton system created in $pp$ collisions at LHC energies.