Source author record

Xiaohui Hu

Xiaohui Hu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computation and Language hep-ph hep-ex Artificial Intelligence Machine Learning

Catalog footprint

What is connected

7 works

5 topics

4 close collaborators

preprint · 2026 · arXiv

VeRPO: Verifiable Dense Reward Policy Optimization for Code Generation

Effective reward design is a central challenge in Reinforcement Learning (RL) for code generation. Mainstream pass/fail outcome rewards enforce functional correctness via executing unit tests, but the resulting sparsity limits potential performance gains. While recent work has explored external Reward Models (RM) to generate richer, continuous rewards, the learned RMs suffer from reward misalignment and prohibitive computational cost. In this paper, we introduce VeRPO (Verifiable Dense Reward Policy Optimization), a novel RL framework for code generation that synthesizes robust and dense rewards fully grounded in verifiable execution feedback. The core idea of VeRPO is constructing dense rewards from weighted partial success: by dynamically estimating the difficulty weight of each unit test based on the execution statistics during training, a dense reward is derived from the sum of weights of the passed unit tests. To solidify the consistency between partial success and end-to-end functional correctness, VeRPO further integrates the dense signal with global execution outcomes, establishing a robust and dense reward paradigm relying solely on verifiable execution feedback. Extensive experiments across diverse benchmarks and settings demonstrate that VeRPO consistently outperforms outcome-driven and RM-based baselines, achieving up to +8.83% gain in pass@1 with negligible time cost (< 0.02%) and zero GPU memory overhead.
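The weighted-partial-success idea in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not give the weighting rule or the way the dense signal is combined with the global outcome, so the "one minus observed pass rate" weight, the additive `bonus`, and the function names `difficulty_weights` and `dense_reward` are all hypothetical.

```python
from typing import List, Sequence


def difficulty_weights(pass_counts: Sequence[int],
                       attempt_counts: Sequence[int],
                       eps: float = 1e-6) -> List[float]:
    """Estimate one weight per unit test from running execution statistics.

    ASSUMPTION: a test that has been passed less often so far is treated as
    harder and gets a larger weight (1 - pass rate), normalized to sum to 1.
    """
    raw = [1.0 - p / max(a, 1) + eps
           for p, a in zip(pass_counts, attempt_counts)]
    total = sum(raw)
    return [w / total for w in raw]


def dense_reward(passed: Sequence[bool],
                 weights: Sequence[float],
                 bonus: float = 1.0) -> float:
    """Sum the weights of the passed tests, plus an outcome term.

    ASSUMPTION: the global execution outcome enters as an additive bonus
    that is granted only when every unit test passes.
    """
    partial = sum(w for ok, w in zip(passed, weights) if ok)
    all_pass = len(passed) > 0 and all(passed)
    return partial + (bonus if all_pass else 0.0)


# Two tests: the first was passed 9 times out of 10, the second once out of 10.
weights = difficulty_weights([9, 1], [10, 10])
reward_easy_only = dense_reward([True, False], weights)
reward_all = dense_reward([True, True], weights)
```

In this sketch a solution that passes only the easy test earns a small reward, while a fully correct solution earns the whole weight mass plus the outcome bonus, so partial progress is rewarded without outranking end-to-end correctness.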

preprint · 2025 · arXiv

Training Report of TeleChat3-MoE

TeleChat3-MoE is the latest series of TeleChat large language models, featuring a Mixture-of-Experts (MoE) architecture with parameter counts ranging from 105 billion to over one trillion, trained end-to-end on an Ascend NPU cluster. This technical report mainly presents the underlying training infrastructure that enables reliable and efficient scaling to frontier model sizes. We detail systematic methodologies for operator-level and end-to-end numerical accuracy verification, ensuring consistency across hardware platforms and distributed parallelism strategies. Furthermore, we introduce a suite of performance optimizations, including interleaved pipeline scheduling, attention-aware data scheduling for long-sequence training, hierarchical and overlapped communication for expert parallelism, and DVM-based operator fusion. A systematic parallelization framework, leveraging analytical estimation and integer linear programming, is also proposed to optimize multi-dimensional parallelism configurations. Additionally, we present methodological approaches to cluster-level optimizations, addressing host- and device-bound bottlenecks during large-scale training tasks. These infrastructure advancements yield significant throughput improvements and near-linear scaling on clusters comprising thousands of devices, providing a robust foundation for large-scale language model development on hardware ecosystems.

preprint · 2022 · arXiv

MME-CRS: Multi-Metric Evaluation Based on Correlation Re-Scaling for Evaluating Open-Domain Dialogue

Automatic open-domain dialogue evaluation is a crucial component of dialogue systems. Recently, learning-based evaluation metrics have achieved state-of-the-art performance in open-domain dialogue evaluation. However, these metrics focus on only a few qualities and therefore struggle to evaluate dialogue comprehensively. Furthermore, they lack an effective score composition approach for diverse evaluation qualities. To address these problems, we propose a Multi-Metric Evaluation based on Correlation Re-Scaling (MME-CRS) for evaluating open-domain dialogue. First, we build an evaluation metric composed of 5 groups of parallel sub-metrics, called Multi-Metric Evaluation (MME), to evaluate the quality of dialogue comprehensively. Second, we propose a novel score composition method called Correlation Re-Scaling (CRS) to model the relationship between sub-metrics and diverse qualities. Our approach MME-CRS ranked first on the final test data of the DSTC10 track5 subtask1 Automatic Open-domain Dialogue Evaluation Challenge by a large margin, demonstrating the effectiveness of the proposed approach.

preprint · 2022 · arXiv

Supporting Medical Relation Extraction via Causality-Pruned Semantic Dependency Forest

The Medical Relation Extraction (MRE) task aims to extract relations between entities in medical texts. Traditional relation extraction methods achieve impressive success by exploiting syntactic information, e.g., the dependency tree. However, the quality of the 1-best dependency tree produced for medical texts by an out-of-domain parser is relatively limited, so the performance of medical relation extraction methods may degrade. To this end, we propose a method to jointly model semantic and syntactic information from medical texts based on causal explanation theory. We generate dependency forests consisting of the semantic-embedded 1-best dependency tree. Then, a task-specific causal explainer is adopted to prune the dependency forests, which are then fed into a purpose-designed graph convolutional network to learn the corresponding representation for the downstream task. Empirically, comparisons on benchmark medical datasets demonstrate the effectiveness of our model.

preprint · 2016 · arXiv

Study of nonleptonic $B_{q}^{\ast}$ ${\to}$ $D_{q}V$ and $P_{q} D^*$ weak decays

Motivated by the powerful measurement capabilities for rare decays of $b$-flavored hadrons at the LHC and SuperKEKB/Belle-II, the nonleptonic $\bar{B}^{\ast}$ ${\to}$ $D\bar{D}^{\ast}$, $D{ρ^-}$, $DK^{\ast-}$, $πD^{\ast}$ and $KD^{\ast}$ weak decays are studied in detail. With the amplitudes calculated in the factorization approach and the form factors for the $B^{\ast}$ transition into a pseudoscalar meson evaluated with the BSW model, branching fractions and polarization fractions are presented for the first time. Numerically, the CKM-favored $\bar{B}_{q}^{\ast}$ ${\to}$ $D_{q}D_{s}^{{\ast}-}$ and $D_{q}ρ^{-}$ decays have large branching fractions, $\sim$ $10^{-8}$, which should be searched for with priority and could be the first to be observed by the LHC and Belle-II experiments. The $\bar{B}^{\ast}_q$ ${\to}$ $D_qK^{\ast}$ and $D_qρ$ decays are dominated by the longitudinal polarization states. By contrast, the parallel polarization fractions of $\bar{B}^{\ast}_q$ ${\to}$ $D_q\bar{D}^{\ast}$ decays are comparable with the longitudinal ones; numerically, $f_{\parallel}$ $+$ $f_{L}$ ${\simeq}$ 95% and $f_{L}:f_{\parallel}$ $\simeq$ $5:4$. Some comparisons between $\bar{B}^{*0}_q$ $\to$ $D_q V$ and their corresponding $\bar{B}^{0}_q$ $\to$ $D^*_q V$ decays are performed, and the relation $ f_{L,\parallel}(\bar{B}^{\ast 0}\to D V)\simeq f_{L,\parallel}(\bar{B}^0\to D^{\ast +} V^-) $ is presented. In addition, based on the implications of $SU(3)$ flavor symmetry, some useful ratios $ R_{\rm du}$ and $ R_{\rm ds}$ are discussed in detail and suggested for experimental verification.
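As a worked illustration of the polarization figures quoted in this abstract, and assuming the usual normalization $f_{L} + f_{\parallel} + f_{\perp} = 1$ (a standard convention that the abstract does not state explicitly), the two quoted relations fix all three fractions approximately:

$$
f_{L} \simeq \tfrac{5}{9} \times 0.95 \simeq 0.53, \qquad
f_{\parallel} \simeq \tfrac{4}{9} \times 0.95 \simeq 0.42, \qquad
f_{\perp} \simeq 1 - 0.95 = 0.05 .
$$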

preprint · 2015 · arXiv

Constraints on hard spectator scattering and annihilation corrections in $B_{u,d}$ ${\to}$ $PV$ decays within QCD factorization

In this paper, we investigate the contributions of hard spectator scattering and annihilation in $B\to PV$ decays within the QCD factorization framework. With available experimental data on $B\to πK^{\ast}$, $ρK$, $πρ$ and $Kϕ$ decays, comprehensive $χ^2$ analyses of the parameters $X_{A,H}^{i,f}(ρ_{A,H}^{i,f},ϕ_{A,H}^{i,f})$ are performed, where $X_A^f$ ($X_A^i$) and $X_H$ are used to parameterize the endpoint divergences of the (non)factorizable annihilation and hard spectator scattering amplitudes, respectively. Based on the $χ^2$ analyses, it is observed that (1) the topology-dependent parameterization scheme is feasible for $B\to PV$ decays; (2) at the current accuracy of experimental measurements and theoretical evaluations, $X_H=X_A^i$ is allowed by $B\to PV$ decays, but $X_{H}\neq X_A^f$ at $68\%$ C.L.; (3) with the simplification $X_H=X_A^i$, the parameters $X_A^f$ and $X_A^i$ should be treated individually. These findings are very similar to those obtained from $B\to PP$ decays. Numerically, for $B\to PV$ decays, we obtain $(ρ_{A,H}^i,ϕ_{A,H}^i[^{\circ}]) =(2.87^{+0.66}_{-1.95}, -145^{+14}_{-21})$ and $(ρ_A^f,ϕ_A^f[^{\circ}]) = (0.91^{+0.12}_{-0.13}, -37^{+10}_{-9})$ at $68\%$ C.L. With the best-fit values, most of the theoretical results are in good agreement with the experimental data within errors. However, significant corrections to the color-suppressed tree amplitude $α_2$ related to a large $ρ_H$ result in the wrong sign for $A^{dir}_{CP}(B^- \to π^0 K^{{\ast}-})$ compared with the most recent BABAR data, which presents a new obstacle in solving the "$ππ$" and "$πK$" puzzles through $α_2$. A crosscheck with measurements at Belle (or Belle II) and LHCb, which offer higher precision, is urgently needed to confirm or refute this possible mismatch.
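For orientation, the endpoint parameters in the QCD factorization literature are conventionally written in the form below. This is offered as an assumption about the convention in use, since the abstract itself does not spell it out; $Λ_h$ denotes a hadronic scale, commonly taken to be about 0.5 GeV.

$$
X_{A,H} = \left(1 + ρ_{A,H}\, e^{iϕ_{A,H}}\right) \ln\frac{m_B}{Λ_h}
$$

In this form, $ρ$ sets the size and $ϕ$ the strong phase of the correction, which is why the fit results are quoted as $(ρ, ϕ)$ pairs.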

preprint · 2015 · arXiv

Probing Spectator Scattering and Annihilation Corrections in $B_{s}$ $\to$ $PV$ Decays

Motivated by the recent LHCb measurements of the $\bar{B}_{s}$ $\to$ $π^{-}K^{*+}$ and $\bar{B}_{s}$ $\to$ $K^{\pm}K^{*\mp}$ decay modes, we revisit the $B_{s}$ $\to$ $PV$ decays within the QCD factorization framework. The effects of hard-spectator scattering and annihilation corrections are studied in detail. After performing a $χ^2$-fit on the end-point parameters $X_A^{i,f}$ ($ρ_A^{i,f}$, $ϕ_A^{i,f}$) and $X_H$ ($ρ_H$, $ϕ_H$) with available data, it is found that although some possible mismatches exist, the universalities of $X_A^{i,f}$ and $X_H$ in $B_s$ and $B_{u,d}$ systems are still allowed within theoretical uncertainties and experimental errors. With the end-point parameters obtained from $B_{u,d}$ $\to$ $PV$ decays, the numerical results and detailed analyses for the observables of the $\bar{B}_{s}$ ${\to}$ $πK^{\ast}$, $ρK$, $πρ$, $πϕ$ and $Kϕ$ decay modes are presented. In addition, we identify a few useful observables, in particular those of the $\bar{B}_{s}$ $\to$ $π^{0}ϕ$ decay, for probing hard-spectator scattering and annihilation contributions.
