Researcher profile

Tao Zhong

Tao Zhong contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) | Verification L1 | Unclaimed author
11 works
0 followers
7 topics
4 close collaborators

Published work

11 published items

Preprint | 2026 | arXiv

Collab-Solver: Collaborative Solving Policy Learning for Mixed-Integer Linear Programming

Mixed-integer linear programming (MILP) has been a fundamental problem in combinatorial optimization. Conventional MILP solving mainly relies on carefully designed heuristics embedded in the branch-and-bound framework. Driven by the strong capabilities of neural networks, recent research is exploring the value of machine learning alongside conventional MILP solving. Although learning-based MILP methods have shown great promise, existing works typically learn policies for individual modules in MILP solvers in isolation, without considering their interdependence, which limits both solving efficiency and solution quality. To address this limitation, we propose Collab-Solver, a novel multi-agent-based policy learning framework for MILP that enables collaborative policy optimization for multiple modules. Specifically, we formulate the collaboration between cut selection and branching in MILP solving as a Stackelberg game. Under this formulation, we develop a two-phase learning paradigm to stabilize collaborative policy learning: the first phase performs data-communicated policy pretraining, and the second phase further orchestrates the policy learning for various modules. Extensive experiments on both synthetic and large-scale real-world MILP datasets demonstrate that the jointly learned policies significantly improve solving performance. Moreover, the policies learned by Collab-Solver have also demonstrated excellent generalization abilities across different instance sets.
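
As a hedged illustration of the Stackelberg formulation mentioned in this abstract, the sketch below solves a tiny leader-follower game by backward induction: the leader commits to an action, the follower best-responds, and the leader picks the action whose anticipated response is best for it. The payoff tables and the function name `stackelberg_equilibrium` are hypothetical, and the abstract does not say which module (cut selection or branching) acts as leader, so the roles are left generic. This is not the Collab-Solver learning procedure, only the game-theoretic idea behind it.

```python
def stackelberg_equilibrium(leader_payoff, follower_payoff):
    """Return (leader_action, follower_action) found by backward induction.

    leader_payoff[a][b] and follower_payoff[a][b] are the payoffs when the
    leader plays a and the follower plays b. Ties go to the lowest index.
    """
    best = None
    for a in range(len(leader_payoff)):
        # The follower observes a and picks its own best response.
        responses = range(len(follower_payoff[a]))
        b = max(responses, key=lambda j: follower_payoff[a][j])
        # The leader keeps the action whose anticipated outcome is best.
        if best is None or leader_payoff[a][b] > leader_payoff[best[0]][best[1]]:
            best = (a, b)
    return best


# Hypothetical 2x2 payoffs, chosen only for illustration.
LEADER = [[3, 1], [2, 4]]
FOLLOWER = [[1, 2], [3, 0]]
equilibrium = stackelberg_equilibrium(LEADER, FOLLOWER)
print(equilibrium)
```

In this toy game the leader cannot reach its largest payoffs (4 or 3), because the follower would never choose the matching response; anticipating the follower is what distinguishes the Stackelberg solution from optimizing each player in isolation.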

Preprint | 2026 | arXiv

HodgeCover: Higher-Order Topological Coverage Drives Compression of Sparse Mixture-of-Experts

Sparse Mixture-of-Experts (MoE) layers route tokens through a handful of experts, and learning-free compression of these layers reduces inference cost without retraining. A subtle obstruction blocks every existing compressor in this family: three experts can each be pairwise compatible yet form an irreducible cycle when merged together, so any score that ranks experts on pairwise signals is structurally blind to which triples are jointly mergeable. We show the obstruction is a precise mathematical object, the harmonic kernel of the simplicial Laplacian on a 2-complex whose vertices are experts, whose edges carry KL merge barriers, and whose faces carry triplet barriers; Hodge-decomposing the edge-barrier signal isolates the kernel exactly. We turn the diagnostic into a selection objective: HodgeCover greedily covers the harmonic-critical edges and triplet-critical triangles, and a hybrid variant of HodgeCover pairs it with off-the-shelf weight pruning on survivors. On three open-weight Sparse MoE backbones under aggressive expert reduction, HodgeCover matches state-of-the-art learning-free baselines on the expert-reduction axis, leads on the aggressive-compression frontier of the hybrid axis, and uniquely balances retained mass across all four Hodge components. These results show that exposing the harmonic kernel of a learned MoE structure changes which compressor wins at the regime that matters most.
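
The pairwise-versus-triple obstruction described in this abstract can be seen on the smallest possible complex. The sketch below is a minimal, assumption-laden example rather than the HodgeCover algorithm: it takes a single triangle with three oriented edges and splits an edge signal into gradient, curl, and harmonic parts. The edge values are hypothetical, and the names `hodge_split` and `face_filled` are introduced only for illustration. When the triangular face is present, the cyclic part of the signal is a curl component; when the face is absent, the same part becomes harmonic, which is the component that no vertex potential and no face can explain.

```python
# Oriented edges of a triangle on vertices 0, 1, 2: (0->1), (1->2), (0->2).
# Walking 0 -> 1 -> 2 -> 0 traverses the third edge backwards, hence the -1.
CYCLE = [1.0, 1.0, -1.0]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hodge_split(edge_signal, face_filled):
    """Split an edge signal on one triangle into (gradient, curl, harmonic).

    On this complex the gradient space is the orthogonal complement of CYCLE,
    so the cyclic part is the projection of the signal onto CYCLE.
    """
    coeff = dot(edge_signal, CYCLE) / dot(CYCLE, CYCLE)
    cyclic = [coeff * c for c in CYCLE]
    gradient = [f - c for f, c in zip(edge_signal, cyclic)]
    zero = [0.0, 0.0, 0.0]
    if face_filled:
        # The face's boundary is exactly CYCLE, so the cyclic part is curl.
        return gradient, cyclic, zero
    # With no face, nothing bounds the cycle: the cyclic part is harmonic.
    return gradient, zero, cyclic


signal = [1.0, 1.0, 1.0]  # hypothetical edge values
grad_hollow, curl_hollow, harm_hollow = hodge_split(signal, face_filled=False)
grad_filled, curl_filled, harm_filled = hodge_split(signal, face_filled=True)
```

The gradient part is identical in both cases; only the label on the cyclic remainder changes, which is why a score built from edge-level signals alone cannot tell the two situations apart.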

Preprint | 2026 | arXiv

Neural Fields for NV-Center Inverse Sensing

Inverse problems in scientific sensing are often solved with either hand-designed regularizers or supervised networks trained on simulated labels, yet both can fail when the forward model is nonlinear, spectrally coupled, and physically delicate. We study this issue for noise sensing based on nitrogen-vacancy (NV) centers in diamond, where a quantum sensor measures magnetic-noise spectra generated by sparse spin sources. We show that replacing a common scalar/coherent forward approximation with a tensor power-summed dipolar operator changes the inverse landscape and exposes a center-collapse failure mode in free-density optimization. We propose NeTMY, an amortization-free coordinate neural field coupled to the differentiable NV forward model, with annealed positional encoding, multiscale optimization, sparsity/gating, and spectrum-fidelity losses. Across sparse synthetic reconstructions generated by the corrected operator, NeTMY achieves the best localization and distributional metrics in the tested benchmark. Mechanism experiments show that NeTMY does not directly execute the raw density-space gradient; its parameterization smooths and redistributes updates, mitigating the center-collapse pathology. These results position NV quantum sensing as a useful testbed for physics-faithful neural inverse problems.

Preprint | 2026 | arXiv

Topology-Preserving Neural Operator Learning via Hodge Decomposition

In this paper, we study solution operators of physical field equations on geometric meshes from a function-space perspective. We reveal that Hodge orthogonality fundamentally resolves spectral interference by isolating unlearnable topological degrees of freedom from learnable geometric dynamics, enabling an additive approximation confined to structure-preserving subspaces. Building on Hodge theory and operator splitting, we derive a principled operator-level decomposition. The result is a Hybrid Eulerian-Lagrangian architecture with an algebraic-level inductive bias we call Hodge Spectral Duality (HSD). In our framework, we use discrete differential forms to capture topology-dominated components and an orthogonal auxiliary ambient space to represent complex local dynamics. Our method achieves superior accuracy and efficiency on geometric graphs with enhanced fidelity to physical invariants. Our code is available at https://github.com/ContinuumCoder/Hodge-Spectral-Duality

Preprint | 2024 | arXiv

Investigation for $D^+ \to π^+ ν\barν$ decay process within QCDSR approach

In this paper, we investigate the rare charmed-meson decay $D^+ \to π^+ν\barν$ using the QCD sum rules approach. First, the pion twist-2 and twist-3 distribution amplitude (DA) $ξ$-moments, $\langleξ_{2;π}^n\rangle|_μ$ up to 10th order and $\langle ξ_{3;π}^{(p,σ),n}\rangle|_μ$ up to fourth order, are calculated using QCD sum rules within the background field theory. After constructing light-cone harmonic oscillator models for the pion twist-2 and twist-3 DAs, we determine their behavior by matching to the calculated $ξ$-moments. Then, the $D\to π$ transition form factors (TFFs) are calculated using the QCD light-cone sum rules approach. The vector form factor at the large recoil point is $f_+^{D\toπ}(0) = 0.627^{+0.120}_{-0.080}$. Using the rapidly converging simplified series expansion in $z(q^2,t)$, we present the TFFs and the corresponding angular coefficients over the whole physical region of squared momentum transfer. Furthermore, we present the differential decay widths and branching fraction of the semileptonic decay $\bar D^0 \to π^+ e\bar ν_e$, with ${\cal B}(\bar D^0\toπ^+e\barν_e) = 0.308^{+0.155}_{-0.066} \times 10^{-2}$. The differential and total predictions for the forward-backward asymmetry, the $q^2$-differential flat terms, and the lepton polarization asymmetry of $\bar D^0\toπ^+e\barν_e$ are also given. After considering non-standard neutrino interactions, the predicted branching fraction for $D^+ \to π^+ ν\barν$ is ${\cal B}(D^+ \to π^+ {ν}{\barν}) = 1.85^{+0.93}_{-0.46}\times10^{-8}$.

Preprint | 2023 | arXiv

Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts

In this paper, we tackle the problem of domain shift. Most existing methods train a single model on multiple source domains and use the same trained model on all unseen target domains. Such solutions are sub-optimal because each target domain exhibits its own specialty, to which the model is not adapted. Furthermore, expecting single-model training to learn extensive knowledge from multiple source domains is counterintuitive. The model is more biased toward learning only domain-invariant features, which may result in negative knowledge transfer. In this work, we propose a novel framework for unsupervised test-time adaptation, which is formulated as a knowledge distillation process to address domain shift. Specifically, we incorporate Mixture-of-Experts (MoE) as teachers, where each expert is trained separately on a different source domain to maximize its specialty. Given a test-time target domain, a small set of unlabeled data is sampled to query the knowledge from the MoE. As the source domains are correlated with the target domains, a transformer-based aggregator then combines the domain knowledge by examining the interconnections among them. The output is treated as a supervision signal to adapt a student prediction network toward the target domain. We further employ meta-learning to train the aggregator to distill positive knowledge and the student network to adapt quickly. Extensive experiments demonstrate that the proposed method outperforms the state-of-the-art and validate the effectiveness of each proposed component. Our code is available at https://github.com/n3il666/Meta-DMoE.

Preprint | 2023 | arXiv

Properties of the $η_q$ leading-twist distribution amplitude and its effects to the $B/D^+ \toη^{(\prime)}\ell^+ ν_\ell$ decays

The $η^{(\prime)}$-mesons in the quark-flavor basis are mixtures of two mesonic states $|η_{q}\rangle=|\bar u u+\bar d d\rangle/\sqrt 2$ and $|η_{s}\rangle=|\bar s s\rangle$. In previous work, we made a detailed study of the $η_{s}$ leading-twist distribution amplitude. As a follow-up, in the present paper we fix the $η_q$ leading-twist distribution amplitude by using the light-cone harmonic oscillator model for its wave function and by using the QCD sum rules within the QCD background field to calculate its moments. The input parameters of the $η_q$ leading-twist distribution amplitude $ϕ_{2;η_q}$ at an initial scale $μ_0\sim 1$ GeV are then fixed by using those moments. The sum rules for the $0_{\rm th}$-order moment can also be used to fix the magnitude of the $η_q$ decay constant, which gives $f_{η_q}=0.141\pm0.005$ GeV. As an application of the derived $ϕ_{2;η_q}$, we calculate the transition form factors $B(D)^+ \toη^{(\prime)}$ by using the QCD light-cone sum rules up to twist-4 accuracy and by including the next-to-leading order QCD corrections to the twist-2 part, and then fix the related CKM matrix element and the decay width for the semi-leptonic decays $B(D)^+ \toη^{(\prime)}\ell^+ ν_\ell$.

Preprint | 2022 | arXiv

$a_1(1260)$-meson longitudinal twist-2 distribution amplitude and the $D\to a_1(1260)\ell^+ν_\ell$ decay processes

In the paper, we investigate the moments $\langleξ_{2;a_1}^{\|;n}\rangle$ of the axial-vector $a_1(1260)$-meson distribution amplitude by using the QCD sum rules approach under the background field theory. By considering the vacuum condensates up to dimension-six and the perturbative part up to next-to-leading order QCD corrections, its first five moments at an initial scale $μ_0=1~{\rm GeV}$ are $\langleξ_{2;a_1}^{\|;2}\rangle|_{μ_0} = 0.223 \pm 0.029$, $\langleξ_{2;a_1}^{\|;4}\rangle|_{μ_0} = 0.098 \pm 0.008$, $\langleξ_{2;a_1}^{\|;6}\rangle|_{μ_0} = 0.056 \pm 0.006$, $\langleξ_{2;a_1}^{\|;8}\rangle|_{μ_0} = 0.039 \pm 0.004$ and $\langleξ_{2;a_1}^{\|;10}\rangle|_{μ_0} = 0.028 \pm 0.003$, respectively. We then construct a light-cone harmonic oscillator model for $a_1(1260)$-meson longitudinal twist-2 distribution amplitude $ϕ_{2;a_1}^{\|}(x,μ)$, whose model parameters are fitted by using the least squares method. As an application of $ϕ_{2;a_1}^{\|}(x,μ)$, we calculate the transition form factors (TFFs) of $D\to a_1(1260)$ in large and intermediate momentum transfers by using the QCD light-cone sum rules approach. At the largest recoil point ($q^2=0$), we obtain $ A(0) = 0.130_{ - 0.013}^{ + 0.015}$, $V_1(0) = 1.898_{-0.121}^{+0.128}$, $V_2(0) = 0.228_{-0.021}^{ + 0.020}$, and $V_0(0) = 0.217_{ - 0.025}^{ + 0.023}$. By applying the extrapolated TFFs to the semi-leptonic decay $D^{0(+)} \to a_1^{-(0)}(1260)\ell^+ν_\ell$, we obtain ${\cal B}(D^0\to a_1^-(1260) e^+ν_e) = (5.261_{-0.639}^{+0.745}) \times 10^{-5}$, ${\cal B}(D^+\to a_1^0(1260) e^+ν_e) = (6.673_{-0.811}^{+0.947}) \times 10^{-5}$, ${\cal B}(D^0\to a_1^-(1260) μ^+ ν_μ)=(4.732_{-0.590}^{+0.685}) \times 10^{-5}$, ${\cal B}(D^+ \to a_1^0(1260) μ^+ ν_μ)=(6.002_{-0.748}^{+0.796}) \times 10^{-5}$.

Preprint | 2022 | arXiv

Investigating the ratio of CKM matrix elements $|V_{ub}|/|V_{cb}|$ from semileptonic decay $B_s^0\to K^-μ^+ν_μ$ and kaon twist-2 distribution amplitude

In this paper, we calculate the ratio of Cabibbo-Kobayashi-Maskawa matrix elements, $|V_{ub}|/|V_{cb}|$, based on the semileptonic decay $B_s^0\to K^-μ^+ν_μ$. Its key component, the $B_s\to K$ transition form factor $f^{B_s\to K}_+(q^2)$, is studied within the QCD light-cone sum rules approach by using a chiral correlator. The derived $f^{B_s\to K}_+(q^2)$ is dominated by the leading-twist part, and to improve its precision, we construct a new model for the kaon leading-twist distribution amplitude $ϕ_{2;K}(x,μ)$, whose parameters are fixed by a least squares fit to the moments calculated using the QCD sum rules within the background field theory. The first four moments at the initial scale $μ_0 = 1~{\rm GeV}$ are $\langle ξ^1\rangle _{2;K} = -0.0438^{+0.0053}_{-0.0075}$, $\langle ξ^2\rangle _{2;K} = 0.262 \pm 0.010$, $\langle ξ^3\rangle _{2;K} = -0.0210^{+0.0024}_{-0.0035}$ and $\langle ξ^4\rangle _{2;K} = 0.132 \pm 0.006$, respectively. Their corresponding Gegenbauer moments are $a^{2;K}_1 = -0.0731^{+0.0089}_{-0.0124}$, $a^{2;K}_2 = 0.182^{+0.029}_{-0.030}$, $a^{2;K}_3 = -0.0114^{+0.0008}_{-0.0016}$ and $a^{2;K}_4 = 0.041^{-0.003}_{+0.005}$, respectively. At the large recoil point, we obtain $f^{B_s\to K} _+ (0) = 0.270^{+0.022}_{-0.030}$. By extrapolating $f^{B_s\to K}_+(q^2)$ to the whole physically allowed region, we obtain a $|V_{ub}|$-independent decay width for the semileptonic decay $B_s^0\to K^-μ^+ν_μ$, $5.626^{+1.271}_{-0.864} \times 10^{-12}\ {\rm GeV}$, which then leads to $|V_{ub}|/|V_{cb}| = 0.072\pm0.005$.
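
The link between the $ξ$-moments and the Gegenbauer moments quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes the standard conformal expansion $ϕ(x) = 6x(1-x)[1 + \sum_n a_n C_n^{3/2}(ξ)]$ with $ξ = 2x-1$, which gives $\langle ξ^1\rangle = \tfrac{3}{5}a_1$, $\langle ξ^2\rangle = \tfrac{1}{5} + \tfrac{12}{35}a_2$ and $\langle ξ^3\rangle = \tfrac{9}{35}a_1 + \tfrac{4}{21}a_3$. That convention is an assumption here, not something the abstract states, and the function name is hypothetical. Only central values are used; uncertainties are not propagated.

```python
def gegenbauer_from_xi_moments(xi1, xi2, xi3):
    """Invert the three lowest xi-moments for the Gegenbauer moments a1, a2, a3.

    Assumes phi(x) = 6x(1-x)[1 + sum_n a_n C_n^{3/2}(xi)], xi = 2x - 1, so that
      <xi^1> = (3/5) a1
      <xi^2> = 1/5 + (12/35) a2
      <xi^3> = (9/35) a1 + (4/21) a3
    """
    a1 = (5.0 / 3.0) * xi1
    a2 = (35.0 / 12.0) * (xi2 - 1.0 / 5.0)
    a3 = (21.0 / 4.0) * (xi3 - (9.0 / 35.0) * a1)
    return a1, a2, a3


# Central values of the kaon xi-moments quoted in the abstract above.
a1, a2, a3 = gegenbauer_from_xi_moments(-0.0438, 0.262, -0.0210)
print(f"a1 = {a1:.4f}, a2 = {a2:.4f}, a3 = {a3:.4f}")
```

Under these assumptions the results land within about 0.002 of the quoted central values for $a^{2;K}_1$, $a^{2;K}_2$ and $a^{2;K}_3$; the residual differences are at the level expected from rounding of the input moments.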

Preprint | 2022 | arXiv

The ratio $\mathcal{R}(D_s)$ for $B_s \to D_s \ellν_\ell$ by using the QCD light-cone sum rules within the framework of heavy quark effective field theory

In this paper, we study the $B_s\to D_s$ transition form factors by using the light-cone sum rules within the framework of heavy quark effective field theory. We adopt a chiral current correlation function for the calculation, so that the resultant transition form factors $f_+^{B_s\to D_s}(q^2)$ and $f_0^{B_s\to D_s}(q^2)$ are dominated by the contribution of the $D_s$-meson leading-twist distribution amplitude, while the contributions from the less certain $D_s$-meson twist-3 distribution amplitudes are greatly suppressed. At the largest recoil point, we obtain $f_{+,0}^{B_s \to D_s}(0)=0.533^{+0.112}_{-0.094}$. By further extrapolating the transition form factors to the whole physically allowed $q^2$ region with the help of the $z$-series parametrization approach, we calculate the branching fractions $\mathcal{B}(B_s \to D_s \ell^\prime ν_{\ell^\prime})$ with $(\ell^\prime= e,μ)$ and $\mathcal{B}(B_s \to D_s τν_τ)$, which gives $\mathcal{R}(D_s)=0.334\pm 0.017$.

Preprint | 2021 | arXiv

$η^{(\prime)}$-meson twist-2 distribution amplitude within QCD sum rule approach and its application to the semi-leptonic decay $ D_s^+ \toη^{(\prime)}\ell^+ ν_\ell$

In this paper, we present a detailed discussion of the $η$ and $η'$-meson leading-twist light-cone distribution amplitude $ϕ_{2;η^{(\prime)}}(u,μ)$ by using the QCD sum rules approach under the background field theory. Taking both the non-perturbative condensates up to dimension-six and NLO QCD corrections to the perturbative part, its first three moments $\langleξ^n_{2;η^{(\prime)}}\rangle|_{μ_0} $ with $n = (2,4,6)$ at the initial scale $μ_0 = 1$ GeV can be determined, e.g., $\langleξ_{2;η}^2\rangle|_{μ_0} =0.231_{-0.013}^{+0.010}$, $\langleξ_{2;η}^4 \rangle|_{μ_0} =0.109_{-0.007}^{+0.007}$, and $\langleξ_{2;η}^6 \rangle|_{μ_0} =0.066_{-0.006}^{+0.006}$ for the $η$-meson, and $\langleξ_{2;η'}^2\rangle|_{μ_0} =0.211_{-0.017}^{+0.015}$, $\langleξ_{2;η'}^4 \rangle|_{μ_0} =0.093_{-0.009}^{+0.009}$, and $\langleξ_{2;η'}^6 \rangle|_{μ_0} =0.054_{-0.008}^{+0.008}$ for the $η'$-meson. Next, we calculate the $D_s\toη^{(\prime)}$ TFFs $f^{η^{(\prime)}}_+(q^2)$ within the QCD light-cone sum rules approach up to NLO level. The values at the large recoil point are $f^η_+(0) = 0.476_{-0.036}^{+0.040}$ and $f^{η'}_+(0) = 0.544_{-0.042}^{+0.046}$. After extrapolating the TFFs to the allowable physical regions within the series expansion, we obtain the branching fractions of the semi-leptonic decay $D_s^+\toη^{(\prime)}\ell^+ ν_\ell$, i.e. ${\cal B}(D_s^+\toη^{(\prime)} e^+ν_e)=2.346_{-0.331}^{+0.418}(0.792_{-0.118}^{+0.141})\times10^{-2}$ and ${\cal B}(D_s^+\toη^{(\prime)} μ^+ν_μ)=2.320_{-0.327}^{+0.413}(0.773_{-0.115}^{+0.138})\times10^{-2}$ for the $\ell = (e, μ)$ channels, respectively. In addition, the $η-η'$ mixing angle $φ$ and the ratio of the different decay channels ${\cal R}_{η'/η}^\ell$ are given, which show good agreement with the recent BESIII measurements.