Researcher profile

Di Zhang

Di Zhang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
20 works
0 followers
11 topics
4 close collaborators

Published work

20 published items

preprint | 2026 | arXiv

Steering Visual Generation in Unified Multimodal Models with Understanding Supervision

Unified multimodal models are envisioned to bridge the gap between understanding and generation. Yet, to achieve competitive performance, state-of-the-art models adopt largely decoupled understanding and generation components. This design, while effective for individual tasks, weakens the connection required for mutual enhancement, leaving the potential synergy empirically uncertain. We propose to explicitly restore this synergy by introducing Understanding-Oriented Post-Training (UNO), a lightweight framework that treats understanding not only as a distinct task, but also as a direct supervisory signal to steer generative representations. By incorporating objectives that encode semantic abstraction (captioning) and structural details (visual regression), we enable effective gradient flow from understanding to generation. Extensive experiments on image generation and editing demonstrate that understanding can serve as an effective catalyst for generation.

preprint | 2023 | arXiv

ClusterLog: Clustering Logs for Effective Log-based Anomaly Detection

With the increasing prevalence of scalable file systems in the context of High Performance Computing (HPC), the importance of accurate anomaly detection on runtime logs is increasing. But as it currently stands, many state-of-the-art methods for log-based anomaly detection, such as DeepLog, have encountered numerous challenges when applied to logs from many parallel file systems (PFSes), often due to their irregularity and ambiguity in time-based log sequences. To circumvent these problems, this study proposes ClusterLog, a log pre-processing method that clusters the temporal sequence of log keys based on their semantic similarity. By grouping semantically and sentimentally similar logs, this approach aims to represent log sequences with the smallest number of unique log keys, intending to improve the ability of a downstream sequence-based model to effectively learn the log patterns. The preliminary results of ClusterLog indicate not only its effectiveness in reducing the granularity of log sequences without the loss of important sequence information but also its generalizability to different file systems' logs.
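The pre-processing idea in this abstract (group similar log keys, then rewrite the sequence with cluster IDs so it has fewer unique symbols) can be sketched as follows. This is only an illustration, not the ClusterLog implementation: the abstract describes semantic and sentiment-based similarity, whereas this sketch substitutes a simple token-overlap (Jaccard) score, and the function names, threshold and sample log keys are all hypothetical.

```python
def tokens(key):
    """Lower-cased word set of a log key (a crude stand-in for a semantic embedding)."""
    return set(key.lower().split())


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_keys(keys, threshold=0.5):
    """Greedily assign each key to the first cluster whose representative is similar enough."""
    representatives = []  # token set of the first member of each cluster
    assignment = {}
    for key in keys:
        key_tokens = tokens(key)
        for cluster_id, rep in enumerate(representatives):
            if jaccard(key_tokens, rep) >= threshold:
                assignment[key] = cluster_id
                break
        else:
            representatives.append(key_tokens)
            assignment[key] = len(representatives) - 1
    return assignment


def compress_sequence(sequence, assignment):
    """Rewrite a temporal sequence of log keys as a sequence of cluster IDs."""
    return [assignment[key] for key in sequence]


# Hypothetical log keys, made up for illustration only.
sequence = [
    "failed to open file",
    "write completed successfully",
    "failed to open directory",
]
assignment = cluster_keys(sequence)
compressed = compress_sequence(sequence, assignment)
```

Here the two "failed to open" keys fall into one cluster, so a downstream sequence model would see two unique symbols instead of three.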

preprint | 2023 | arXiv

High-throughput combinatorial approach expedites the synthesis of a lead-free relaxor ferroelectric system

Developing novel lead-free ferroelectric materials is crucial for next-generation microelectronic technologies that are energy efficient and environmentally friendly. However, materials discovery and property optimization are typically time-consuming due to the limited throughput of traditional synthesis methods. In this work, we use a high-throughput combinatorial synthesis approach to fabricate lead-free ferroelectric superlattices and solid solutions of (Ba0.7Ca0.3)TiO3 (BCT) and Ba(Zr0.2Ti0.8)O3 (BZT) phases with continuous variation of composition and layer thickness. High-resolution X-ray diffraction (XRD) and analytical scanning transmission electron microscopy (STEM) demonstrate high film quality and well-controlled compositional gradients. Ferroelectric and dielectric property measurements identify the optimal property point achieved at the morphotropic phase boundary (MPB) with a composition of 48BZT-52BCT. Displacement vector maps reveal that ferroelectric domain sizes are tunable by varying {BCT-BZT}N superlattice geometry. This high-throughput synthesis approach can be applied to many other material systems to expedite new materials discovery and property optimization, allowing for the exploration of a large area of phase space within a single growth.

preprint | 2022 | arXiv

AMCAD: Adaptive Mixed-Curvature Representation based Advertisement Retrieval System

Graph embedding based retrieval has become one of the most popular techniques in the information retrieval community and search engine industry. The classical paradigm mainly relies on the flat Euclidean geometry. In recent years, hyperbolic (negative curvature) and spherical (positive curvature) representation methods have shown their superiority in capturing hierarchical and cyclic data structures respectively. However, in industrial scenarios such as e-commerce sponsored search platforms, the large-scale heterogeneous query-item-advertisement interaction graphs often have multiple structures coexisting. Existing methods either consider only a single geometry space or combine several spaces manually, which makes them too inflexible to model the complexity and heterogeneity of real scenarios. To tackle this challenge, we present a web-scale Adaptive Mixed-Curvature ADvertisement retrieval system (AMCAD) to automatically capture the complex and heterogeneous graph structures in non-Euclidean spaces. Specifically, entities are represented in adaptive mixed-curvature spaces, where the types and curvatures of the subspaces are trained to be optimal combinations. Besides, an attentive edge-wise space projector is designed to model the similarities between heterogeneous nodes according to local graph structures and the relation types. Moreover, to deploy AMCAD in Taobao, one of the largest e-commerce platforms with hundreds of millions of users, we design an efficient two-layer online retrieval framework for the task of graph based advertisement retrieval. Extensive evaluations on real-world datasets and A/B tests on online traffic are conducted to illustrate the effectiveness of the proposed system.

preprint | 2022 | arXiv

Complete One-loop Matching of the Type-I Seesaw Model onto the Standard Model Effective Field Theory

In this paper, we accomplish the complete one-loop matching of the type-I seesaw model onto the Standard Model Effective Field Theory (SMEFT), by integrating out three heavy Majorana neutrinos with the functional approach. It turns out that only 31 dimension-six operators (barring flavor structures and Hermitian conjugates) in the Warsaw basis of the SMEFT can be obtained, and most of them appear at the one-loop level. The Wilson coefficients of these 31 dimension-six operators are computed up to $\mathcal{O}\left( M^{-2}\right)$ with $M$ being the mass scale of heavy Majorana neutrinos. As the effects of heavy Majorana neutrinos are encoded in the Wilson coefficients of these higher-dimensional operators, a complete one-loop matching is useful to explore the low-energy phenomenological consequences of the type-I seesaw model. In addition, the threshold corrections to the couplings in the Standard Model and to the coefficient of the dimension-five operator are also discussed.

preprint | 2022 | arXiv

Impact of recent $(g-2)_μ$ measurement on the light CP-even Higgs scenario in general Next-to-Minimal Supersymmetric Standard Model

The General Next-to-Minimal Supersymmetric Standard Model (GNMSSM) is an attractive theory that is free from the tadpole problem and the domain-wall problem of $Z_3$-NMSSM, and can form an economic secluded dark matter (DM) sector to naturally predict the DM experimental results. It also provides mechanisms to easily and significantly weaken the constraints from the LHC search for supersymmetric particles. These characteristics enable the theory to explain the recently measured muon anomalous magnetic moment, $(g-2)_μ$, in a broad parameter space that is consistent with all experimental results and at the same time keeps the electroweak symmetry breaking natural. This work focuses on a popular scenario of the GNMSSM in which the next-to-lightest CP-even Higgs boson corresponds to the scalar discovered at the Large Hadron Collider (LHC). Both analytic formulae and a sophisticated numerical study show that in order to predict the scenario without significant tunings of relevant parameters, the Higgsino mass $μ_{tot} \lesssim 500~{\rm GeV}$ and $\tan β\lesssim 30$ are preferred. This character, if combined with the requirement to account for the $(g-2)_μ$ anomaly, will entail some light sparticles and make the LHC constraints very tight. As a result, this scenario can explain the muon anomalous magnetic moment in very narrow corners of its parameter space.

preprint | 2022 | arXiv

One-loop Matching of the Type-II Seesaw Model onto the Standard Model Effective Field Theory

In this paper, we continue to construct the low-energy effective field theories (EFTs) of the canonical seesaw models, which are natural extensions of the Standard Model (SM) to accommodate tiny but nonzero neutrino masses. Unlike the three right-handed neutrino singlets in the type-I seesaw model, the Higgs triplet in the type-II seesaw model participates directly in the electroweak gauge interactions, rendering the EFT construction more challenging. By integrating out the heavy Higgs triplet in the functional-integral formalism, we carry out a complete one-loop matching of the type-II seesaw model onto the so-called Standard Model Effective Field Theory (SMEFT). It turns out that 41 dimension-six operators (barring flavor structures and Hermitian conjugates) in the Warsaw basis of the SMEFT can be obtained, covering all 31 dimension-six operators that arise in the type-I seesaw model. The Wilson coefficients for these 41 dimension-six operators are computed up to $\mathcal{O}\left( M^{-2}_Δ\right)$ with $M^{}_Δ$ being the mass scale of the Higgs triplet. Moreover, the branching ratios of rare radiative decays of charged leptons $l^-_α\to l^-_β+ γ$ are calculated in the EFT and compared with those in the full theory in order to demonstrate the practical application and the correctness of our EFT construction.

preprint | 2022 | arXiv

PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems

The development of personalized recommendation has significantly improved the accuracy of information matching and the revenue of e-commerce platforms. Two recent trends stand out: 1) recommender systems must be trained in a timely manner to cope with ever-growing new products and ever-changing user interests from online marketing and social networks; 2) SOTA recommendation models introduce DNN modules to improve prediction accuracy. Traditional CPU-based recommender systems cannot keep up with these two trends, and GPU-centric training has become a popular approach. However, we observe that GPU devices are underutilized when training recommender systems, and they cannot attain the throughput improvement that has been achieved in the CV and NLP areas. This issue can be explained by two characteristics of these recommendation models: first, they contain up to a thousand input feature fields, introducing fragmentary and memory-intensive operations; second, the multiple constituent feature interaction submodules introduce a substantial number of small-sized compute kernels. To remove this roadblock to the development of recommender systems, we propose a novel framework named PICASSO to accelerate the training of recommendation models on commodity hardware. Specifically, we conduct a systematic analysis to reveal the bottlenecks encountered in training recommendation models. We leverage the model structure and data distribution to unleash the potential of hardware through our packing, interleaving, and caching optimization. Experiments show that PICASSO increases the hardware utilization by an order of magnitude relative to SOTA baselines and brings up to 6x throughput improvement for a variety of industrial recommendation models. Using the same hardware budget in production, PICASSO on average shortens the walltime of daily training tasks by 7 hours, significantly reducing the delay of continuous delivery.

preprint | 2022 | arXiv

The elastic properties of composites reinforced by a 3D Voronoi fibre network with or without missing fibres

Many composite materials, both natural and fabricated, possess a Voronoi-like architecture or microstructure. Furthermore, the stochasticity and connectivity of Voronoi tessellation endow composite materials built from such structures with a wide range of desired properties including high and tuneable stiffness, high strength and good manufacturability. Thus, Voronoi-based fibre network structures are regarded as promising designs of composite reinforcements, as well as powerful tools in composite mechanics simulations. In this paper, the elastic properties of composites reinforced by a 3D Voronoi fibre network are systematically investigated based on the precise control of the Voronoi cell regularity. The regularity of the reinforcement fibre networks, according to our definition, is found to be positively related to the Young's moduli of the composites. Interestingly, defects in 20% of the total reinforcement fibres cause a drop of less than 6.5% in the Young's moduli of the Voronoi fibre network reinforced composite. The Voronoi fibre network reinforced composites also show higher Young's moduli than those of conventional composites with discrete reinforcements. The possibility of simulating aerogels by Voronoi fibre network structures is also presented.

preprint | 2021 | arXiv

Ab initio calculations of reactor antineutrino fluxes with exact lepton wave functions

New \textit{ab initio} calculations of the isotopic reactor antineutrino fluxes are provided with exact numerical calculations of the lepton wave functions, assuming all the decay branches are allowed GT transitions. We illustrate that the analytical Fermi function and finite size effect each could have the largest spectral deviation of $\mathcal{O}(10\%)$, whereas the effect of their combination could result in spectral deviations at the level of 5%-10%. Meanwhile, we also find that several forms of the extended charge distributions have negligible effects on the spectral variation. Using the state-of-the-art nuclear databases, compared to usual \textit{ab initio} calculations using the analytical single beta decay spectrum, our new calculation shows sizable but opposite spectral deviations at the level of 2%-4% for the cumulative antineutrino and electron energy spectra which may partially contribute to the observed spectral excess in the high energy antineutrino range. Finally, we observe that the bias of analytical beta decay spectrum approximation is rather universal for all the four fissionable isotopes.

preprint | 2021 | arXiv

Correlations between quark mass and flavor mixing hierarchies

We calculate the quark flavor mixing matrix $V$ based on the Hermitian quark mass matrices $M^{}_{\rm u}$ and $M^{}_{\rm d}$ with vanishing $(1,1)$, $(1,3)$ and $(3,1)$ entries. The popular leading-order prediction $|V^{}_{ub}/V^{}_{cb}| \simeq \sqrt{m^{}_u/m^{}_c}$ is significantly modified, and the result agrees with the current experimental value. We find that behind the strong {\it mass} hierarchy of up- or down-type quarks is the weak {\it texture} hierarchy of $M^{}_{\rm u}$ or $M^{}_{\rm d}$ characterized by an approximate seesaw-like relation among its $(2,2)$, $(2,3)$ and $(3,3)$ elements.

preprint | 2021 | arXiv

SCMA Codebook Design Based on Uniquely Decomposable Constellation Groups

Sparse code multiple access (SCMA), which helps improve spectrum efficiency (SE) and enhance connectivity, has been proposed as a non-orthogonal multiple access (NOMA) scheme for 5G systems. In SCMA, codebook design determines system overload ratio and detection performance at a receiver. In this paper, an SCMA codebook design approach is proposed based on uniquely decomposable constellation group (UDCG). We show that there are $N+1 (N \geq 1)$ constellations in the proposed UDCG, each of which has $M (M \geq 2)$ constellation points. These constellations are allocated to users sharing the same resource. Combining the constellations allocated on multiple resources of each user, we can obtain UDCG-based codebook sets. Bit error ratio (BER) performance will be discussed in terms of coding gain maximization with superimposed constellations and UDCG-based codebooks. Simulation results demonstrate that the superimposed constellation of each resource has large minimum Euclidean distance (MED) and meets uniquely decodable constraint. Thus, BER performance of the proposed codebook design approach outperforms that of the existing codebook design schemes in both uncoded and coded SCMA systems, especially for large-size codebooks.
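The two properties this abstract reports for the superimposed constellation (a large minimum Euclidean distance and unique decodability) can be checked numerically for any candidate set of per-user constellations. The sketch below is a generic check under simple assumptions, not the UDCG construction from the paper; the toy constellations and function names are hypothetical.

```python
from itertools import combinations, product


def superimpose(constellations):
    """All sums formed by picking one point from each user's constellation."""
    return [sum(points) for points in product(*constellations)]


def min_euclidean_distance(points):
    """Smallest distance between any two superimposed points (MED)."""
    return min(abs(a - b) for a, b in combinations(points, 2))


def is_uniquely_decodable(constellations):
    """True when every combination of user symbols yields a distinct sum."""
    return min_euclidean_distance(superimpose(constellations)) > 0


# Toy two-user examples on a single resource (illustrative values only).
same = [[-1, 1], [-1, 1]]        # sums -2, 0, 0, 2: two combinations collide
scaled = [[-1, 1], [-2, 2]]      # sums -3, 1, -1, 3: all distinct
rotated = [[-1, 1], [-1j, 1j]]   # second user rotated by 90 degrees
```

With these toy inputs, giving both users the same constellation makes the sum ambiguous (MED of 0), while scaling or rotating the second user's constellation keeps every sum distinct.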

preprint | 2020 | arXiv

A direct link between unflavored leptogenesis and low-energy CP violation via the one-loop quantum corrections

In the type-I seesaw mechanism the Casas-Ibarra (CI) parametrization provides a convenient description of the Dirac neutrino mass matrix in terms of the light and heavy Majorana neutrino masses, the lepton flavor mixing matrix $U$ and an unknown complex orthogonal matrix $O$. If $O$ is assumed to be real, it will be impossible to generate {\it unflavored} thermal leptogenesis via the lepton-number-violating and CP-violating decays of the lightest heavy Majorana neutrino. We find that this observation can be invalidated after small but unavoidable quantum corrections to the CI parametrization are taken into account with the help of the one-loop renormalization-group equations (RGEs) between the seesaw and electroweak scales. We illustrate a novel and viable unflavored leptogenesis scenario of this kind based on the RGEs in the seesaw-extended standard model, and show its direct link to the CP-violating phases of $U$ at low energies.

preprint | 2020 | arXiv

A modular $A_4$ symmetry realization of two-zero textures of the Majorana neutrino mass matrix

We show how to realize two-zero textures of the Majorana neutrino mass matrix $M_ν$ based on modular $A_4$ invariant models without flavons. In these models, all matter fields are assigned to three inequivalent singlets, ${\bf 1}$, ${\bf 1^\prime}$ and ${\bf 1^{\prime\prime}}$, of the finite modular group $Γ_3 \simeq A_4$. Considering tensor products of the $A_4$ group, it is easy to make the charged lepton mass matrix $M_\ell$ diagonal. Since not all modular forms of a specific weight and level 3 can be arranged into three inequivalent singlets of $A_4$ simultaneously, we can always make some entries in $M_ν$ vanish by properly assigning the representations and modular weights for the matter fields. We consider two cases where neutrino masses originate from the Weinberg operator and the type-I seesaw mechanism, respectively. For the former case, all seven viable two-zero textures of $M_ν$ (${\bf A_{1,2}}$, ${\bf B_{1,2,3,4}}$ and ${\bf C}$) can be realized successfully. For the latter case, only five of them (namely ${\bf A_{1,2}}$, ${\bf B_{3,4}}$ and ${\bf C}$) can be achieved due to the intrinsic structure of the right-handed neutrino mass matrix $M_{\rm R}$ in our assumption for simplicity.

preprint | 2020 | arXiv

Bridging resonant leptogenesis and low-energy CP violation with an RGE-modified seesaw relation

We propose a special type-I seesaw scenario in which the Yukawa coupling matrix $Y^{}_ν$ can be fully reconstructed by using the light Majorana neutrino masses $m^{}_i$, the heavy Majorana neutrino masses $M^{}_i$ and the PMNS lepton flavor mixing matrix $U$. It is the RGE-induced correction to the seesaw relation that helps interpret the observed baryon-antibaryon asymmetry of the Universe via flavored resonant thermal leptogenesis with $M^{}_1 \simeq M^{}_2 \ll M^{}_3$. We show that our idea works well in either the $τ$-flavored regime with equilibrium temperature $T \simeq M^{}_1 \in (10^9, 10^{12}]$ GeV or the $(μ+τ)$-flavored regime with $T \simeq M^{}_1 \in (10^5, 10^9]$ GeV, provided the light neutrinos have a normal mass ordering. We find that the same idea is also viable for a {\it minimal} type-I seesaw model with two nearly degenerate heavy Majorana neutrinos.

preprint | 2020 | arXiv

Distinguishing between the twin $b$-flavored unitarity triangles on a circular arc

With the help of the generalized Wolfenstein parametrization of quark flavor mixing and CP violation, we calculate fine differences between the twin $b$-flavored unitarity triangles defined by $V^{*}_{ub} V^{}_{ud} + V^{*}_{cb} V^{}_{cd} + V^{*}_{tb} V^{}_{td} = 0$ and $V^{*}_{ud} V^{}_{td} + V^{*}_{us} V^{}_{ts} + V^{*}_{ub} V^{}_{tb} = 0$ in the complex plane. We find that vertices of the rescaled versions of these two triangles, described respectively by $\bar{ρ} + {\rm i} \bar{η} = -\left(V^{*}_{ub} V^{}_{ud}\right)/\left(V^{*}_{cb} V^{}_{cd}\right)$ and $\widetilde{ρ} + {\rm i} \widetilde{η} = -\left(V^{*}_{ub} V^{}_{tb}\right)/\left(V^{*}_{us} V^{}_{ts}\right)$, are located on a circular arc whose center and radius are given by $O = \left(0.5, 0.5 \cot α\right)$ and $R = 0.5 \csc α$ with $α$ being their common inner angle. The small difference between $(\bar{ρ}, \bar{η})$ and $(\widetilde{ρ}, \widetilde{η})$ is characterized by $\widetilde{ρ} - \bar{ρ} \sim \widetilde{η} - \bar{η} \sim {\cal O}(λ^2)$ with $λ\simeq 0.22$ being the Wolfenstein expansion parameter, and these two vertices are insensitive to the two-loop renormalization-group running effects up to the accuracy of ${\cal O}(λ^4)$. Some comments are also made on similar features of three pairs of the rescaled unitarity triangles of lepton flavor mixing and CP violation.
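The circular-arc statement in this abstract follows from the inscribed-angle theorem: every apex that sees the unit base from (0, 0) to (1, 0) under the same inner angle α lies on a circle with center (0.5, 0.5 cot α) and radius 0.5 csc α. The sketch below checks this numerically. It assumes that α is the angle at the apex and that the apex lies above the base; the sample coordinates are arbitrary illustrative values, not fitted CKM parameters, and the function names are hypothetical.

```python
import math


def apex_angle(x, y):
    """Inner angle at the apex (x, y) of the triangle with base vertices (0, 0) and (1, 0)."""
    ax, ay = -x, -y        # vector from the apex to (0, 0)
    bx, by = 1.0 - x, -y   # vector from the apex to (1, 0)
    dot = ax * bx + ay * by
    cross = ax * by - ay * bx
    return math.atan2(abs(cross), dot)


def arc_center_and_radius(alpha):
    """Center O = (0.5, 0.5 cot(alpha)) and radius R = 0.5 csc(alpha), as quoted in the abstract."""
    return (0.5, 0.5 / math.tan(alpha)), 0.5 / math.sin(alpha)


def distance_and_radius(x, y):
    """Distance from the apex to the arc center, together with the arc radius."""
    alpha = apex_angle(x, y)
    (cx, cy), radius = arc_center_and_radius(alpha)
    return math.hypot(x - cx, y - cy), radius


# Arbitrary apex positions above the base (illustrative values only).
for apex in [(0.15, 0.35), (0.30, 0.80), (0.90, 0.20)]:
    distance, radius = distance_and_radius(*apex)
    print(apex, round(distance, 6), round(radius, 6))
```

For each apex the two printed numbers agree, which is the geometric content of the arc relation: two triangles sharing the base and the angle α must have their apexes on the same arc.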

preprint | 2020 | arXiv

Integral solutions to the one-loop renormalization-group equations for lepton flavor mixing parameters and the Jarlskog invariant

Working in the basis where the charged-lepton Yukawa matrix is diagonal and making the $τ$-dominance approximations, we analytically derive integral solutions to the one-loop renormalization-group equations (RGEs) for neutrino masses, flavor mixing angles, CP-violating phases and the Jarlskog invariant under the standard parametrization of the PMNS matrix in the standard model or its minimal supersymmetric extension for both Majorana and Dirac neutrinos. With these integral solutions, we carry out numerical calculations to investigate the RGE running of lepton flavor mixing parameters and the Jarlskog invariant, and also compare these integral solutions with the exact results obtained by numerically solving the one-loop RGEs. It is shown that these integral solutions coincide with the exact results and can well describe the evolution of lepton flavor mixing parameters and the Jarlskog invariant in most cases. Some important features of our integral solutions and the evolution behaviors of relevant flavor parameters are also discussed in detail both analytically and numerically.

preprint | 2020 | arXiv

On the mean curvature flow of submanifolds in the standard Gaussian space

In this paper, we study the regular geometric behavior of the mean curvature flow (MCF) of submanifolds in the standard Gaussian metric space $({\mathbb R}^{m+p},e^{-|x|^2/m}\bar g)$ where $({\mathbb R}^{m+p},\bar g)$ is the standard Euclidean space and $x\in{\mathbb R}^{m+p}$ denotes the position vector. Note that, as a special Riemannian manifold, $({\mathbb R}^{m+p},e^{-|x|^2/m}\bar g)$ has an unbounded curvature. Up to a family of diffeomorphisms on $M^m$, the mean curvature flow considered here turns out to be equivalent to a special variation of the ``{\em conformal mean curvature flow}\,'' which we have introduced previously. The main theorem of this paper indicates, geometrically, that any immersed compact submanifold in the standard Gaussian space, with the square norm of the position vector being not equal to $m$, will blow up at a finite time under the mean curvature flow, in the sense that either the position or the curvature blows up to infinity. Moreover, by this main theorem, the interval $[0,T)$ of time in which the flowing submanifolds remain regular has a certain optimal upper bound, and it can reach the bound if and only if the initial submanifold either shrinks to the origin or expands uniformly to infinity under the flow. Besides the main theorem, we also obtain some other interesting conclusions which not only play key roles in proving the main theorem but also characterize in part the geometric behavior of the flow, being of independent significance.

preprint | 2020 | arXiv

On the two-loop radiative origin of the smallest neutrino mass and the associated Majorana CP phase

Given a massless neutrino at a superhigh energy scale $Λ$ (e.g., in the minimal seesaw model with only two heavy Majorana neutrinos), we calculate quantum corrections to its initially vanishing mass $m^{}_1$ (or $m^{}_3$) and the associated Majorana CP phase $ρ$ (or $\varrho$) at the Fermi scale $Λ^{}_{\rm F}$ by means of the two-loop renormalization-group equations (RGEs) in the standard model and with the help of the latest neutrino oscillation data. The numerical results obtained from our analytical approximations are in good agreement with those achieved by numerically solving the two-loop RGEs. In particular, we confirm that a nonzero value of $m^{}_1$ (or $m^{}_3$) of ${\cal O}(10^{-13})$ eV at $Λ^{}_{\rm F}$ can be radiatively generated from $m^{}_1 =0$ (or $m^{}_3 =0$) at $Λ\simeq 10^{14}$ GeV in the SM, and find that $ρ$ (or $\varrho$) may accordingly acquire an appreciable physical value. As a nontrivial by-product, the evolution of all the other (initially nonzero) flavor parameters of massive neutrinos is studied both analytically and numerically, by just keeping their leading (i.e., one-loop) RGE-induced effects.

preprint | 2020 | arXiv

RLScheduler: An Automated HPC Batch Job Scheduler Using Reinforcement Learning

Today high-performance computing (HPC) platforms are still dominated by batch jobs. Accordingly, effective batch job scheduling is crucial to obtain high system efficiency. Existing HPC batch job schedulers typically leverage heuristic priority functions to prioritize and schedule jobs. But, once configured and deployed by the experts, such priority functions can hardly adapt to the changes of job loads, optimization goals, or system settings, potentially leading to degraded system efficiency when changes occur. To address this fundamental issue, we present RLScheduler, an automated HPC batch job scheduler built on reinforcement learning. RLScheduler relies on minimal manual interventions or expert knowledge, but can learn high-quality scheduling policies via its own continuous 'trial and error'. We introduce a new kernel-based neural network structure and trajectory filtering mechanism in RLScheduler to improve and stabilize the learning process. Through extensive evaluations, we confirm that RLScheduler can learn high-quality scheduling policies towards various workloads and various optimization goals with relatively low computation cost. Moreover, we show that the learned models perform stably even when applied to unseen workloads, making them practical for production use.
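The "heuristic priority functions" that this abstract contrasts with a learned policy can be illustrated with a small sketch: each function maps a waiting job to a score, and the scheduler picks jobs in ascending score order. This is not the RLScheduler implementation; the `Job` fields, the function names and the sample jobs are hypothetical, and first-come-first-served (FCFS) and shortest-job-first (SJF) are used only as familiar examples of fixed heuristics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    submit_time: float        # seconds since the start of the trace
    requested_runtime: float  # seconds requested by the user
    requested_procs: int      # processors requested


def fcfs(job):
    """First-come-first-served: earlier submissions run first."""
    return job.submit_time


def sjf(job):
    """Shortest-job-first: shorter requested runtimes run first."""
    return job.requested_runtime


def order_queue(jobs, priority):
    """Return job names in the order a fixed priority function would select them."""
    return [job.name for job in sorted(jobs, key=priority)]


# Hypothetical waiting queue, made up for illustration.
queue = [
    Job("a", submit_time=0.0, requested_runtime=100.0, requested_procs=4),
    Job("b", submit_time=1.0, requested_runtime=10.0, requested_procs=1),
    Job("c", submit_time=2.0, requested_runtime=50.0, requested_procs=2),
]

fcfs_order = order_queue(queue, fcfs)  # ['a', 'b', 'c']
sjf_order = order_queue(queue, sjf)    # ['b', 'c', 'a']
```

Each heuristic encodes one fixed trade-off, so switching the optimization goal means swapping the function by hand. A learned scheduler, as described in the abstract, would instead replace the `priority` argument with a policy trained against the chosen goal.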