Researcher profile

Da Huang

Da Huang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
8 works
0 followers
8 topics
4 close collaborators

Published work

8 published items

preprint | 2023 | arXiv

Symbol tuning improves in-context learning in language models

We present symbol tuning - finetuning language models on in-context input-label pairs where natural language labels (e.g., "positive/negative sentiment") are replaced with arbitrary symbols (e.g., "foo/bar"). Symbol tuning leverages the intuition that when a model cannot use instructions or natural language labels to figure out a task, it must instead do so by learning the input-label mappings. We experiment with symbol tuning across Flan-PaLM models up to 540B parameters and observe benefits across various settings. First, symbol tuning boosts performance on unseen in-context learning tasks and is much more robust to underspecified prompts, such as those without instructions or without natural language labels. Second, symbol-tuned models are much stronger at algorithmic reasoning tasks, with up to 18.2% better performance on the List Functions benchmark and up to 15.3% better performance on the Simple Turing Concepts benchmark. Finally, symbol-tuned models show large improvements in following flipped-labels presented in-context, meaning that they are more capable of using in-context information to override prior semantic knowledge.
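The label substitution described in this abstract can be illustrated with a minimal sketch. It only shows how natural-language labels in an in-context prompt might be swapped for arbitrary symbols; it is not the authors' data pipeline. The function names `symbolize` and `build_prompt`, the `Input:`/`Label:` prompt layout, and the example sentences are all hypothetical choices made for illustration.

```python
import random


def symbolize(examples, symbols, seed=0):
    """Map each distinct natural-language label to an arbitrary symbol.

    Hypothetical helper: the mapping is drawn at random so the symbol
    carries no semantic hint about the class it stands for.
    """
    rng = random.Random(seed)
    labels = sorted({label for _, label in examples})
    if len(symbols) < len(labels):
        raise ValueError("need at least one symbol per distinct label")
    chosen = rng.sample(symbols, len(labels))
    mapping = dict(zip(labels, chosen))
    remapped = [(text, mapping[label]) for text, label in examples]
    return remapped, mapping


def build_prompt(examples, query):
    """Format input-label pairs with no instruction, then the unlabeled query.

    The "Input:/Label:" layout is an assumption, not a format from the paper.
    """
    blocks = [f"Input: {text}\nLabel: {label}" for text, label in examples]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


examples = [
    ("A delightful, warm film.", "positive"),
    ("Tedious and far too long.", "negative"),
    ("I would happily watch it again.", "positive"),
]
remapped, mapping = symbolize(examples, ["foo", "bar", "baz"])
prompt = build_prompt(remapped, "The plot never comes together.")
print(prompt)
```

Because the prompt contains neither an instruction nor a meaningful label, the only way to infer the task from it is through the input-label pairs themselves, which is the intuition the abstract states.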

preprint | 2022 | arXiv

On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models

For industrial-scale advertising systems, prediction of ad click-through rate (CTR) is a central problem. Ad clicks constitute a significant class of user engagements and are often used as the primary signal for the usefulness of ads to users. Additionally, in cost-per-click advertising systems where advertisers are charged per click, click rate expectations feed directly into value estimation. Accordingly, CTR model development is a significant investment for most Internet advertising companies. Engineering for such problems requires many machine learning (ML) techniques suited to online learning that go well beyond traditional accuracy improvements, especially concerning efficiency, reproducibility, calibration, and credit attribution. We present a case study of practical techniques deployed in Google's search ads CTR model. This paper provides an industry case study highlighting important areas of current ML research and illustrating how impactful new ML methods are evaluated and made useful in a large-scale industrial setting.

preprint | 2022 | arXiv

Unitarity Bounds on the Massive Spin-2 Particle Explanation of Muon $g-2$ Anomaly

Motivated by the long-standing discrepancy between the Standard Model prediction and the experimental measurement of the muon magnetic dipole moment, we have recently proposed to interpret this muon $g-2$ anomaly in terms of the loop effect induced by a new massive spin-2 field $G$. In the present paper, we investigate the unitarity bounds on this scenario. We calculate the $s$-wave projected amplitudes for two-body elastic scatterings of charged leptons and photons mediated by $G$ at high energies for all possible initial and final helicity states. By imposing the condition of perturbative unitarity, we obtain analytic constraints on the charged-lepton-$G$ and photon-$G$ couplings. We then apply our results to constrain the parameter space relevant to the explanation of the muon $g-2$ anomaly.

preprint | 2021 | arXiv

Electroweak phase transition confronted with dark matter detection constraints

We study the type-II first-order electroweak phase transition and dark matter (DM) phenomenology in both real and complex singlet extensions of the SM. In the real singlet extension with a $\mathbb{Z}_2$ symmetry, we show that the parameter regions favored by the phase transition suffer from strong constraints from DM direct detection so that only a negligible fraction ($f_{X}\sim 10^{-4}-10^{-5}$) of DM composed of the real singlet scalar can survive the LUX and XENON1T constraints. In the complex singlet $S$ case, we impose a $CP$ symmetry $S\to S^{*}$ on the scalar potential. The real component of $S$ can mix with the SM Higgs boson while the imaginary component becomes a DM candidate due to the protection of the $CP$ symmetry. By taking into account the current experimental constraints from invisible Higgs decays, Higgs signal strength measurements, and dark matter detection, we find that there exists a large parameter space for the type-II electroweak phase transition to occur while explaining all of the dark matter relic density. We identify a subset of parameter space that is promising for future experiments, including the di-Higgs and Higgs signal strength measurements at the HL-LHC and the dark matter direct detection in the XENONnT project.

preprint | 2021 | arXiv

Rethinking Co-design of Neural Architectures and Hardware Accelerators

Neural architectures and hardware accelerators have been two driving forces for the progress in deep learning. Previous works typically attempt to optimize hardware given a fixed model architecture or model architecture given fixed hardware. The dominant hardware architecture explored in this prior work is the FPGA. In our work, we target the optimization of hardware and software configurations on an industry-standard edge accelerator. We systematically study the importance and strategies of co-designing neural architectures and hardware accelerators. We make three observations: 1) the software search space has to be customized to fully leverage the targeted hardware architecture, 2) the search for the model architecture and hardware architecture should be done jointly to achieve the best of both worlds, and 3) different use cases lead to very different search outcomes. Our experiments show that the joint search method consistently outperforms previous platform-aware neural architecture search, manually crafted models, and the state-of-the-art EfficientNet on all latency targets by around 1% on ImageNet top-1 accuracy. Our method can reduce energy consumption of an edge accelerator by up to 2x under the same accuracy constraint, when co-adapting the model architecture and hardware accelerator configurations.

preprint | 2020 | arXiv

CP Violating $hW^+W^-$ Coupling in the Standard Model and Beyond

Inspired by the recent development in determining the properties of the observed Higgs boson, we explore the $CP$-violating (CPV) $- c_{\rm CPV} h W^{+\, \mu\nu}\tilde{W}^{-}_{\mu\nu}/v$ coupling in the Standard Model (SM) and beyond, where $W^{\pm \, \mu\nu}$ and $\tilde{W}^{\pm\,\mu\nu}$ denote the $W$-boson field strength and its dual. To begin with, we show that the leading-order SM contribution to this CPV vertex appears at two-loop level. By summing over the quark flavor indices in the two-loop integrals analytically, we can estimate the order of the corresponding Wilson coefficient to be $c^{\rm SM}_{\rm CPV} \sim {\cal O}(10^{-23})$, which is obviously too small to be probed at the LHC and planned future colliders. Then we investigate this CPV $hW^+ W^-$ interaction in two Beyond the Standard Model benchmark models: the left-right model and the complex 2-Higgs doublet model (C2HDM). Unlike what happens for the SM, the dominant contributions in both models arise at the one-loop level, and the corresponding Wilson coefficient can be as large as ${\cal O}(10^{-9})$ in the former model and ${\cal O}(10^{-3})$ in the latter. In light of such a large CPV effect in the $hW^+W^-$ coupling, we also give the formulae for the leading one-loop contribution to the related CPV $hZZ$ effective operator in the C2HDM. The order of magnitude of the Wilson coefficients in the C2HDM may be within reach of the high-luminosity LHC or planned future colliders.

preprint | 2020 | arXiv

Multicomponent Dark Matter in the Light of CALET and DAMPE

In the light of the latest measurements of the total $e^+ + e^-$ flux by the CALET and DAMPE experiments, we revisit the multicomponent leptonically decaying dark matter (DM) explanations of the cosmic-ray electron/positron excesses observed previously. In particular, we use the single and double-component DM models to explore the compatibility of the AMS-02 positron fraction with the new CALET or DAMPE data. It turns out that neither single nor double-component DM models are able to fit the AMS-02 positron fraction and DAMPE total $e^+ + e^-$ flux data simultaneously. On the other hand, for the combined AMS-02 and CALET dataset, both the single and double-component DM models can provide reasonable fits. If we further take into account the diffuse $\gamma$-ray constraints from Fermi-LAT, only the double-component DM models are allowed.

preprint | 2020 | arXiv

Strong Dark Matter Self-Interaction from a Stable Scalar Mediator

In the face of the small-scale structure problems of the collisionless cold dark matter (DM) paradigm, a popular remedy is to introduce a strong DM self-interaction which can be generated nonperturbatively by an MeV-scale light mediator. However, if such a mediator is unstable and decays into SM particles, the model is severely constrained by the DM direct and indirect detection experiments. In the present paper, we study a model of a self-interacting fermionic DM, endowed with a light stable scalar mediator. In this model, the DM relic abundance is dominated by the fermionic DM particle which is generated mainly via the freeze-out of its annihilations to the stable mediator. Since this channel is invisible, the DM indirect detection constraints should be greatly relaxed. Furthermore, the direct detection signals are suppressed to an unobservable level since fermionic DM scatterings with a nucleon appear at one-loop level. By further studying the bounds from the CMB and BBN on the visible channels involving the dark sector, we show that there is a large parameter space which can generate appropriate DM self-interactions at dwarf galaxy scales, while remaining compatible with other experimental constraints.