Researcher profile

David Shih

David Shih contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
13 works
0 followers
7 topics
4 close collaborators

Published work

13 published items

preprint, 2026, arXiv

Collider-Bench: Benchmarking AI Agents with Particle Physics Analysis Reproduction

Autonomous language-model agents are increasingly evaluated on long-horizon tool-use tasks, but existing benchmarks rarely capture the complexity and nuance of real scientific work. To address this gap, we introduce Collider-Bench, a benchmark for evaluating whether LLM agents can reproduce experimental analyses from the Large Hadron Collider (LHC) using only public papers and open scientific software. Such analyses are often difficult to reproduce because the public toolchain only approximates the software used internally by the experimental collaborations, while the published papers inevitably omit implementation details needed for a faithful reconstruction. Agents must therefore rely on physical reasoning, domain knowledge, and trial-and-error to fill these gaps. Each task requires the agent to turn a published analysis into an executable simulation-and-selection pipeline and submit predicted collision event yields in specified signal regions. These predictions are evaluated with standard histogram metrics that provide continuous fidelity scores without a hand-written rubric. We also report the computational cost incurred by each agent per task. Finally, we evaluate the codebase and full session trace using an LLM judge to catch qualitative failure modes such as fabrications, hallucinations and duplications. We release an initial set of tasks drawn from LHC searches, together with a containerized sandbox and event simulation tools. We evaluate across a capability ladder of general purpose coding agents. Our results show that on average no agent reliably beats the physicist-in-the-loop solution.

preprint, 2026, arXiv

Quirk SUEP

We propose searching for physics beyond the Standard Model in the low-transverse-momentum tracks accompanying hard-scatter events at the LHC. TeV-scale resonances connected to a dark QCD sector could be enhanced by selecting events with anomalies in the track distributions. As a benchmark, a quirk model with microscopic string lengths is developed, including a setup for event simulation. For this model, strategies are presented to enhance the sensitivity compared to inclusive resonance searches: a simple cut-based selection, a supervised search, and a model-agnostic weakly supervised anomaly search with the CATHODE method. Expected discovery potentials and exclusion limits are shown for 140 fb$^{-1}$ of 13 TeV proton-proton collisions at the LHC.

preprint, 2022, arXiv

Anomaly Detection under Coordinate Transformations

There is a growing need for machine learning-based anomaly detection strategies to broaden the search for Beyond-the-Standard-Model (BSM) physics at the Large Hadron Collider (LHC) and elsewhere. The first step of any anomaly detection approach is to specify observables and then use them to decide on a set of anomalous events. One common choice is to select events that have low probability density. It is a well-known fact that probability densities are not invariant under coordinate transformations, so the sensitivity can depend on the initial choice of coordinates. The broader machine learning community has recently connected coordinate sensitivity with anomaly detection and our goal is to bring awareness of this issue to the growing high energy physics literature on anomaly detection. In addition to analytical explanations, we provide numerical examples from simple random variables and from the LHC Olympics Dataset that show how using probability density as an anomaly score can lead to events being classified as anomalous or not depending on the coordinate frame.
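The coordinate dependence described in this abstract can be illustrated with a minimal sketch. It assumes a standard normal variable x and the transformation y = exp(x); both are chosen here purely for illustration and are not taken from the paper, and the function names are hypothetical. The two example events swap roles: the one with the lower density in x has the higher density in y.

```python
import math

def density_x(x):
    """Standard normal density in the original coordinate x (illustrative choice)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def density_y(y):
    """Density of y = exp(x), from the change-of-variables rule p_Y(y) = p_X(ln y) / y."""
    return density_x(math.log(y)) / y

def more_anomalous(density, a, b):
    """Return whichever of the two points has the lower density."""
    return a if density(a) < density(b) else b

# Two illustrative events, expressed in both coordinate systems.
x1, x2 = -2.0, 1.5
y1, y2 = math.exp(x1), math.exp(x2)

flagged_in_x = more_anomalous(density_x, x1, x2)  # lower-density event in x
flagged_in_y = more_anomalous(density_y, y1, y2)  # lower-density event in y

print("flagged in x is event 1:", flagged_in_x == x1)
print("flagged in y is event 2:", flagged_in_y == y2)
```

The Jacobian factor 1/y is what changes the ordering: it inflates the density at small y and suppresses it at large y.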

preprint, 2022, arXiv

Classifying Anomalies THrough Outer Density Estimation (CATHODE)

We propose a new model-agnostic search strategy for physics beyond the standard model (BSM) at the LHC, based on a novel application of neural density estimation to anomaly detection. Our approach, which we call Classifying Anomalies THrough Outer Density Estimation (CATHODE), assumes the BSM signal is localized in a signal region (defined e.g. using invariant mass). By training a conditional density estimator on a collection of additional features outside the signal region, interpolating it into the signal region, and sampling from it, we produce a collection of events that follow the background model. We can then train a classifier to distinguish the data from the events sampled from the background model, thereby approaching the optimal anomaly detector. Using the LHC Olympics R&D dataset, we demonstrate that CATHODE nearly saturates the best possible performance, and significantly outperforms other approaches that aim to enhance the bump hunt (CWoLa Hunting and ANODE). Finally, we demonstrate that CATHODE is very robust against correlations between the features and maintains nearly-optimal performance even in this more challenging setting.

preprint, 2022, arXiv

New directions for surrogate models and differentiable programming for High Energy Physics detector simulation

The computational cost of high energy physics detector simulation at future experimental facilities is going to exceed the currently available resources. To overcome this challenge, new ideas on surrogate models using machine learning methods are being explored to replace computationally expensive components. Additionally, differentiable programming has been proposed as a complementary approach, providing controllable and scalable simulation routines. In this document, new and ongoing efforts for surrogate models and differentiable programming applied to detector simulation are discussed in the context of the 2021 Particle Physics Community Planning Exercise ('Snowmass').

preprint, 2022, arXiv

Resolving Combinatorial Ambiguities in Dilepton $t \bar t$ Event Topologies with Neural Networks

We study the potential of deep learning to resolve the combinatorial problem in SUSY-like events with two invisible particles at the LHC. As a concrete example, we focus on dileptonic $t \bar t$ events, where the combinatorial problem becomes an issue of binary classification: pairing the correct lepton with each $b$ quark coming from the decays of the tops. We investigate the performance of a number of machine learning algorithms, including attention-based networks, which have been used for a similar problem in the fully-hadronic channel of $t\bar t$ production; and the Lorentz Boost Network, which is motivated by physics principles. We then consider the general case when the underlying mass spectrum is unknown, and hence no kinematic endpoint information is available. Compared against existing methods based on kinematic variables, we demonstrate that the efficiency for selecting the correct pairing is greatly improved by utilizing deep learning techniques.

preprint, 2020, arXiv

ABCDisCo: Automating the ABCD Method with Machine Learning

The ABCD method is one of the most widely used data-driven background estimation techniques in high energy physics. Cuts on two statistically-independent classifiers separate signal and background into four regions, so that background in the signal region can be estimated simply using the other three control regions. Typically, the independent classifiers are chosen "by hand" to be intuitive and physically motivated variables. Here, we explore the possibility of automating the design of one or both of these classifiers using machine learning. We show how to use state-of-the-art decorrelation methods to construct powerful yet independent discriminators. Along the way, we uncover a previously unappreciated aspect of the ABCD method: its accuracy hinges on having low signal contamination in control regions not just overall, but relative to the signal fraction in the signal region. We demonstrate the method with three examples: a simple model consisting of three-dimensional Gaussians; boosted hadronic top jet tagging; and a recasted search for paired dijet resonances. In all cases, automating the ABCD method with machine learning significantly improves performance in terms of ABCD closure, background rejection and signal contamination.
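The background estimate at the heart of the ABCD method can be sketched as follows. This is a toy under stated assumptions, not the paper's implementation: the two "classifiers" are independent uniform random numbers, the cut values are arbitrary, the region labels (A as signal region; B, C, D as control regions) are one common convention, and all function names are hypothetical. When the two variables are independent, the signal-region background count is approximately N_B * N_C / N_D.

```python
import random

def abcd_counts(events, cut_f, cut_g):
    """Count events in the four regions defined by cuts on two classifier scores."""
    counts = {"A": 0, "B": 0, "C": 0, "D": 0}
    for f, g in events:
        pass_f, pass_g = f > cut_f, g > cut_g
        if pass_f and pass_g:
            counts["A"] += 1  # signal region: passes both cuts
        elif pass_f:
            counts["B"] += 1  # control region: passes f only
        elif pass_g:
            counts["C"] += 1  # control region: passes g only
        else:
            counts["D"] += 1  # control region: fails both
    return counts

def abcd_prediction(counts):
    """Estimate the background in region A from the three control regions."""
    return counts["B"] * counts["C"] / counts["D"]

# Toy background-only sample with two statistically independent scores.
rng = random.Random(0)
background = [(rng.random(), rng.random()) for _ in range(200_000)]

counts = abcd_counts(background, cut_f=0.7, cut_g=0.6)
predicted = abcd_prediction(counts)
closure = predicted / counts["A"]  # close to 1 when the scores are independent

print("observed in A:", counts["A"])
print("predicted in A:", round(predicted, 1))
print("closure ratio:", round(closure, 3))
```

If the two scores were correlated, the closure ratio would drift away from 1, which is why the abstract stresses constructing independent discriminators.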

preprint, 2020, arXiv

Anomaly Detection with Density Estimation

We leverage recent breakthroughs in neural density estimation to propose a new unsupervised anomaly detection technique (ANODE). By estimating the probability density of the data in a signal region and in sidebands, and interpolating the latter into the signal region, a likelihood ratio of data vs. background can be constructed. This likelihood ratio is broadly sensitive to overdensities in the data that could be due to localized anomalies. In addition, a unique potential benefit of the ANODE method is that the background can be directly estimated using the learned densities. Finally, ANODE is robust against systematic differences between signal region and sidebands, giving it broader applicability than other methods. We demonstrate the power of this new approach using the LHC Olympics 2020 R&D Dataset. We show how ANODE can enhance the significance of a dijet bump hunt by up to a factor of 7 with a 10% accuracy on the background prediction. While the LHC is used as the recurring example, the methods developed here have a much broader applicability to anomaly detection in physics and beyond.
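The data-versus-background likelihood ratio described in this abstract can be illustrated with a small toy. In ANODE both densities are learned with neural density estimators and the background density is interpolated from the sidebands; the sketch below instead assumes known analytic Gaussians in a single feature, with a hypothetical 1% signal admixture, purely to show how the ratio responds to a localized overdensity. All names and parameter values are illustrative assumptions.

```python
import math

SIGNAL_FRACTION = 0.01  # assumed signal admixture in the toy data

def gaussian(x, mean, sigma):
    """Normal probability density with the given mean and width."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def background_density(x):
    """Stand-in for the background density interpolated into the signal region."""
    return gaussian(x, 0.0, 1.0)

def data_density(x):
    """Stand-in for the data density: mostly background plus a narrow signal bump."""
    signal = gaussian(x, 2.0, 0.1)
    return (1.0 - SIGNAL_FRACTION) * background_density(x) + SIGNAL_FRACTION * signal

def likelihood_ratio(x):
    """Ratio of data density to background density, used as the anomaly score."""
    return data_density(x) / background_density(x)

for x in (0.0, 2.0):
    print(f"x = {x}: likelihood ratio = {likelihood_ratio(x):.3f}")
```

Away from the bump the ratio sits just below 1, while near x = 2 it rises well above 1, so a threshold on the ratio selects the overdense region.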

preprint, 2020, arXiv

Boosted $W/Z$ Tagging with Jet Charge and Deep Learning

We demonstrate that the classification of boosted, hadronically-decaying weak gauge bosons can be significantly improved over traditional cut-based and BDT-based methods using deep learning and the jet charge variable. We construct binary taggers for $W^+$ vs. $W^-$ and $Z$ vs. $W$ discrimination, as well as an overall ternary classifier for $W^+$/$W^-$/$Z$ discrimination. Besides a simple convolutional neural network (CNN), we also explore a composite of two CNNs, with different numbers of layers in the jet $p_{T}$ and jet charge channels. We find that this novel structure boosts the performance particularly when considering the $Z$ boson as signal. The methods presented here can enhance the physics potential in SM measurements and searches for new physics that are sensitive to the electric charge of weak gauge bosons.

preprint, 2020, arXiv

Simulation Assisted Likelihood-free Anomaly Detection

Given the lack of evidence for new particle discoveries at the Large Hadron Collider (LHC), it is critical to broaden the search program. A variety of model-independent searches have been proposed, adding sensitivity to unexpected signals. There are generally two types of such searches: those that rely heavily on simulations and those that are entirely based on (unlabeled) data. This paper introduces a hybrid method that makes the best of both approaches. For potential signals that are resonant in one known feature, this new method first learns a parameterized reweighting function to morph a given simulation to match the data in sidebands. This function is then interpolated into the signal region and then the reweighted background-only simulation can be used for supervised learning as well as for background estimation. The background estimation from the reweighted simulation allows for non-trivial correlations between features used for classification and the resonant feature. A dijet search with jet substructure is used to illustrate the new method. Future applications of Simulation Assisted Likelihood-free Anomaly Detection (SALAD) include a variety of final states and potential combinations with other model-independent approaches.

preprint, 2020, arXiv

Strange Jet Tagging

Tagging jets of strongly interacting particles initiated by energetic strange quarks is one of the few largely unexplored Standard Model object classification problems remaining in high energy collider physics. In this paper we investigate the purest version of this classification problem in the form of distinguishing strange-quark jets from down-quark jets. Our strategy relies on the fact that a strange-quark jet contains on average a higher ratio of neutral kaon energy to neutral pion energy than does a down-quark jet. Long-lived neutral kaons deposit energy mainly in the hadronic calorimeter of a high energy detector, while neutral pions decay promptly to photons that deposit energy mainly in the electromagnetic calorimeter. In addition, short-lived neutral kaons that decay in flight to charged pion pairs can be identified as a secondary vertex in the inner tracking system. Using these handles we study different approaches to distinguishing strange-quark from down-quark jets, including single variable cut-based methods, a boosted decision tree (BDT) with a small number of simple variables, and a deep learning convolutional neural network (CNN) architecture with jet images. We show that modest gains are possible from the CNN compared with the BDT or a single variable. Starting from jet samples with only strange-quark and down-quark jets, the CNN algorithm can improve the strange to down ratio by a factor of roughly 2 for strange tagging efficiencies below 0.2, and by a factor of 2.5 for strange tagging efficiencies near 0.02.

preprint, 2019, arXiv

Searching for muonic forces with the ATLAS detector

The LHC copiously produces muons via different processes, and the muon sample will be large at the high-luminosity LHC (HL-LHC). In this work we propose to leverage this large muon sample and utilize the HL-LHC as a muon fixed-target experiment, with the ATLAS calorimeter as the target. We consider a novel analysis for the ATLAS detector, which takes advantage of the two independent muon momentum measurements by the inner detector and the muon system. We show that a comparison of the two measurements, before and after the calorimeters, can probe new force carriers that are coupled to muons and escape detection. The proposed analysis, based on muon samples from $W$ and $Z$ decays only, has a comparable reach to other proposals. In particular, it can explore the part of parameter-space that could explain the muon $g-2$ anomaly.

preprint, 2018, arXiv

Searching for New Physics with Deep Autoencoders

We introduce a potentially powerful new method of searching for new physics at the LHC, using autoencoders and unsupervised deep learning. The key idea of the autoencoder is that it learns to map "normal" events back to themselves, but fails to reconstruct "anomalous" events that it has never encountered before. The reconstruction error can then be used as an anomaly threshold. We demonstrate the effectiveness of this idea using QCD jets as background and boosted top jets and RPV gluino jets as signal. We show that a deep autoencoder can significantly improve signal over background when trained on backgrounds only, or even directly on data which contains a small admixture of signal. Finally we examine the correlation of the autoencoders with jet mass and show how the jet mass distribution can be stable against cuts in reconstruction loss. This may be important for estimating QCD backgrounds from data. As a test case we show how one could plausibly discover 400 GeV RPV gluinos using an autoencoder combined with a bump hunt in jet mass. This opens up the exciting possibility of training directly on actual data to discover new physics with no prior expectations or theory prejudice.