Source author record

Nhan Tran

Nhan Tran appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

hep-ex, Machine Learning, hep-ph, physics.ins-det, Artificial Intelligence, Computer Vision, Hardware Architecture, physics.acc-ph, physics.comp-ph, astro-ph.CO, astro-ph.IM, gr-qc, hep-th, math.NA, nucl-ex, nucl-th, physics.atom-ph, physics.chem-ph, Programming Languages

Catalog footprint

What is connected

25 works

19 topics

4 close collaborators

preprint · 2026 · arXiv

Surrogate Neural Architecture Codesign Package (SNAC-Pack)

Neural architecture search (NAS) is a powerful approach for automating model design, but existing methods often optimize for accuracy alone or rely on proxy metrics such as bit operations (BOPs) that correlate poorly with hardware cost. This gap is particularly large for FPGA deployment, where cost is dominated by a multi-dimensional budget of lookup tables, DSPs, flip-flops, BRAM, and latency. We present the Surrogate Neural Architecture Codesign Package (SNAC-Pack), an open-source AutoML framework for hardware-aware neural architecture codesign and end-to-end FPGA deployment. SNAC-Pack runs a multi-objective global search with Optuna and NSGA-II, loading trials to a shared SQLite store that enables parallel workers across compute nodes. A hardware surrogate model outputs per-trial resource and latency estimates, avoiding the synthesis cost that would otherwise dominate the search loop. A local search stage then applies quantization-aware training (QAT) together with iterative magnitude pruning in a combined compression loop, after which the final model is synthesized to FPGA firmware via the hls4ml Python library. A YAML configuration and an optional agentic frontend let users run the pipeline on new datasets without modifying the framework. We demonstrate SNAC-Pack on jet classification at the Large Hadron Collider and superconducting qubit readout, discovering compact architectures that match or exceed strong baselines on the task metric while reducing FPGA resource utilization and, in the qubit readout case, reducing the design space exploration process from months of manual fine-tuning to hours of automated search.
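
The multi-objective search described above trades task accuracy against hardware cost, so its output is a set of non-dominated trials rather than a single best model. The following is a minimal sketch of that Pareto selection in plain Python; it does not use Optuna or SNAC-Pack code, and the trial names, objective columns, and numbers are hypothetical values chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical trial record: (name, error_rate, estimated_luts, estimated_latency_ns).
# All three objectives are minimized.
Trial = Tuple[str, float, float, float]


def dominates(a: Trial, b: Trial) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly better on one."""
    a_obj, b_obj = a[1:], b[1:]
    no_worse = all(x <= y for x, y in zip(a_obj, b_obj))
    strictly_better = any(x < y for x, y in zip(a_obj, b_obj))
    return no_worse and strictly_better


def pareto_front(trials: List[Trial]) -> List[Trial]:
    """Keep the trials that no other trial dominates, preserving input order."""
    return [
        t for t in trials
        if not any(dominates(other, t) for other in trials if other is not t)
    ]


# Illustrative numbers only; a surrogate model would supply the hardware estimates.
trials = [
    ("small", 0.30, 1000.0, 50.0),
    ("medium", 0.25, 4000.0, 80.0),
    ("large", 0.24, 20000.0, 200.0),
    ("wasteful", 0.30, 5000.0, 90.0),
]
front = pareto_front(trials)
front_names = [t[0] for t in front]
```

Here "wasteful" is dropped because "small" matches its error rate at lower estimated cost, while the other three remain as distinct accuracy and resource trade-offs.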

preprint · 2026 · arXiv

Towards a Self-Driving Trigger at the LHC: Adaptive Response in Real Time

Real-time data filtering and selection -- or trigger -- systems at high-throughput scientific facilities such as the experiments at the Large Hadron Collider (LHC) must process extremely high-rate data streams under stringent bandwidth, latency, and storage constraints. Yet these systems are typically designed as static, hand-tuned menus of selection criteria grounded in prior knowledge and simulation. In this work, we further explore the concept of a self-driving trigger, an autonomous data-filtering framework that reallocates resources and adjusts thresholds dynamically in real time to optimize signal efficiency, rate stability, and computational cost as instrumentation and environmental conditions evolve. We introduce a benchmark ecosystem to emulate realistic collider scenarios and demonstrate real-time optimization of a menu including canonical energy sum triggers as well as modern anomaly-detection algorithms that target non-standard event topologies using machine learning. Using simulated data streams and publicly available collision data from the Compact Muon Solenoid (CMS) experiment, we demonstrate the capability to dynamically and automatically optimize trigger performance under specific cost objectives without manual retuning. Our adaptive strategy shifts trigger design from static menus with heuristic tuning to intelligent, automated, data-driven control, unlocking greater flexibility and discovery potential in future high-energy physics analyses.

preprint · 2023 · arXiv

Beyond PID Controllers: PPO with Neuralized PID Policy for Proton Beam Intensity Control in Mu2e

We introduce a novel Proximal Policy Optimization (PPO) algorithm aimed at addressing the challenge of maintaining a uniform proton beam intensity delivery in the Muon to Electron Conversion Experiment (Mu2e) at Fermi National Accelerator Laboratory (Fermilab). Our primary objective is to regulate the spill process to ensure a consistent intensity profile, with the ultimate goal of creating an automated controller capable of providing real-time feedback and calibration of the Spill Regulation System (SRS) parameters on a millisecond timescale. We treat the Mu2e accelerator system as a Markov Decision Process suitable for Reinforcement Learning (RL), utilizing PPO to reduce bias and enhance training stability. A key innovation in our approach is the integration of a neuralized Proportional-Integral-Derivative (PID) controller into the policy function, resulting in a significant improvement in the Spill Duty Factor (SDF) by 13.6%, surpassing the performance of the current PID controller baseline by an additional 1.6%. This paper presents the preliminary offline results based on a differentiable simulator of the Mu2e accelerator. It lays the groundwork for real-time implementations and applications, representing a crucial step towards automated proton beam intensity control for the Mu2e experiment.
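
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a classical discrete-time PID update. It is not the paper's neuralized PID policy or the Mu2e Spill Regulation System; the class name, gain values, and time step are assumptions for illustration. The three gains are the kind of parameters a learned policy could adjust.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (illustrative, not the Mu2e controller)."""
    kp: float  # proportional gain
    ki: float  # integral gain
    kd: float  # derivative gain
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        """Return the control output for the current error sample."""
        self.integral += error * dt                    # accumulate past error
        derivative = (error - self.prev_error) / dt    # finite-difference slope
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical gains and a constant error of 1.0 sampled every 1.0 time units.
pid = PID(kp=2.0, ki=0.5, kd=0.1)
first = pid.step(error=1.0, dt=1.0)   # P + I + D terms all contribute
second = pid.step(error=1.0, dt=1.0)  # derivative term vanishes, integral grows
```

With a constant error the derivative term contributes only on the first step, while the integral term keeps growing until the error is removed.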

preprint · 2023 · arXiv

Differentiable Earth Mover's Distance for Data Compression at the High-Luminosity LHC

The Earth mover's distance (EMD) is a useful metric for image recognition and classification, but its usual implementations are either not differentiable or too slow to be used as a loss function for training other algorithms via gradient descent. In this paper, we train a convolutional neural network (CNN) to learn a differentiable, fast approximation of the EMD and demonstrate that it can be used as a substitute for computing-intensive EMD implementations. We apply this differentiable approximation in the training of an autoencoder-inspired neural network (encoder NN) for data compression at the high-luminosity LHC at CERN. The goal of this encoder NN is to compress the data while preserving the information related to the distribution of energy deposits in particle detectors. We demonstrate that the performance of our encoder NN trained using the differentiable EMD CNN surpasses that of training with loss functions based on mean squared error.
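
As a point of reference for what the EMD measures, here is a minimal sketch of the one-dimensional special case, where the distance between two histograms of equal total mass on unit-spaced bins reduces to the summed absolute difference of their cumulative sums. This is not the paper's CNN approximation, and the two-dimensional detector images it targets have no such closed form and require solving an optimal transport problem. The function name and example histograms are illustrative.

```python
from itertools import accumulate
from typing import Sequence


def emd_1d(p: Sequence[float], q: Sequence[float]) -> float:
    """EMD between two 1D histograms with equal total mass and unit bin spacing."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    # Mass that must cross each bin boundary equals the difference of the CDFs there.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


moved_two_bins = emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])  # one unit moved two bins
identical = emd_1d([0.5, 0.5], [0.5, 0.5])                 # nothing to move
```

Unlike mean squared error, the result grows with how far the mass has to travel, which is why it suits comparisons of spatial energy distributions.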

preprint · 2022 · arXiv

Dark Sector Physics at High-Intensity Experiments

Is Dark Matter part of a Dark Sector? The possibility of a dark sector neutral under Standard Model (SM) forces furnishes an attractive explanation for the existence of Dark Matter (DM), and is a compelling new-physics direction to explore in its own right, with potential relevance to fundamental questions as varied as neutrino masses, the hierarchy problem, and the Universe's matter-antimatter asymmetry. Because dark sectors are generically weakly coupled to ordinary matter, and because they can naturally have MeV-to-GeV masses and respect the symmetries of the SM, they are only mildly constrained by high-energy collider data and precision atomic measurements. Yet upcoming and proposed intensity-frontier experiments will offer an unprecedented window into the physics of dark sectors, highlighted as a Priority Research Direction in the 2018 Dark Matter New Initiatives (DMNI) BRN report. Support for this program -- in the form of dark-sector analyses at multi-purpose experiments, realization of the intensity-frontier experiments receiving DMNI funds, an expansion of DMNI support to explore the full breadth of DM and visible final-state signatures (especially long-lived particles) called for in the BRN report, and support for a robust dark-sector theory effort -- will enable comprehensive exploration of low-mass thermal DM milestones, and greatly enhance the potential of intensity-frontier experiments to discover dark-sector particles decaying back to SM particles.

preprint · 2022 · arXiv

DarkQuest: A dark sector upgrade to SpinQuest at the 120 GeV Fermilab Main Injector

Expanding the mass range and techniques by which we search for dark matter is an important part of the worldwide particle physics program. Accelerator-based searches for dark matter and dark sector particles are a uniquely compelling part of this program as a way to both create and detect dark matter in the laboratory and explore the dark sector by searching for mediators and excited dark matter particles. This paper focuses on developing the DarkQuest experimental concept and gives an outlook on related enhancements collectively referred to as LongQuest. DarkQuest is a proton fixed-target experiment with leading sensitivity to an array of visible dark sector signatures in the MeV-GeV mass range. Because it builds off of existing accelerator and detector infrastructure, it offers a powerful but modest-cost experimental initiative that can be realized on a short timescale.

preprint · 2022 · arXiv

Experiments and Facilities for Accelerator-Based Dark Sector Searches

This paper provides an overview of experiments and facilities for accelerator-based dark matter searches as part of the US Community Study on the Future of Particle Physics (Snowmass 2021). Companion white papers to this paper present the physics drivers: thermal dark matter, visible dark portals, and new flavors and rich dark sectors.

preprint · 2022 · arXiv

FastML Science Benchmarks: Accelerating Real-Time Scientific Edge Machine Learning

Applications of machine learning (ML) are growing by the day for many unique and challenging scientific applications. However, a crucial challenge facing these applications is their need for ultra low-latency and on-detector ML capabilities. Given the slowdown in Moore's law and Dennard scaling, coupled with the rapid advances in scientific instrumentation that are resulting in growing data rates, there is a need for ultra-fast ML at the extreme edge. Fast ML at the edge is essential for reducing and filtering scientific data in real time to accelerate science experimentation and enable more profound insights. To accelerate real-time scientific edge ML hardware and software solutions, we need well-constrained benchmark tasks with enough specifications to be generically applicable and accessible. These benchmarks can guide the design of future edge ML hardware for scientific applications capable of meeting the nanosecond and microsecond level latency requirements. To this end, we present an initial set of scientific ML benchmarks, covering a variety of ML and embedded system techniques.

preprint · 2022 · arXiv

Jets and Jet Substructure at Future Colliders

Even though jet substructure was not an original design consideration for the Large Hadron Collider (LHC) experiments, it has emerged as an essential tool for the current physics program. We examine the role of jet substructure on the motivation for and design of future energy frontier colliders. In particular, we discuss the need for a vibrant theory and experimental research and development program to extend jet substructure physics into the new regimes probed by future colliders. Jet substructure has organically evolved with a close connection between theorists and experimentalists and has catalyzed exciting innovations in both communities. We expect such developments will play an important role in the future energy frontier physics program.

preprint · 2022 · arXiv

Open-source FPGA-ML codesign for the MLPerf Tiny Benchmark

We present our development experience and recent results for the MLPerf Tiny Inference Benchmark on field-programmable gate array (FPGA) platforms. We use the open-source hls4ml and FINN workflows, which aim to democratize AI-hardware codesign of optimized neural networks on FPGAs. We present the design and implementation process for the keyword spotting, anomaly detection, and image classification benchmark tasks. The resulting hardware implementations are quantized, configurable, spatial dataflow architectures tailored for speed and efficiency and introduce new generic optimizations and common workflows developed as a part of this work. The full workflow is presented from quantization-aware training to FPGA implementation. The solutions are deployed on system-on-chip (Pynq-Z2) and pure FPGA (Arty A7-100T) platforms. The resulting submissions achieve latencies as low as 20 $μ$s and energy consumption as low as 30 $μ$J per inference. We demonstrate how emerging ML benchmarks on heterogeneous hardware platforms can catalyze collaboration and the development of new techniques and more accessible tools.

preprint · 2022 · arXiv

Physics Community Needs, Tools, and Resources for Machine Learning

Machine learning (ML) is becoming an increasingly important component of cutting-edge physics research, but its computational requirements present significant challenges. In this white paper, we discuss the needs of the physics community regarding ML across latency and throughput regimes, the tools and resources that offer the possibility of addressing these needs, and how these can be best utilized and accessed in the coming years.

preprint · 2022 · arXiv

Physics Opportunities for the Fermilab Booster Replacement

This white paper presents opportunities afforded by the Fermilab Booster Replacement and its various options. Its goal is to inform the design process of the Booster Replacement about the accelerator needs of the various options, allowing the design to be versatile and enable, or leave the door open to, as many options as possible. The physics themes covered by the paper include searches for dark sectors and new opportunities with muons.

preprint · 2022 · arXiv

QONNX: Representing Arbitrary-Precision Quantized Neural Networks

We present extensions to the Open Neural Network Exchange (ONNX) intermediate representation format to represent arbitrary-precision quantized neural networks. We first introduce support for low precision quantization in existing ONNX-based quantization formats by leveraging integer clipping, resulting in two new backward-compatible variants: the quantized operator format with clipping and quantize-clip-dequantize (QCDQ) format. We then introduce a novel higher-level ONNX format called quantized ONNX (QONNX) that introduces three new operators -- Quant, BipolarQuant, and Trunc -- in order to represent uniform quantization. By keeping the QONNX IR high-level and flexible, we enable targeting a wider variety of platforms. We also present utilities for working with QONNX, as well as examples of its usage in the FINN and hls4ml toolchains. Finally, we introduce the QONNX model zoo to share low-precision quantized neural networks.
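
The idea of expressing lower precision through integer clipping can be illustrated with a generic uniform quantizer: scale, round to an integer, clip to the range of the chosen bit width, then map back. This is a minimal sketch under assumed conventions and does not reproduce the exact semantics of the Quant, BipolarQuant, or Trunc operators; the function name, argument names, and rounding mode are illustrative.

```python
def quantize(x: float, scale: float, zero_point: int, bits: int, signed: bool = True) -> float:
    """Uniform quantize-clip-dequantize of a single value (illustrative only)."""
    if signed:
        q_min, q_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    else:
        q_min, q_max = 0, 2 ** bits - 1
    q = round(x / scale + zero_point)   # map to the integer grid (Python rounds half to even)
    q = max(q_min, min(q_max, q))       # clipping enforces the bit width
    return (q - zero_point) * scale     # map back to the real-valued grid


in_range = quantize(0.3, scale=0.25, zero_point=0, bits=4)     # nearest grid point
clipped_4b = quantize(10.0, scale=0.25, zero_point=0, bits=4)  # saturates at 7 * scale
clipped_3b = quantize(10.0, scale=0.25, zero_point=0, bits=3)  # saturates at 3 * scale
```

Changing only `bits` narrows the clipping range while the grid spacing stays fixed, which is how a wider integer format can carry a lower-precision value.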

preprint · 2022 · arXiv

Smart sensors using artificial intelligence for on-detector electronics and ASICs

Cutting edge detectors push sensing technology by further improving spatial and temporal resolution, increasing detector area and volume, and generally reducing backgrounds and noise. This has led to an explosion of data being generated in next-generation experiments. Therefore, the need for near-sensor processing at the data source with more powerful algorithms is becoming increasingly important to more efficiently capture the right experimental data, reduce downstream system complexity, and enable faster and lower-power feedback loops. In this paper, we discuss the motivations and potential applications for on-detector AI. Furthermore, the unique requirements of particle physics can drive the development of novel AI hardware and design tools. We describe existing modern work for particle physics in this area. Finally, we outline a number of areas of opportunity where we can advance machine learning techniques, codesign workflows, and future microelectronics technologies which will accelerate design, performance, and implementations for next generation experiments.

preprint · 2021 · arXiv

Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics

Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than 1 $μ\mathrm{s}$ on an FPGA. To do so, we consider a representative task associated with particle reconstruction and identification in a next-generation calorimeter operating at a particle collider. We use a graph network architecture developed for such purposes, and apply additional simplifications to match the computing constraints of Level-1 trigger systems, including weight quantization. Using the $\mathtt{hls4ml}$ library, we convert the compressed models into firmware to be implemented on an FPGA. Performance of the synthesized models is presented both in terms of inference accuracy and resource usage.

preprint · 2020 · arXiv

Fast inference of Boosted Decision Trees in FPGAs for particle physics

We describe the implementation of Boosted Decision Trees in the hls4ml library, which allows the translation of a trained model into FPGA firmware through an automated conversion process. Thanks to its fully on-chip implementation, hls4ml performs inference of Boosted Decision Tree models with extremely low latency. With a typical latency less than 100 ns, this solution is suitable for FPGA-based real-time processing, such as in the Level-1 Trigger system of a collider experiment. These developments open up prospects for physicists to deploy BDTs in FPGAs for identifying the origin of jets, better reconstructing the energies of muons, and enabling better selection of rare signal processes.
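
A boosted decision tree score is a sum of independent per-tree outputs, each reached by a short chain of threshold comparisons. The sketch below shows that evaluation in plain Python; it is not the hls4ml implementation, and the tree encoding, thresholds, and leaf values are hypothetical. Because the trees do not depend on one another, they are natural candidates for parallel evaluation in hardware.

```python
from typing import Sequence, Tuple, Union

# A node is either a leaf score (float) or (feature_index, threshold, left, right).
Node = Union[float, Tuple[int, float, "Node", "Node"]]


def eval_tree(node: Node, x: Sequence[float]) -> float:
    """Walk one tree: go left when the feature is <= threshold, else right."""
    while not isinstance(node, float):
        feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node


def eval_bdt(trees: Sequence[Node], x: Sequence[float]) -> float:
    """Ensemble score: the sum of the individual tree outputs."""
    return sum(eval_tree(tree, x) for tree in trees)


# Two small hypothetical trees over a two-feature input.
trees = [
    (0, 0.5, -1.0, 1.0),
    (1, 2.0, (0, 0.2, -0.5, 0.0), 0.5),
]
score_a = eval_bdt(trees, [0.7, 1.0])
score_b = eval_bdt(trees, [0.1, 3.0])
```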

preprint · 2020 · arXiv

Improving Long Handwritten Text Line Recognition with Convolutional Multi-way Associative Memory

Convolutional Recurrent Neural Networks (CRNNs) excel at scene text recognition. Unfortunately, they are likely to suffer from vanishing/exploding gradient problems when processing long text images, which are commonly found in scanned documents. This poses a major challenge to the goal of completely solving the Optical Character Recognition (OCR) problem. Inspired by recently proposed memory-augmented neural networks (MANNs) for long-term sequential modeling, we present a new architecture dubbed Convolutional Multi-way Associative Memory (CMAM) to tackle the limitation of current CRNNs. By leveraging recent memory accessing mechanisms in MANNs, our architecture demonstrates superior performance against other CRNN counterparts in three real-world long text OCR datasets.

preprint · 2020 · arXiv

Lepton-Nucleus Cross Section Measurements for DUNE with the LDMX Detector

We point out that the LDMX (Light Dark Matter eXperiment) detector design, conceived to search for sub-GeV dark matter, will also have very advantageous characteristics to pursue electron-nucleus scattering measurements of direct relevance to the neutrino program at DUNE and elsewhere. These characteristics include a 4-GeV electron beam, a precision tracker, electromagnetic and hadronic calorimeters with near 2$π$ azimuthal acceptance from the forward beam axis out to $\sim$40$^\circ$ angle, and low reconstruction energy threshold. LDMX thus could provide (semi)exclusive cross section measurements, with detailed information about final-state electrons, pions, protons, and neutrons. We compare the predictions of two widely used neutrino generators (GENIE, GiBUU) in the LDMX region of acceptance to illustrate the large modeling discrepancies in electron-nucleus interactions at DUNE-like kinematics. We argue that discriminating between these predictions is well within the capabilities of the LDMX detector.

preprint · 2018 · arXiv

Fast inference of deep neural networks in FPGAs for particle physics

Recent results at the Large Hadron Collider (LHC) have pointed to enhanced physics capabilities through the improvement of the real-time event processing techniques. Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of the use of such techniques in low-latency, low-power FPGA hardware has only just begun. FPGA-based trigger and data acquisition (DAQ) systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable, among many other physics scenarios, searches for new dark sector particles and novel measurements of the Higgs boson. While we focus on a specific example, the lessons are far-reaching. We develop a package based on High-Level Synthesis (HLS) called hls4ml to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to identify the problems in particle physics that would benefit from performing neural network inference with FPGAs. For our example jet substructure model, we fit well within the available resources of modern FPGAs with a latency on the scale of 100 ns.

preprint · 2016 · arXiv

A Fast Algorithm for Solving Scalar Wave Scattering Problem by Billions of Particles

Scalar wave scattering by many small particles of arbitrary shapes with impedance boundary condition is studied. The problem is solved asymptotically and numerically under the assumptions $a \ll d \ll \lambda$, where $k = 2\pi/\lambda$ is the wave number, $\lambda$ is the wavelength, $a$ is the characteristic size of the particles, and $d$ is the smallest distance between neighboring particles. A fast algorithm for solving this wave scattering problem by billions of particles is presented. The algorithm comprises the derivation of the (ORI) linear system and makes use of the Conjugate Orthogonal Conjugate Gradient method and the Fast Fourier Transform. Numerical solutions of the scalar wave scattering problem with 1, 4, 7, and 10 billion small impedance particles are achieved for the first time. In these numerical examples, the problem of creating a material with negative refraction coefficient is also described and a recipe for creating materials with a desired refraction coefficient is tested.

preprint · 2016 · arXiv

Dissecting Jets and Missing Energy Searches Using $n$-body Extended Simplified Models

Simplified Models are a useful way to characterize new physics scenarios for the LHC. Particle decays are often represented using non-renormalizable operators that involve the minimal number of fields required by symmetries. Generalizing to a wider class of decay operators allows one to model a variety of final states. This approach, which we dub the $n$-body extension of Simplified Models, provides a unifying treatment of the signal phase space resulting from a variety of signals. In this paper, we present the first application of this framework in the context of multijet plus missing energy searches. The main result of this work is a global performance study with the goal of identifying which set of observables yields the best discriminating power against the largest Standard Model backgrounds for a wide range of signal jet multiplicities. Our analysis compares combinations of one, two and three variables, placing emphasis on the enhanced sensitivity gain resulting from non-trivial correlations. Utilizing boosted decision trees, we compare and classify the performance of missing energy, energy scale and energy structure observables. We demonstrate that including an observable from each of these three classes is required to achieve optimal performance. This work additionally serves to establish the utility of $n$-body extended Simplified Models as a diagnostic for unpacking the relative merits of different search strategies, thereby motivating their application to new physics signatures beyond jets and missing energy.

preprint · 2016 · arXiv

Thinking outside the ROCs: Designing Decorrelated Taggers (DDT) for jet substructure

We explore the scale-dependence and correlations of jet substructure observables to improve upon existing techniques in the identification of highly Lorentz-boosted objects. Modified observables are designed to remove correlations from existing theoretically well-understood observables, providing practical advantages for experimental measurements and searches for new phenomena. We study such observables in $W$ jet tagging and provide recommendations for observables based on considerations beyond signal and background efficiencies.

preprint · 2014 · arXiv

Pileup Per Particle Identification

We propose a new method for pileup mitigation by implementing "pileup per particle identification" (PUPPI). For each particle we first define a local shape $α$ which probes the collinear versus soft diffuse structure in the neighborhood of the particle. The former is indicative of particles originating from the hard scatter and the latter of particles originating from pileup interactions. The distribution of $α$ for charged pileup, assumed as a proxy for all pileup, is used on an event-by-event basis to calculate a weight for each particle. The weights describe the degree to which particles are pileup-like and are used to rescale their four-momenta, superseding the need for jet-based corrections. Furthermore, the algorithm flexibly allows combination with other, possibly experimental, probabilistic information associated with particles such as vertexing and timing performance. We demonstrate the algorithm improves over existing methods by looking at jet $p_T$ and jet mass. We also find an improvement on non-jet quantities like missing transverse energy.
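
The abstract describes the per-particle weighting in words only, so the following sketch fills in concrete formulas as assumptions: the local shape is taken as the logarithm of a sum of neighbour $p_T$ divided by angular distance within a cone, and the weight as the one-degree-of-freedom chi-square cumulative probability of the upward deviation from the charged-pileup median. The cone sizes, function names, and example particles are hypothetical and are not taken from the source.

```python
import math
from typing import List, Optional, Tuple

Particle = Tuple[float, float, float]  # (pt, eta, phi)


def local_shape(i: int, particles: List[Particle],
                r_min: float = 0.02, r0: float = 0.3) -> Optional[float]:
    """Assumed form: log of sum over neighbours j of pt_j / dR_ij, r_min <= dR_ij <= r0."""
    _, eta_i, phi_i = particles[i]
    total = 0.0
    for j, (pt_j, eta_j, phi_j) in enumerate(particles):
        if j == i:
            continue
        dphi = (phi_i - phi_j + math.pi) % (2.0 * math.pi) - math.pi  # wrap to [-pi, pi)
        dr = math.hypot(eta_i - eta_j, dphi)
        if r_min <= dr <= r0:
            total += pt_j / dr
    return math.log(total) if total > 0.0 else None  # None: no neighbours in the cone


def pileup_weight(alpha: float, pileup_median: float, pileup_rms: float) -> float:
    """Assumed form: chi-square (1 dof) CDF of the upward deviation from the median."""
    if alpha <= pileup_median:
        return 0.0  # at or below the pileup median: treated as fully pileup-like
    chi2 = (alpha - pileup_median) ** 2 / pileup_rms ** 2
    return math.erf(math.sqrt(chi2 / 2.0))  # CDF of chi-square with one degree of freedom


particles = [(10.0, 0.0, 0.0), (5.0, 0.1, 0.0), (1.0, 3.0, 2.0)]
alpha_0 = local_shape(0, particles)         # one neighbour at dR = 0.1
alpha_isolated = local_shape(2, particles)  # no neighbours within the cone
```

The resulting weight, between 0 and 1, would then multiply the particle's four-momentum, as the abstract describes.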

preprint · 2013 · arXiv

Scrutinizing the Higgs Signal and Background in the $2e2μ$ Golden Channel

Kinematic distributions in the decays of the newly discovered resonance to four leptons are a powerful test of the tensor structure of its couplings to electroweak gauge bosons. We present an analytic calculation for both signal and background of the fully differential cross section for the 'Golden Channel' $e^+e^-μ^+μ^-$ final state. We include all interference effects between intermediate gauge bosons and allow them to be on- or off-shell. For the signal we compute the fully differential cross section for general scalar couplings to $ZZ$, $γγ$, and $Zγ$. For the background we compute the leading order fully differential cross section for the dominant contribution coming from $q\bar{q}$ annihilation into $Z$ and $γ$ gauge bosons, including the contribution from the resonant $Z\rightarrow 2e2μ$ process. We also present singly and doubly differential projections and study the interference effects on the differential spectra. These expressions can be used in a variety of ways to uncover the nature of the newly discovered resonance or any new scalars decaying to neutral gauge bosons which might be discovered in the future.

preprint · 2011 · arXiv

Observation of scalar nuclear spin-spin coupling in van der Waals molecules

Scalar couplings between covalently bound nuclear spins are a ubiquitous feature in nuclear magnetic resonance (NMR) experiments, imparting valuable information to NMR spectra regarding molecular structure and conformation. Such couplings arise due to a second-order hyperfine interaction, and, in principle, the same mechanism should lead to scalar couplings between nuclear spins in unbound van der Waals complexes. Here, we report the first observation of scalar couplings between nuclei in van der Waals molecules. Our measurements are performed in a solution of hyperpolarized ${\rm ^{129}Xe}$ and pentane, using superconducting quantum interference devices to detect NMR in 10 mG fields, and are in good agreement with calculations based on density functional theory. van der Waals forces play an important role in many physical phenomena, and hence the techniques presented here may provide a new method for probing such interactions.
