Researcher profile

Shih-Chieh Hsu

Shih-Chieh Hsu contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
21 works
0 followers
11 topics
4 close collaborators

Published work

21 published items

preprint, 2022, arXiv

Data Science and Machine Learning in Education

The growing role of data science (DS) and machine learning (ML) in high-energy physics (HEP) is well established and pertinent given the complex detectors, large data sets, and sophisticated analyses at the heart of HEP research. Moreover, exploiting symmetries inherent in physics data has inspired physics-informed ML as a vibrant sub-field of computer science research. HEP researchers benefit greatly from widely available materials for use in education, training and workforce development. They are also contributing to these materials and providing software to DS/ML-related fields. Increasingly, physics departments are offering courses at the intersection of DS, ML and physics, often using curricula developed by HEP researchers and involving open software and data used in HEP. In this white paper, we explore synergies between HEP research and DS/ML education, discuss opportunities and challenges at this intersection, and propose community activities that will be mutually beneficial.

preprint, 2022, arXiv

Exploring the Universality of Hadronic Jet Classification

The modeling of jet substructure significantly differs between Parton Shower Monte Carlo (PSMC) programs. Despite this, we observe that machine learning classifiers trained on different PSMCs learn nearly the same function. This means that when these classifiers are applied to the same PSMC for testing, they result in nearly the same performance. This classifier universality indicates that a machine learning model trained on one simulation and tested on another simulation (or data) will likely be optimal. Our observations are based on detailed studies of shallow and deep neural networks applied to simulated Lorentz boosted Higgs jet tagging at the LHC.

preprint, 2022, arXiv

Graph Neural Networks for Charged Particle Tracking on FPGAs

The determination of charged particle trajectories in collisions at the CERN Large Hadron Collider (LHC) is an important but challenging problem, especially in the high interaction density conditions expected during the future high-luminosity phase of the LHC (HL-LHC). Graph neural networks (GNNs) are a type of geometric deep learning algorithm that has successfully been applied to this task by embedding tracker data as a graph -- nodes represent hits, while edges represent possible track segments -- and classifying the edges as true or fake track segments. However, their study in hardware- or software-based trigger applications has been limited due to their large computational cost. In this paper, we introduce an automated translation workflow, integrated into a broader tool called $\texttt{hls4ml}$, for converting GNNs into firmware for field-programmable gate arrays (FPGAs). We use this translation tool to implement GNNs for charged particle tracking, trained using the TrackML challenge dataset, on FPGAs with designs targeting different graph sizes, task complexities, and latency/throughput requirements. This work could enable the inclusion of charged particle tracking GNNs at the trigger level for HL-LHC experiments.

preprint, 2022, arXiv

Learning to Identify Semi-Visible Jets

We train a network to identify jets with fractional dark decay (semi-visible jets) using the pattern of their low-level jet constituents, and explore the nature of the information used by the network by mapping it to a space of jet substructure observables. Semi-visible jets arise from dark matter particles which decay into a mixture of dark sector (invisible) and Standard Model (visible) particles. Such objects are challenging to identify due to the complex nature of jets and the alignment of the momentum imbalance from the dark particles with the jet axis, but such jets do not yet benefit from the construction of dedicated theoretically-motivated jet substructure observables. A deep network operating on jet constituents is used as a probe of the available information and indicates that classification power not captured by current high-level observables arises primarily from low-$p_\textrm{T}$ jet constituents.

preprint, 2022, arXiv

Open-source FPGA-ML codesign for the MLPerf Tiny Benchmark

We present our development experience and recent results for the MLPerf Tiny Inference Benchmark on field-programmable gate array (FPGA) platforms. We use the open-source hls4ml and FINN workflows, which aim to democratize AI-hardware codesign of optimized neural networks on FPGAs. We present the design and implementation process for the keyword spotting, anomaly detection, and image classification benchmark tasks. The resulting hardware implementations are quantized, configurable, spatial dataflow architectures tailored for speed and efficiency and introduce new generic optimizations and common workflows developed as a part of this work. The full workflow is presented from quantization-aware training to FPGA implementation. The solutions are deployed on system-on-chip (Pynq-Z2) and pure FPGA (Arty A7-100T) platforms. The resulting submissions achieve latencies as low as 20 $μ$s and energy consumption as low as 30 $μ$J per inference. We demonstrate how emerging ML benchmarks on heterogeneous hardware platforms can catalyze collaboration and the development of new techniques and more accessible tools.

preprint, 2022, arXiv

Permutationless Many-Jet Event Reconstruction with Symmetry Preserving Attention Networks

Top quarks, produced in large numbers at the Large Hadron Collider, have a complex detector signature and require special reconstruction techniques. The most common decay mode, the "all-jet" channel, results in a 6-jet final state which is particularly difficult to reconstruct in $pp$ collisions due to the large number of permutations possible. We present a novel approach to this class of problem, based on neural networks using a generalized attention mechanism, that we call Symmetry Preserving Attention Networks (SPA-Net). We train one such network to identify the decay products of each top quark unambiguously and without combinatorial explosion as an example of the power of this technique. This approach significantly outperforms existing state-of-the-art methods, correctly assigning all jets in $93.0\%$ of $6$-jet, $87.8\%$ of $7$-jet, and $82.6\%$ of $\geq 8$-jet events respectively.
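
To give a feel for the "large number of permutations" mentioned in this abstract, here is a minimal brute-force sketch that counts the physically distinct jet assignments. It assumes the standard all-jet topology, in which each top quark yields one b-jet plus two jets from its W boson, so swapping the two W jets or swapping the two tops does not produce a new assignment. The function name `count_assignments` is hypothetical and is not part of SPA-Net; the network described above is designed to avoid exactly this kind of enumeration.

```python
from itertools import permutations


def count_assignments(n_jets):
    """Count distinct ways to assign jets to two hadronic top decays.

    Each top is modelled as (b-jet, unordered pair of W jets), and the
    two tops are themselves unordered. Surplus jets are left unassigned.
    """
    seen = set()
    for b1, q1a, q1b, b2, q2a, q2b in permutations(range(n_jets), 6):
        top1 = (b1, frozenset((q1a, q1b)))  # W-daughter order is irrelevant
        top2 = (b2, frozenset((q2a, q2b)))
        seen.add(frozenset((top1, top2)))   # top-quark order is irrelevant
    return len(seen)


if __name__ == "__main__":
    for n in (6, 7, 8):
        print(n, count_assignments(n))
```

For exactly six jets this gives 6!/(2·2·2) = 90 candidate assignments, and the count grows quickly as more jets are present in the event.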

preprint, 2022, arXiv

Physics Community Needs, Tools, and Resources for Machine Learning

Machine learning (ML) is becoming an increasingly important component of cutting-edge physics research, but its computational requirements present significant challenges. In this white paper, we discuss the needs of the physics community regarding ML across latency and throughput regimes, the tools and resources that offer the possibility of addressing these needs, and how these can be best utilized and accessed in the coming years.

preprint, 2022, arXiv

QONNX: Representing Arbitrary-Precision Quantized Neural Networks

We present extensions to the Open Neural Network Exchange (ONNX) intermediate representation format to represent arbitrary-precision quantized neural networks. We first introduce support for low precision quantization in existing ONNX-based quantization formats by leveraging integer clipping, resulting in two new backward-compatible variants: the quantized operator format with clipping and quantize-clip-dequantize (QCDQ) format. We then introduce a novel higher-level ONNX format called quantized ONNX (QONNX) that introduces three new operators -- Quant, BipolarQuant, and Trunc -- in order to represent uniform quantization. By keeping the QONNX IR high-level and flexible, we enable targeting a wider variety of platforms. We also present utilities for working with QONNX, as well as examples of its usage in the FINN and hls4ml toolchains. Finally, we introduce the QONNX model zoo to share low-precision quantized neural networks.
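
The quantize-clip-dequantize idea named in this abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: uniform quantization with a scalar scale and zero point, a signed integer grid, and a clip that narrows the range to the target bit width. The helper `qcdq` and its parameters are hypothetical and do not correspond to any ONNX or QONNX operator signature.

```python
def qcdq(x, scale, zero_point=0, bits=4):
    """Quantize x, clip to a signed `bits`-wide range, then dequantize.

    Hypothetical helper for illustration only; not an ONNX/QONNX API.
    """
    # Quantize: map the real value onto the integer grid.
    # Note: Python's round() rounds exact ties to the nearest even integer.
    q = round(x / scale) + zero_point
    # Clip: restrict to the range of a signed integer with `bits` bits,
    # which emulates a lower precision than the carrier integer type.
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = max(lo, min(hi, q))
    # Dequantize: map back to a real value.
    return (q - zero_point) * scale


if __name__ == "__main__":
    print(qcdq(0.7, scale=0.25))   # in range: snaps to the nearest grid point
    print(qcdq(5.0, scale=0.25))   # out of range: saturates at the top of the 4-bit range
```

With `scale=0.25` and `bits=4`, representable values run from -2.0 to 1.75 in steps of 0.25, so inputs outside that interval saturate at the nearest end.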

preprint, 2022, arXiv

Quantum Machine Learning with SQUID

In this work we present the Scaled QUantum IDentifier (SQUID), an open-source framework for exploring hybrid Quantum-Classical algorithms for classification problems. The classical infrastructure is based on PyTorch and we provide a standardized design to implement a variety of quantum models with the capability of back-propagation for efficient training. We present the structure of our framework and provide examples of using SQUID in a standard binary classification problem from the popular MNIST dataset. In particular, we highlight the implications for scalability for gradient-based optimization of quantum models on the choice of output for variational quantum models.

preprint, 2022, arXiv

Sensitivity on Two-Higgs-Doublet Models from Higgs-Pair Production via $b\bar{b}b\bar{b}$ Final State

Higgs boson pair production is well known to probe the structure of the electroweak symmetry breaking sector. Using the gluon-fusion process $pp \to H \to h h \to (b\bar b) (b\bar b)$ in the framework of two-Higgs-doublet models, we illustrate how a machine learning approach (a three-stream convolutional neural network) can substantially improve the signal-background discrimination and thus the sensitivity coverage of the relevant parameter space. We show that this $gg \to hh \to b \bar b b\bar b$ process can further probe the parameter space currently allowed by HiggsSignals and HiggsBounds at the HL-LHC. The results for Types I to IV are shown.

preprint, 2022, arXiv

Study of Electroweak Phase Transition in Exotic Higgs Decays at the CEPC

A strong first-order electroweak phase transition (EWPT) can be induced by light new physics weakly coupled to the Higgs. This study focuses on a scenario in which the first-order EWPT is driven by a light scalar $s$ with a mass between 15 and 60 GeV. A search for exotic decays of the Higgs boson into a pair of spin-zero particles, $h \to ss$, where the $s$ boson decays promptly into $b$-quarks, is presented. The search is performed in events where the Higgs boson is produced in association with a $Z$ boson, giving rise to a signature of two charged leptons (electrons or muons) and multiple jets from $b$-quark decays. The analysis considers a scenario with 5000 fb$^{-1}$ of $e^+ e^-$ collision data at $\sqrt{s} = 240$ GeV from the Circular Electron Positron Collider (CEPC). This study with the $4b$ final state conclusively tests the expected sensitivity of probing the light scalars in the CEPC experiment. The sensitivity reach is significantly larger than that which can be achieved at the LHC.

preprint, 2022, arXiv

The tracking detector of the FASER experiment

FASER is a new experiment designed to search for new light weakly-interacting long-lived particles (LLPs) and study high-energy neutrino interactions in the very forward region of the LHC collisions at CERN. The experimental apparatus is situated 480 m downstream of the ATLAS interaction-point aligned with the beam collision axis. The FASER detector includes four identical tracker stations constructed from silicon microstrip detectors. Three of the tracker stations form a tracking spectrometer, and enable FASER to detect the decay products of LLPs decaying inside the apparatus, whereas the fourth station is used for the neutrino analysis. The spectrometer has been installed in the LHC complex since March 2021, while the fourth station is not yet installed. FASER will start physics data taking when the LHC resumes operation in early 2022. This paper describes the design, construction and testing of the tracking spectrometer, including the associated components such as the mechanics, readout electronics, power supplies and cooling system.

preprint, 2022, arXiv

The trigger and data acquisition system of the FASER experiment

The FASER experiment is a new small and inexpensive experiment that is placed 480 meters downstream of the ATLAS experiment at the CERN LHC. FASER is designed to capture decays of new long-lived particles, produced outside of the ATLAS detector acceptance. These rare particles can decay in the FASER detector together with about 500-1000 Hz of other particles originating from the ATLAS interaction point. A very high efficiency trigger and data acquisition system is required to ensure that the physics events of interest will be recorded. This paper describes the trigger and data acquisition system of the FASER experiment and presents performance results of the system acquired during initial commissioning.

preprint, 2022, arXiv

Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml

Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers -- long short-term memory and gated recurrent unit -- within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider.

preprint, 2021, arXiv

Disentangling Boosted Higgs Boson Production Modes with Machine Learning

Higgs Bosons produced via gluon-gluon fusion (ggF) with large transverse momentum ($p_T$) are sensitive probes of physics beyond the Standard Model. However, high $p_T$ Higgs Boson production is contaminated by a diversity of production modes other than ggF: vector boson fusion, production of a Higgs boson in association with a vector boson, and production of a Higgs boson with a top-quark pair. Combining jet substructure and event information with modern machine learning, we demonstrate the ability to focus on particular production modes. These tools hold great discovery potential for boosted Higgs bosons produced via ggF and may also provide additional information about the Higgs Boson sector of the Standard Model in extreme phase space regions for other production modes as well.

preprint, 2020, arXiv

An Update to the Letter of Intent for MATHUSLA: Search for Long-Lived Particles at the HL-LHC

We report on recent progress in the design of the proposed MATHUSLA Long Lived Particle (LLP) detector for the HL-LHC, updating the information in the original Letter of Intent (LoI), see CDS:LHCC-I-031, arXiv:1811.00927. A suitable site has been identified at LHC Point 5 that is closer to the CMS Interaction Point (IP) than assumed in the LoI. The decay volume has been increased from 20 m to 25 m in height. Engineering studies have been made in order to locate much of the decay volume below ground, bringing the detector even closer to the IP. With these changes, a 100 m x 100 m detector has the same physics reach for large c$τ$ as the 200 m x 200 m detector described in the LoI and other studies. The performance for small c$τ$ is improved because of the proximity to the IP. Detector technology has also evolved while retaining the strip-like sensor geometry in Resistive Plate Chambers (RPC) described in the LoI. The present design uses extruded scintillator bars read out using wavelength shifting fibers and silicon photomultipliers (SiPM). Operations will be simpler and more robust with much lower operating voltages and without the use of greenhouse gases. Manufacturing is straightforward and should result in cost savings. Understanding of backgrounds has also significantly advanced, thanks to new simulation studies and measurements taken at the MATHUSLA test stand operating above ATLAS in 2018. We discuss next steps for the MATHUSLA collaboration, and identify areas where new members can make particularly important contributions.

preprint, 2020, arXiv

Detecting and Studying High-Energy Collider Neutrinos with FASER at the LHC

Neutrinos are copiously produced at particle colliders, but no collider neutrino has ever been detected. Colliders, and particularly hadron colliders, produce both neutrinos and anti-neutrinos of all flavors at very high energies, and they are therefore highly complementary to those from other sources. FASER, the recently approved Forward Search Experiment at the Large Hadron Collider, is ideally located to provide the first detection and study of collider neutrinos. We investigate the prospects for neutrino studies of a proposed component of FASER, FASER$ν$, a 25cm x 25cm x 1.35m emulsion detector to be placed directly in front of the FASER spectrometer in tunnel TI12. FASER$ν$ consists of 1000 layers of emulsion films interleaved with 1-mm-thick tungsten plates, with a total tungsten target mass of 1.2 tons. We estimate the neutrino fluxes and interaction rates at FASER$ν$, describe the FASER$ν$ detector, and analyze the characteristics of the signals and primary backgrounds. For an integrated luminosity of 150 fb$^{-1}$ to be collected during Run 3 of the 14 TeV Large Hadron Collider from 2021-23, and assuming standard model cross sections, approximately 1300 electron neutrinos, 20,000 muon neutrinos, and 20 tau neutrinos will interact in FASER$ν$, with mean energies of 600 GeV to 1 TeV, depending on the flavor. With such rates and energies, FASER will measure neutrino cross sections at energies where they are currently unconstrained, will bound models of forward particle production, and could open a new window on physics beyond the standard model.

preprint, 2020, arXiv

HL-LHC Computing Review: Common Tools and Community Software

Common and community software packages, such as ROOT, Geant4 and event generators have been a key part of the LHC's success so far and continued development and optimisation will be critical in the future. The challenges are driven by an ambitious physics programme, notably the LHC accelerator upgrade to high-luminosity, HL-LHC, and the corresponding detector upgrades of ATLAS and CMS. In this document we address the issues for software that is used in multiple experiments (usually even more widely than ATLAS and CMS) and maintained by teams of developers who are either not linked to a particular experiment or who contribute to common software within the context of their experiment activity. We also give space to general considerations for future software and projects that tackle upcoming challenges, no matter who writes it, which is an area where community convergence on best practice is extremely useful.

preprint, 2020, arXiv

Search for a generic heavy Higgs at the LHC

A generic heavy Higgs has both dim-4 and effective dim-6 interactions with the Standard Model (SM) particles. The former has been the focus of LHC searches in all major Higgs production channels, just as the SM one, but with negative results so far. If the heavy Higgs is connected with Beyond Standard Model (BSM) physics at a few TeV scale, its dim-6 operators will play a very important role - they significantly enhance the Higgs momentum, and reduce the SM background in a special phase space corner to a level such that a heavy Higgs emerges, which is not possible with dim-4 operators only. We focus on the associated VH production channel, where the effect of dim-6 operators is the largest and the SM background is the lowest. Main search regions for this type of signal are identified, and substructure variables of boosted jets are employed to enhance the signal from backgrounds. The parameter space of these operators is scanned over, and expected exclusion regions with 300 fb$^{-1}$ and 3 ab$^{-1}$ LHC data are shown, if no BSM is present. The strategy given in this paper will shed light on a heavy Higgs which may be otherwise hiding in the present and future LHC data.

preprint, 2020, arXiv

Technical Proposal: FASERnu

FASERnu is a proposed small and inexpensive emulsion detector designed to detect collider neutrinos for the first time and study their properties. FASERnu will be located directly in front of FASER, 480 m from the ATLAS interaction point along the beam collision axis in the unused service tunnel TI12. From 2021-23 during Run 3 of the 14 TeV LHC, roughly 1,300 electron neutrinos, 20,000 muon neutrinos, and 20 tau neutrinos will interact in FASERnu with TeV-scale energies. With the ability to observe these interactions, reconstruct their energies, and distinguish flavors, FASERnu will probe the production, propagation, and interactions of neutrinos at the highest human-made energies ever recorded. The FASERnu detector will be composed of 1000 emulsion layers interleaved with tungsten plates. The total volume of the emulsion and tungsten is 25cm x 25cm x 1.35m, and the tungsten target mass is 1.2 tonnes. From 2021-23, 7 sets of emulsion layers will be installed, with replacement roughly every 20-50 1/fb in planned Technical Stops. In this document, we summarize FASERnu's physics goals and discuss the estimates of neutrino flux and interaction rates. We then describe the FASERnu detector in detail, including plans for assembly, transport, installation, and emulsion replacement, and procedures for emulsion readout and analyzing the data. We close with cost estimates for the detector components and infrastructure work and a timeline for the experiment.

preprint, 2017, arXiv

Telescoping jet substructure

We introduce a novel jet substructure method which exploits the variation of observables with respect to a sampling of phase-space boundaries quantified by the variability. We apply this technique to identify boosted W boson and top quark jets using telescoping subjets, which utilize information coming from both subjet topology and subjet substructure. We find excellent performance of the variability, in particular its robustness against finite detector resolution. The extension to telescoping jet grooming and other telescoping jet substructure observables is also straightforward. This method provides a new direction in heavy particle tagging and suggests a systematic approach to the decomposition of jet substructure.