Researcher profile

Kazuhiro Terao

Kazuhiro Terao contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
9 works
0 followers
9 topics
4 close collaborators

Published work

9 published items

preprint | 2022 | arXiv

Adversarial methods to reduce simulation bias in neutrino interaction event filtering at Liquid Argon Time Projection Chambers

For current and future neutrino oscillation experiments using large Liquid Argon Time Projection Chambers (LAr-TPCs), a key challenge is identifying neutrino interactions from the pervading cosmic-ray background. Rejection of such background is often possible using traditional cut-based selections, but this typically requires the prior use of computationally expensive reconstruction algorithms. This work demonstrates an alternative approach of using a 3D Submanifold Sparse Convolutional Network trained on low-level information from the scintillation light signal of interactions inside LAr-TPCs. This technique is applied to example simulations from ICARUS, the far detector of the Short Baseline Neutrino (SBN) program at Fermilab. The results of the network show that cosmic background is reduced by up to 76.3% whilst neutrino interaction selection efficiency remains over 98.9%. We further present a way to mitigate potential biases from imperfect input simulations by applying Domain Adversarial Neural Networks (DANNs), for which modified simulated samples are introduced to imitate real data and a small portion of them are used for adversarial training. A series of mock-data studies are performed and demonstrate the effectiveness of using DANNs to mitigate biases, showing neutrino interaction selection efficiency significantly better than that achieved without the adversarial training.
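DANNs are commonly implemented with a gradient reversal layer placed between the shared feature extractor and the domain classifier. The sketch below illustrates that general mechanism only; it is not taken from the paper, and the class name, the `lam` parameter, and the helper `feature_gradient` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GradientReversal:
    """Identity in the forward pass; multiplies gradients by -lam in the backward pass."""

    lam: float = 1.0  # hypothetical adversarial strength, not a value from the paper

    def forward(self, features: List[float]) -> List[float]:
        # Features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_from_domain_head: List[float]) -> List[float]:
        # Sign flip: the feature extractor is pushed to make the two domains
        # (simulation and data-like samples) harder to tell apart.
        return [-self.lam * g for g in grad_from_domain_head]


def feature_gradient(
    grad_task: List[float], grad_domain: List[float], grl: GradientReversal
) -> List[float]:
    """Total gradient at the shared features: task gradient plus reversed domain gradient."""
    return [t + d for t, d in zip(grad_task, grl.backward(grad_domain))]


if __name__ == "__main__":
    grl = GradientReversal(lam=0.5)
    print(feature_gradient([1.0, 1.0], [2.0, -4.0], grl))
```

In a real training loop the same idea would normally be expressed through the automatic differentiation hooks of a deep learning framework rather than by hand.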

preprint | 2022 | arXiv

An Efficient, Scalable IO Framework for Sparse Data: larcv3

Neutrino physics is one of the fundamental areas of research into the origins and properties of the Universe. Many experimental neutrino projects use sophisticated detectors to observe properties of these particles, and have turned to deep learning and artificial intelligence techniques to analyze their data. From this, we have developed larcv, a C++ and Python based framework for efficient IO of sparse data with particle physics applications in mind. We describe in this paper the larcv framework and some benchmark IO performance tests. larcv is designed to enable fast and efficient IO of ragged and irregular data, at scale on modern HPC systems, and is compatible with the most popular open source data analysis tools in the Python ecosystem.

preprint | 2022 | arXiv

Data Science and Machine Learning in Education

The growing role of data science (DS) and machine learning (ML) in high-energy physics (HEP) is well established and pertinent given the complex detectors, large data sets, and sophisticated analyses at the heart of HEP research. Moreover, exploiting symmetries inherent in physics data has inspired physics-informed ML as a vibrant sub-field of computer science research. HEP researchers benefit greatly from widely available materials for use in education, training and workforce development. They are also contributing to these materials and providing software to DS/ML-related fields. Increasingly, physics departments are offering courses at the intersection of DS, ML and physics, often using curricula developed by HEP researchers and involving open software and data used in HEP. In this white paper, we explore synergies between HEP research and DS/ML education, discuss opportunities and challenges at this intersection, and propose community activities that will be mutually beneficial.

preprint | 2022 | arXiv

Graph Neural Networks in Particle Physics: Implementations, Innovations, and Challenges

Many physical systems can be best understood as sets of discrete data with associated relationships. Where previously these sets of data have been formulated as series or image data to match the available machine learning architectures, with the advent of graph neural networks (GNNs), these systems can be learned natively as graphs. This allows a wide variety of high- and low-level physical features to be attached to measurements and, by the same token, a wide variety of HEP tasks to be accomplished by the same GNN architectures. GNNs have found powerful use-cases in reconstruction, tagging, generation and end-to-end analysis. With the widespread adoption of GNNs in industry, the HEP community is well-placed to benefit from rapid improvements in GNN latency and memory usage. However, industry use-cases are not perfectly aligned with HEP and much work needs to be done to best match unique GNN capabilities to unique HEP obstacles. We present here a range of these capabilities, with predictions of which are currently being widely adopted in HEP communities and which are still immature. We hope to capture the landscape of graph techniques in machine learning as well as point out the most significant gaps that are inhibiting potentially large leaps in research.

preprint | 2022 | arXiv

Solving Simulation Systematics in and with AI/ML

Training an AI/ML system on simulated data while using that system to infer on data from real detectors introduces a systematic error which is difficult to estimate and in many analyses is simply not confronted. It is crucial to minimize and to quantitatively estimate the uncertainties in such analysis and do so with a precision and accuracy that matches those that AI/ML techniques bring. Here we highlight the need to confront this class of systematic error, discuss conventional ways to estimate it and describe ways to quantify and to minimize the uncertainty using methods which are themselves based on the power of AI/ML. We also describe methods to introduce a simulation into an AI/ML network to allow for training of its semantically meaningful parameters. This whitepaper is a contribution to the Computational Frontier of Snowmass21.

preprint | 2021 | arXiv

Scalable, End-to-End, Deep-Learning-Based Data Reconstruction Chain for Particle Imaging Detectors

Recent inroads in Computer Vision (CV) and Machine Learning (ML) have motivated a new approach to the analysis of particle imaging detector data. Unlike previous efforts which tackled isolated CV tasks, this paper introduces an end-to-end, ML-based data reconstruction chain for Liquid Argon Time Projection Chambers (LArTPCs), the state-of-the-art in precision imaging at the intensity frontier of neutrino physics. The chain is a multi-task network cascade which combines voxel-level feature extraction using Sparse Convolutional Neural Networks and particle superstructure formation using Graph Neural Networks. Each algorithm incorporates physics-informed inductive biases, while their collective hierarchy is used to enforce a causal structure. The output is a comprehensive description of an event that may be used for high-level physics inference. The chain is end-to-end optimizable, eliminating the need for time-intensive manual software adjustments. It is also the first implementation to handle the unprecedented pile-up of dozens of high energy neutrino interactions, expected in the 3D-imaging LArTPC of the Deep Underground Neutrino Experiment. The chain is trained as a whole and its performance is assessed at each step using an open simulated data set.

preprint | 2020 | arXiv

PILArNet: Public Dataset for Particle Imaging Liquid Argon Detectors in High Energy Physics

Rapid advancement of machine learning solutions has often coincided with the production of a public test data set. Such datasets reduce the largest barrier to entry for tackling a problem -- procuring data -- while also providing a benchmark to compare different solutions. Furthermore, large datasets have been used to train high-performing feature finders which are then used in new approaches to problems beyond that initially defined. In order to encourage rapid development in the analysis of data collected using liquid argon time projection chambers, a class of particle detectors used in high energy physics experiments, we have produced PILArNet, the first 2D and 3D open dataset to be used for a couple of key analysis tasks. The initial dataset presented in this paper contains 300,000 samples simulated and recorded in three different volume sizes. The dataset is stored efficiently in sparse 2D and 3D matrix format with auxiliary information about simulated particles in the volume, and is made available for public research use. In this paper we describe the dataset, tasks, and the method used to procure the sample.

preprint | 2020 | arXiv

Scalable, Proposal-free Instance Segmentation Network for 3D Pixel Clustering and Particle Trajectory Reconstruction in Liquid Argon Time Projection Chambers

Liquid Argon Time Projection Chambers (LArTPCs) are high resolution particle imaging detectors, employed by accelerator-based neutrino oscillation experiments for high precision physics measurements. While images of particle trajectories are intuitive to analyze for physicists, the development of a high quality, automated data reconstruction chain remains challenging. One of the most critical reconstruction steps is particle clustering: the task of grouping 3D image pixels into different particle instances that share the same particle type. In this paper, we propose the first scalable deep learning algorithm for particle clustering in LArTPC data using sparse convolutional neural networks (SCNN). Building on previous works on SCNNs and proposal free instance segmentation, we build an end-to-end trainable instance segmentation network that learns an embedding of the image pixels to perform point cloud clustering in a transformed space. We benchmark the performance of our algorithm on PILArNet, a public 3D particle imaging dataset, with respect to common clustering evaluation metrics. 3D pixels were successfully clustered into individual particle trajectories with 90% of them having an adjusted Rand index score greater than 92% with a mean pixel clustering efficiency and purity above 96%. This work contributes to the development of an end-to-end optimizable full data reconstruction chain for LArTPCs, in particular pixel-based 3D imaging detectors including the near detector of the Deep Underground Neutrino Experiment. Our algorithm is made available in the open access repository, and we share our Singularity software container, which can be used to reproduce our work on the dataset.
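The clustering metrics quoted in this abstract can be illustrated with a short sketch. The adjusted Rand index below follows the standard pair-counting definition. The per-cluster purity and efficiency follow one common convention (match each predicted cluster to the true particle it overlaps most), which is an assumption here and may differ from the exact definition used in the paper. Both function names are hypothetical, and the inputs are assumed to be non-empty label sequences of equal length.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence, Tuple


def adjusted_rand_index(true_labels: Sequence[Hashable], pred_labels: Sequence[Hashable]) -> float:
    """Adjusted Rand index from the contingency table of two labelings."""
    total_pairs = comb(len(true_labels), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two pixels: the labelings trivially agree
    pair_counts = Counter(zip(true_labels, pred_labels))
    same_both = sum(comb(c, 2) for c in pair_counts.values())
    same_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    same_pred = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = same_true * same_pred / total_pairs  # chance-level agreement
    maximum = (same_true + same_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one single cluster
    return (same_both - expected) / (maximum - expected)


def mean_purity_efficiency(
    true_labels: Sequence[Hashable], pred_labels: Sequence[Hashable]
) -> Tuple[float, float]:
    """Mean purity and efficiency over predicted clusters (assumed convention)."""
    pair_counts = Counter(zip(true_labels, pred_labels))
    true_sizes = Counter(true_labels)
    pred_sizes = Counter(pred_labels)
    purities, efficiencies = [], []
    for pred, pred_size in pred_sizes.items():
        # True particle sharing the most pixels with this predicted cluster.
        best_true, overlap = max(
            ((t, c) for (t, p), c in pair_counts.items() if p == pred),
            key=lambda item: item[1],
        )
        purities.append(overlap / pred_size)               # share of the cluster that is correct
        efficiencies.append(overlap / true_sizes[best_true])  # share of the particle recovered
    return sum(purities) / len(purities), sum(efficiencies) / len(efficiencies)


if __name__ == "__main__":
    truth = [0, 0, 0, 0]   # one true particle
    pred = [0, 0, 1, 1]    # split into two predicted clusters
    print(adjusted_rand_index(truth, pred), mean_purity_efficiency(truth, pred))
```

In this toy case a single particle split in two keeps perfect purity but loses efficiency, which is why the two numbers are reported together.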

preprint | 2019 | arXiv

Scalable Deep Convolutional Neural Networks for Sparse, Locally Dense Liquid Argon Time Projection Chamber Data

Deep convolutional neural networks (CNNs) show strong promise for analyzing scientific data in many domains including particle imaging detectors such as a liquid argon time projection chamber (LArTPC). Yet the high sparsity of LArTPC data challenges traditional CNNs, which were designed for dense data such as photographs. A naive application of CNNs on LArTPC data results in inefficient computations and poor scalability to large LArTPC detectors such as those of the Short Baseline Neutrino Program and the Deep Underground Neutrino Experiment. Recently, Submanifold Sparse Convolutional Networks (SSCNs) have been proposed to address this challenge. We report their performance on a 3D semantic segmentation task on simulated LArTPC samples. In comparison with standard CNNs, we observe that the memory and wall-time costs for inference are reduced by factors of 364 and 33, respectively, without loss of accuracy. The same factors for 2D samples are found to be 93 and 3.1, respectively. Using SSCN, we present the first machine learning-based approach to the reconstruction of Michel electrons using public 3D LArTPC samples. We find a Michel electron identification efficiency of 93.9% with a true positive rate of 96.7%. Reconstructed Michel electron clusters yield an average pixel clustering efficiency of 95.4% and a purity of 95.5%. The results show strong promise for scalable data reconstruction techniques using deep neural networks for large-scale LArTPC detectors.
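The core idea of a submanifold sparse convolution can be sketched in a few lines: outputs are computed only at voxels that are already active, and only active neighbors contribute, so the cost scales with the number of occupied voxels and not with the detector volume. The sketch below is a single-channel toy in pure Python, not the implementation benchmarked in the paper; `submanifold_conv3d`, `occupancy`, and the example values are hypothetical.

```python
from itertools import product
from typing import Dict, Tuple

Coord = Tuple[int, int, int]


def submanifold_conv3d(active: Dict[Coord, float], kernel: Dict[Coord, float]) -> Dict[Coord, float]:
    """Single-channel submanifold convolution (cross-correlation convention).

    `active` maps occupied voxel coordinates to values; `kernel` maps
    integer offsets to weights. The output has exactly the same active
    sites as the input, so sparsity is preserved layer after layer.
    """
    output: Dict[Coord, float] = {}
    for (x, y, z) in active:
        total = 0.0
        for (dx, dy, dz), weight in kernel.items():
            value = active.get((x + dx, y + dy, z + dz))
            if value is not None:  # inactive neighbors are skipped entirely
                total += weight * value
        output[(x, y, z)] = total
    return output


def occupancy(active: Dict[Coord, float], shape: Tuple[int, int, int]) -> float:
    """Fraction of voxels in a dense grid of the given shape that are occupied."""
    nx, ny, nz = shape
    return len(active) / (nx * ny * nz)


if __name__ == "__main__":
    # 3x3x3 kernel with all weights set to 1.0 (illustrative values only).
    kernel = {offset: 1.0 for offset in product((-1, 0, 1), repeat=3)}
    voxels = {(0, 0, 0): 1.0, (1, 0, 0): 2.0, (10, 10, 10): 3.0}
    print(submanifold_conv3d(voxels, kernel))
    print(occupancy(voxels, (16, 16, 16)))
```

A dense convolution over the same 16 x 16 x 16 grid would visit all 4096 voxels, while this version visits three.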