Researcher profile

Sanmay Ganguly

Sanmay Ganguly contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
8 works
0 followers
7 topics
4 close collaborators

Published work

8 published items

Preprint | 2026 | arXiv

BRICKS: Compositional Neural Markov Kernels for Zero-Shot Radiation-Matter Simulation

We introduce a new strategy for compositional neural surrogates for radiation-matter interactions, a key task spanning domains from particle physics through nuclear and space engineering to medical physics. Exploiting the locality and the Markov nature of particle interactions, we create a next-particle prediction kernel using hybrid discrete-continuous transformer models based on Riemannian Flow Matching on product manifolds. The model generates variable-sized typed sets of particles and radiation side effects that are the result of the interaction of an incident particle with a material volume. The resulting kernel can be composed to simulate unseen large-scale material distributions in a zero-shot manner. Unlike mechanistic simulators, our model is designed to be differentiable and provides tractable likelihoods for future downstream applications. A significant computational speed-up on GPU compared to CPU-bound mechanistic simulation is observed for single-kernel execution. We evaluate the model at the kernel level and demonstrate predictive stability over multi-round autoregressive rollouts. We additionally release a novel 20M-event radiation-matter interaction dataset for further research.

Preprint | 2026 | arXiv

Dissecting Jet-Tagger Through Mechanistic Interpretability

Mechanistic interpretability seeks to reverse engineer a trained neural network by identifying the minimal subset of internal components responsible for a given behavior. We perform a mechanistic interpretability analysis of the Particle Transformer architecture, trained on the Top Quark Tagging reference dataset, with the goal of identifying the computational circuit responsible for jet classification and characterizing the physical content of its internal representations. Combining zero ablation, path patching with two complementary on-manifold corruption strategies, and linear probing of the residual stream, we identify a sparse six-head circuit that recovers the great majority of the full model performance while admitting a clean source-relay-readout interpretation. In this circuit, a single early-layer head serves as the primary causal source, a cluster of middle-layer heads acts as relays selectively attending to hard pairwise substructure, and a single late-layer head reads out the aggregated signal. Linear probes show that the residual stream is preferentially aligned with the energy correlator basis over the $N$-subjettiness basis. Within the energy correlator basis, the model preferentially encodes 2-prong substructure observables over the 3-prong observables. A per-layer trained probe further reveals that the apparent single-step commitment of the model to a classification decision in the first class attention block is in fact a basis rotation, with the discriminating signal already saturating in the particle attention stack. These results demonstrate that mechanistic interpretability methods developed for natural language models can be used for jet physics classifiers and indicate that gradient descent may rediscover physically meaningful aspects of jet tagging without supervision.

Preprint | 2026 | arXiv

Explainable AI for Jet Tagging: A Comparative Study of GNNExplainer, GNNShap, and GradCAM for Jet Tagging in the Lund Jet Plane

Graph neural networks such as ParticleNet and transformer-based networks on point clouds such as ParticleTransformer achieve state-of-the-art performance on jet tagging benchmarks at the Large Hadron Collider, yet the physical reasoning behind their predictions remains opaque. We present three explanation methods, perturbation-based (GNNExplainer), Shapley-value-based (GNNShap), and gradient-based (GradCAM), adapted to operate on LundNet's Lund-plane graph representation. Leveraging the fact that each node in the Lund plane corresponds to a physically meaningful parton splitting, we construct Monte Carlo truth explanation masks and introduce a physics-informed evaluation framework that goes beyond standard fidelity metrics. We perform the analysis in three transverse-momentum bins ($\mathrm{p_T} \in [500,700]$, $[800,1000]$, and the inclusive region $[500,1000]$ GeV), revealing how explanation quality and focus shift between non-perturbative and perturbative regimes. We further quantify the correlation between explainer-assigned node importance and classical jet substructure observables ($N$-subjettiness ratios $\tau_{21}$ and $\tau_{32}$ and the energy correlation functions), establishing the degree to which the model has learned known QCD features. We find that, overall, the weight assigned by explainability methods correlates with analytic observables, with the expected shift across different phase-space regimes, indicating that a trained neural network indeed learns some aspects of jet-substructure moments. Our open-source implementation enables reproducible explainability studies for graph-based jet taggers.

Preprint | 2026 | arXiv

Probing SMEFT Operators through $t\bar{t}t\bar{t}$ Production with Hyper-Graph Neural Networks at the LHC

We present a phenomenological study of $t\bar{t}t\bar{t}$ production in proton-proton collisions at $\sqrt{s} = 13$ TeV, using a Hyper-Graph Neural Network (H-GNN) to discriminate multilepton signal events from the dominant SM backgrounds, namely $t\bar{t}W$, $t\bar{t}Z$, $t\bar{t}H$, $t\bar{t}VV$, single-top associated production, and diboson and triboson processes. In the H-GNN architecture each event is represented as a hypergraph whose nodes correspond to reconstructed jets and leptons and whose hyperedges encode higher-order correlations among arbitrary subsets of these objects, allowing the network to learn the many-body kinematic structures that characterize the $t\bar{t}t\bar{t}$ final state. Combining same-sign di-lepton, tri-lepton, and four-lepton channels following a CMS-like event selection, the H-GNN attains an area under the ROC curve of $0.951$ for the $t\bar{t}t\bar{t}$ signal and yields a statistical significance of $Z = 9.11$ at an integrated luminosity of $\mathcal{L} = 140~\mathrm{fb}^{-1}$, to be compared with $Z = 8.62$ for a SPANet baseline, $Z = 7.37$ for a Particle Transformer baseline, and $Z = 5.13$ obtained by the ATLAS analysis, evaluated under identical event selection. We exploit the improved signal extraction to derive one- and two-parameter $95\%$ confidence level limits on the Wilson coefficients of the dimension-six operators $\mathcal{O}_{\Phi u}$, $\mathcal{O}^{(1)}_{tt}$, $\mathcal{O}^{(1)}_{qq}$, $\mathcal{O}^{(1)}_{qt}$, and $\mathcal{O}^{(8)}_{qt}$, and we project the expected sensitivity at the HL-LHC integrated luminosities of $1000~\mathrm{fb}^{-1}$ and $3000~\mathrm{fb}^{-1}$ with $50\%$ uncertainty on the background estimation.

Preprint | 2022 | arXiv

Graph Neural Networks in Particle Physics: Implementations, Innovations, and Challenges

Many physical systems can be best understood as sets of discrete data with associated relationships. Where previously these sets of data have been formulated as series or image data to match the available machine learning architectures, with the advent of graph neural networks (GNNs), these systems can be learned natively as graphs. This allows a wide variety of high- and low-level physical features to be attached to measurements and, by the same token, a wide variety of HEP tasks to be accomplished by the same GNN architectures. GNNs have found powerful use-cases in reconstruction, tagging, generation and end-to-end analysis. With the widespread adoption of GNNs in industry, the HEP community is well-placed to benefit from rapid improvements in GNN latency and memory usage. However, industry use-cases are not perfectly aligned with HEP and much work needs to be done to best match unique GNN capabilities to unique HEP obstacles. We present here a range of these capabilities, with predictions of which are currently being well adopted in HEP communities and which are still immature. We hope to capture the landscape of graph techniques in machine learning as well as point out the most significant gaps that are inhibiting potentially large leaps in research.

Preprint | 2022 | arXiv

Improving Di-Higgs Sensitivity at Future Colliders in Hadronic Final States with Machine Learning

One of the central goals of the physics program at the future colliders is to elucidate the origin of electroweak symmetry breaking, including precision measurements of the Higgs sector. This includes a detailed study of Higgs boson (H) pair production, which can reveal the H self-coupling. Since the discovery of the Higgs boson, a large campaign of measurements of the properties of the Higgs boson has begun and many new ideas have emerged during the completion of this program. One such idea is the use of highly boosted and merged hadronic decays of the Higgs boson ($\mathrm{H}\to\mathrm{b}\bar{\mathrm{b}}$, $\mathrm{H}\to\mathrm{W}\mathrm{W}\to\mathrm{q}\bar{\mathrm{q}}\mathrm{q}\bar{\mathrm{q}}$) with machine learning methods to improve the signal-to-background discrimination. In this white paper, we champion the use of these modes to boost the sensitivity of future collider physics programs to Higgs boson pair production, the Higgs self-coupling, and Higgs-vector boson couplings. We demonstrate the potential improvement possible at the Future Circular Collider in hadron mode, especially with the use of graph neural networks.

Preprint | 2022 | arXiv

Symmetry Group Equivariant Architectures for Physics

Physical theories grounded in mathematical symmetries are an essential component of our understanding of a wide range of properties of the universe. Similarly, in the domain of machine learning, an awareness of symmetries such as rotation or permutation invariance has driven impressive performance breakthroughs in computer vision, natural language processing, and other important applications. In this report, we argue that both the physics community and the broader machine learning community have much to understand and potentially to gain from a deeper investment in research concerning symmetry group equivariant machine learning architectures. For some applications, the introduction of symmetries into the fundamental structural design can yield models that are more economical (i.e. contain fewer, but more expressive, learned parameters), interpretable (i.e. more explainable or directly mappable to physical quantities), and/or trainable (i.e. more efficient in both data and computational requirements). We discuss various figures of merit for evaluating these models as well as some potential benefits and limitations of these methods for a variety of physics applications. Research and investment into these approaches will lay the foundation for future architectures that are potentially more robust under new computational paradigms and will provide a richer description of the physical systems to which they are applied.

Preprint | 2021 | arXiv

Towards a Computer Vision Particle Flow

In High Energy Physics experiments, Particle Flow (PFlow) algorithms are designed to provide an optimal reconstruction of the nature and kinematic properties of the particles produced within the detector acceptance during collisions. At the heart of PFlow algorithms is the ability to distinguish the calorimeter energy deposits of neutral particles from those of charged particles, using the complementary measurements of charged particle tracking devices, to provide a superior measurement of the particle content and kinematics. In this paper, a computer vision approach to this fundamental aspect of PFlow algorithms, based on calorimeter images, is proposed. A comparative study of state-of-the-art deep learning techniques is performed. A significantly improved reconstruction of the neutral particle calorimeter energy deposits is obtained in a context of large overlaps with the deposits from charged particles. Calorimeter images with augmented finer granularity are also obtained using super-resolution techniques.