Source author record

Mikael Kuusela

Mikael Kuusela appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Applications, hep-ex, physics.data-an, hep-ph, Machine Learning, Methodology, physics.ao-ph

Catalog footprint

What is connected

9 works

7 topics

4 close collaborators

preprint 2026 arXiv

Keeping Score: Efficiency Improvements in Neural Likelihood Surrogate Training via Score-Augmented Loss Functions

For stochastic process models, parameter inference is often severely bottlenecked by computationally expensive likelihood functions. Simulation-based inference (SBI) bypasses this restriction by constructing amortized surrogate likelihoods, but most SBI methods assume a black-box data-generating process. While these surrogates are exact in the limit of infinite training data, practical scenarios force a strict tradeoff between model quality and simulation cost. In this work, we loosen the black-box assumption of SBI to improve this tradeoff for structured stochastic process models. Specifically, for neural network likelihood surrogates trained via probabilistic classification, we propose to augment the standard binary cross-entropy loss with exact score information $\nabla_\theta \log p(x \mid \theta)$ and adaptive weighting based on loss gradients. We evaluate our approach on case studies involving network dynamics and spatial processes, demonstrating that our method improves surrogate quality at a drastically lower computational cost than generating more training data. Notably, in some cases, our approach achieves downstream inference performance equivalent to a 10x increase in training data with less than a 1.1x increase in training time.
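
The abstract does not give the exact form of the loss, so the following is only a minimal sketch of the general idea, under stated assumptions: a toy model with $\theta \sim N(0, 1)$ and $x \mid \theta \sim N(\theta, 1)$, a hypothetical quadratic classifier logit with weights `w`, and a fixed penalty weight `alpha` in place of the adaptive weighting the abstract mentions. The function name `score_augmented_loss` is illustrative and not taken from the paper.

```python
import numpy as np

def score_augmented_loss(w, x, theta, y, alpha=1.0):
    """BCE on a classifier logit plus a penalty tying d(logit)/d(theta) to the exact score.

    Assumed toy surrogate: logit = c + a_xx*x^2 + a_xt*x*theta + a_tt*theta^2.
    """
    c, a_xx, a_xt, a_tt = w
    logit = c + a_xx * x**2 + a_xt * x * theta + a_tt * theta**2
    # Binary cross-entropy expressed through the logit: softplus(logit) - y * logit.
    bce = np.mean(np.logaddexp(0.0, logit) - y * logit)
    # theta-gradient of the surrogate logit (analytic for this quadratic form).
    surrogate_score = a_xt * x + 2.0 * a_tt * theta
    # Exact score of the toy model: d/dtheta log N(x; theta, 1) = x - theta.
    exact_score = x - theta
    penalty = np.mean((surrogate_score - exact_score) ** 2)
    return bce + alpha * penalty, bce, penalty

rng = np.random.default_rng(0)
n = 4000
# Label 1: (x, theta) drawn jointly. Label 0: x drawn from the marginal N(0, 2), independent of theta.
theta_joint = rng.normal(size=n)
x_joint = theta_joint + rng.normal(size=n)
theta_indep = rng.normal(size=n)
x_indep = rng.normal(scale=np.sqrt(2.0), size=n)
x = np.concatenate([x_joint, x_indep])
theta = np.concatenate([theta_joint, theta_indep])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Weights of the exact log-ratio log p(x|theta) - log p(x) for this toy model.
w_exact = np.array([0.5 * np.log(2.0), -0.25, 1.0, -0.5])
# A deliberately wrong x*theta coefficient for comparison.
w_off = np.array([0.5 * np.log(2.0), -0.25, 0.5, -0.5])

loss_exact, bce_exact, pen_exact = score_augmented_loss(w_exact, x, theta, y)
loss_off, bce_off, pen_off = score_augmented_loss(w_off, x, theta, y)
```

In this toy setting the optimal classifier logit equals $\log p(x \mid \theta) - \log p(x)$, so its $\theta$-gradient coincides with the exact score and the penalty vanishes at `w_exact`. That is why a score term can add training signal without changing the target of the classification loss.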

preprint 2026 arXiv

Machine Learning-based Unfolding for Cross Section Measurements in the Presence of Nuisance Parameters

Statistically correcting measured cross sections for detector effects is an important step across many applications. In particle physics, this inverse problem is known as unfolding. In cases with complex instruments, the distortions they introduce are often known only implicitly through simulations of the detector. Modern machine learning has enabled efficient simulation-based approaches for unfolding high-dimensional data. Among these, one of the first methods successfully deployed on experimental data is the OmniFold algorithm, a classifier-based Expectation-Maximization procedure. In practice, however, the forward model is only approximately specified, and the corresponding uncertainty is encoded through nuisance parameters. Building on the well-studied OmniFold algorithm, we show how to extend machine learning-based unfolding to incorporate nuisance parameters. Our new algorithm, called Profile OmniFold, is demonstrated using a Gaussian example as well as a particle physics case study using simulated data from the CMS Experiment at the Large Hadron Collider.

preprint 2022 arXiv

Objective frequentist uncertainty quantification for atmospheric CO$_2$ retrievals

The steadily increasing amount of atmospheric carbon dioxide (CO$_2$) is affecting the global climate system and threatening the long-term sustainability of Earth's ecosystem. In order to better understand the sources and sinks of CO$_2$, NASA operates the Orbiting Carbon Observatory-2 & 3 satellites to monitor CO$_2$ from space. These satellites make passive radiance measurements of the sunlight reflected off the Earth's surface in different spectral bands, which are then inverted in an ill-posed inverse problem to obtain estimates of the atmospheric CO$_2$ concentration. In this work, we propose a new CO$_2$ retrieval method that uses known physical constraints on the state variables and direct inversion of the target functional of interest to construct well-calibrated frequentist confidence intervals based on convex programming. We compare the method with the current operational retrieval procedure, which uses prior knowledge in the form of probability distributions to regularize the problem. We demonstrate that the proposed intervals consistently achieve the desired frequentist coverage, while the operational uncertainties are poorly calibrated in a frequentist sense both at individual locations and over a spatial region in a realistic simulation experiment. We also study the influence of specific nuisance state variables on the length of the proposed intervals and identify certain key variables that can greatly reduce the final uncertainty given additional deterministic or probabilistic constraints, and develop a principled framework to incorporate such information into our method.

preprint 2022 arXiv

Spatio-temporal Local Interpolation of Global Ocean Heat Transport using Argo Floats: A Debiased Latent Gaussian Process Approach

The world ocean plays a key role in redistributing heat in the climate system and hence in regulating Earth's climate. Yet statistical analysis of ocean heat transport suffers from partially incomplete large-scale data intertwined with complex spatio-temporal dynamics, as well as from potential model misspecification. We present a comprehensive spatio-temporal statistical framework tailored to interpolating the global ocean heat transport using in-situ Argo profiling float measurements. We formalize the statistical challenges using latent local Gaussian process regression accompanied by a two-stage fitting procedure. We introduce an approximate Expectation-Maximization algorithm to jointly estimate both the mean field and the covariance parameters, and refine the potentially under-specified mean field model with a debiasing procedure. This approach provides data-driven global ocean heat transport fields that vary in both space and time and can provide insights into crucial dynamical phenomena, such as El Niño and La Niña, as well as the global climatological mean heat transport field, which by itself is of scientific interest. The proposed framework and the Argo-based estimates are thoroughly validated with state-of-the-art multimission satellite products and shown to yield realistic subsurface ocean heat transport estimates.

preprint 2015 arXiv

Statistical unfolding of elementary particle spectra: Empirical Bayes estimation and bias-corrected uncertainty quantification

We consider the high energy physics unfolding problem where the goal is to estimate the spectrum of elementary particles given observations distorted by the limited resolution of a particle detector. This important statistical inverse problem arising in data analysis at the Large Hadron Collider at CERN consists in estimating the intensity function of an indirectly observed Poisson point process. Unfolding typically proceeds in two steps: one first produces a regularized point estimate of the unknown intensity and then uses the variability of this estimator to form frequentist confidence intervals that quantify the uncertainty of the solution. In this paper, we propose forming the point estimate using empirical Bayes estimation which enables a data-driven choice of the regularization strength through marginal maximum likelihood estimation. Observing that neither Bayesian credible intervals nor standard bootstrap confidence intervals succeed in achieving good frequentist coverage in this problem due to the inherent bias of the regularized point estimate, we introduce an iteratively bias-corrected bootstrap technique for constructing improved confidence intervals. We show using simulations that this enables us to achieve nearly nominal frequentist coverage with only a modest increase in interval length. The proposed methodology is applied to unfolding the $Z$ boson invariant mass spectrum as measured in the CMS experiment at the Large Hadron Collider.
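
The two-step procedure described above (a regularized point estimate followed by resampling-based intervals) can be sketched roughly as follows. This is not the paper's method: the Gaussian smearing matrix `K`, the second-difference penalty, the fixed regularization strength `delta`, and the plain parametric bootstrap are all simplifying assumptions. In particular, the empirical Bayes choice of the regularization strength and the iterative bias correction are not implemented.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bins = 20
centers = (np.arange(n_bins) + 0.5) / n_bins

# Hypothetical detector response: Gaussian smearing, columns normalized to sum to 1.
smear = 0.1
K = np.exp(-0.5 * ((centers[:, None] - centers[None, :]) / smear) ** 2)
K /= K.sum(axis=0, keepdims=True)

# Assumed true intensity (a bump on a flat background) and smeared Poisson counts.
true_intensity = 1000.0 * np.exp(-0.5 * ((centers - 0.5) / 0.15) ** 2) + 200.0
observed = rng.poisson(K @ true_intensity)

# Second-difference matrix: penalizes curvature of the unfolded spectrum.
D = np.diff(np.eye(n_bins), n=2, axis=0)

def unfold(y, delta=1e-2):
    """Tikhonov-regularized least-squares estimate with a fixed, hand-picked delta."""
    return np.linalg.solve(K.T @ K + delta * D.T @ D, K.T @ y)

# Step 0 (for contrast): unregularized inversion amplifies the Poisson noise.
naive = np.linalg.solve(K, observed)

# Step 1: regularized point estimate.
estimate = unfold(observed)

# Step 2: parametric bootstrap of the estimator's variability.
fitted_mean = K @ np.clip(estimate, 0.0, None)
replicates = np.array([unfold(rng.poisson(fitted_mean)) for _ in range(200)])
std_err = replicates.std(axis=0)
lower, upper = estimate - 1.96 * std_err, estimate + 1.96 * std_err

naive_error = np.linalg.norm(naive - true_intensity)
regularized_error = np.linalg.norm(estimate - true_intensity)
```

Intervals built this way capture only the variance of the regularized estimator, not its bias. As the abstract notes, this is why standard bootstrap intervals tend to undercover in this problem, and it is what motivates a bias-corrected construction.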

preprint 2014 arXiv

Empirical Bayes unfolding of elementary particle spectra at the Large Hadron Collider

We consider the so-called unfolding problem in experimental high energy physics, where the goal is to estimate the true spectrum of elementary particles given observations distorted by measurement error due to the limited resolution of a particle detector. This is an important statistical inverse problem arising in the analysis of data at the Large Hadron Collider at CERN. Mathematically, the problem is formalized as one of estimating the intensity function of an indirectly observed Poisson point process. Particle physicists are particularly keen on unfolding methods that feature a principled way of choosing the regularization strength and allow for the quantification of the uncertainty inherent in the solution. Though there are many approaches that have been considered by experimental physicists, it can be argued that few -- if any -- of these deal with these two key issues in a satisfactory manner. In this paper, we propose to attack the unfolding problem within the framework of empirical Bayes estimation: we consider Bayes estimators of the coefficients of a basis expansion of the unknown intensity, using a regularizing prior; and employ a Monte Carlo expectation-maximization algorithm to find the marginal maximum likelihood estimate of the hyperparameter controlling the strength of the regularization. Due to the data-driven choice of the hyperparameter, credible intervals derived using the empirical Bayes posterior lose their subjective Bayesian interpretation. Since the properties and meaning of such intervals are poorly understood, we explore instead the use of bootstrap resampling for constructing purely frequentist confidence bands for the true intensity. The performance of the proposed methodology is demonstrated using both simulations and real data from the Large Hadron Collider.

preprint 2012 arXiv

Semi-Supervised Anomaly Detection - Towards Model-Independent Searches of New Physics

Most classification algorithms used in high energy physics fall under the category of supervised machine learning. Such methods require a training set containing both signal and background events and are prone to classification errors if this training data is systematically inaccurate, for example due to the assumed Monte Carlo (MC) model. To complement such model-dependent searches, we propose an algorithm based on semi-supervised anomaly detection techniques, which does not require an MC training sample for the signal data. We first model the background using a multivariate Gaussian mixture model. We then search for deviations from this model by fitting to the observations a mixture of the background model and a number of additional Gaussians. This allows us to perform pattern recognition of any anomalous excess over the background. We show by a comparison to neural network classifiers that such an approach is considerably more robust against misspecification of the signal MC than supervised classification. In cases where there is an unexpected signal, a neural network might fail to correctly identify it, while anomaly detection does not suffer from such a limitation. On the other hand, when there are no systematic errors in the training data, both methods perform comparably.
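
The fitting step described above, a fixed background model plus additional Gaussians, can be illustrated with a small expectation-maximization loop. This is a one-dimensional sketch under assumptions that are not from the paper: the background density is taken as a known standard normal rather than a fitted multivariate mixture, only one extra Gaussian is added, and the function name `fit_anomaly_component` and the starting values are hypothetical.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def fit_anomaly_component(x, bkg_pdf, mu=3.0, sigma=1.0, lam=0.1, n_iter=200):
    """EM for p(x) = (1 - lam) * bkg_pdf(x) + lam * N(x; mu, sigma), background held fixed."""
    bkg = bkg_pdf(x)
    for _ in range(n_iter):
        # E-step: probability that each event belongs to the extra (anomalous) component.
        sig = lam * normal_pdf(x, mu, sigma)
        resp = sig / (sig + (1.0 - lam) * bkg)
        # M-step: update only the anomalous component and its mixing fraction.
        total = resp.sum()
        lam = total / x.size
        mu = np.sum(resp * x) / total
        sigma = max(np.sqrt(np.sum(resp * (x - mu) ** 2) / total), 1e-3)  # floor avoids collapse
    return lam, mu, sigma

rng = np.random.default_rng(2)
background = rng.normal(0.0, 1.0, size=5000)
signal = rng.normal(4.0, 0.5, size=250)  # unexpected excess, unknown to the fit
data = np.concatenate([background, signal])

lam_hat, mu_hat, sigma_hat = fit_anomaly_component(data, lambda v: normal_pdf(v, 0.0, 1.0))
```

Because the background component stays frozen, the extra Gaussian can only absorb events the background model explains poorly. No signal training sample is needed, since the location, width and fraction of the excess are all estimated from the observations.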

preprint 2010 arXiv

Soft Classification of Diffractive Interactions at the LHC

Multivariate machine learning techniques provide an alternative to the rapidity gap method for event-by-event identification and classification of diffraction in hadron-hadron collisions. Traditionally, such methods assign each event exclusively to a single class, producing classification errors in overlap regions of data space. As an alternative to this so-called hard classification approach, we propose estimating posterior probabilities of each diffractive class and using these estimates to weight event contributions to physical observables. A Monte Carlo study shows that such a soft classification scheme reproduces observables such as multiplicity distributions and relative event rates with much higher accuracy than hard classification.
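
The difference between hard and soft classification can be seen in a small two-class toy example. The setup is an assumption for illustration only: one feature, two overlapping Gaussian classes with a 20% minority class, and posteriors computed from the true densities rather than estimated by a trained classifier as in the paper.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(3)
n = 20000
prior_b = 0.2

# Class A ~ N(0, 1), class B ~ N(1.5, 1): strongly overlapping in the single feature x.
is_b = rng.random(n) < prior_b
x = np.where(is_b, rng.normal(1.5, 1.0, size=n), rng.normal(0.0, 1.0, size=n))

# Posterior probability of class B for each event (idealized: true densities are used).
num = prior_b * normal_pdf(x, 1.5, 1.0)
post_b = num / (num + (1.0 - prior_b) * normal_pdf(x, 0.0, 1.0))

true_rate = is_b.mean()

# Hard classification: each event counts fully toward its most probable class.
hard_b = post_b > 0.5
hard_rate = hard_b.mean()
hard_mean_b = x[hard_b].mean()

# Soft classification: each event contributes to class B with weight post_b.
soft_rate = post_b.mean()
soft_mean_b = np.sum(post_b * x) / np.sum(post_b)
```

In the overlap region the hard rule assigns every ambiguous event to the more probable class, so the minority class rate is underestimated and its mean of `x` is pulled toward the tail. With posterior weights, each event contributes to both classes in proportion, and both the relative rate and the class-B mean (1.5 in this toy setup) come out close to the truth.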

preprint 2009 arXiv

Multivariate Techniques for Identifying Diffractive Interactions at the LHC

Close to one half of the LHC events are expected to be due to elastic or inelastic diffractive scattering. Still, predictions based on extrapolations of experimental data at lower energies differ by large factors in estimating the relative rate of diffractive event categories at LHC energies. By identifying diffractive events, detailed studies of proton structure can be carried out. The combined forward physics objects (rapidity gaps, forward multiplicity and transverse energy flows) can be used to efficiently classify proton-proton collisions. Data samples recorded by the forward detectors, with a simple extension, will allow first estimates of the single diffractive (SD), double diffractive (DD), central diffractive (CD), and non-diffractive (ND) cross sections. The approach, which uses the measurement of inelastic activity in forward and central detector systems, is complementary to the detection and measurement of leading beam-like protons. In this investigation, three different multivariate analysis approaches are assessed in classifying forward physics processes at the LHC. It is shown that with gene expression programming, neural networks and support vector machines, diffraction can be efficiently identified within a large sample of simulated proton-proton scattering events. The event characteristics are visualized by using the self-organizing map algorithm.