Source author record

Simon Buchholz

Simon Buchholz appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Machine Learning Applications math-ph math.MP Populations and Evolution

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

Preprint, 2026, arXiv

Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation

We introduce Pion, a spectrum-preserving optimizer for large language model (LLM) training based on orthogonal equivalence transformation. Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular values throughout training. This yields an optimization mechanism that modulates the geometry of weight matrices while keeping their spectral norm fixed. We derive the Pion update rule, systematically examine its design choices, and analyze its convergence behavior along with several key properties. Empirical results show that Pion offers a stable and competitive alternative to standard optimizers for both LLM pretraining and finetuning.
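The spectrum-preserving property can be illustrated on a toy case. The sketch below is not the Pion update rule, which the abstract does not spell out. It only checks, for a 2×2 matrix, that multiplying by orthogonal (here, rotation) matrices on the left and right leaves the singular values unchanged. The helper names, the example matrix and the rotation angles are all illustrative assumptions.

```python
import math

def rotation(theta):
    """2x2 rotation matrix, which is orthogonal."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def singular_values(w):
    """Singular values (largest first) of a 2x2 matrix.

    Uses the identities s1^2 + s2^2 = ||W||_F^2 and s1 * s2 = |det W|.
    """
    frob_sq = sum(x * x for row in w for x in row)
    det = abs(w[0][0] * w[1][1] - w[0][1] * w[1][0])
    s_plus = math.sqrt(frob_sq + 2 * det)              # s1 + s2
    s_minus = math.sqrt(max(frob_sq - 2 * det, 0.0))   # s1 - s2
    return ((s_plus + s_minus) / 2, (s_plus - s_minus) / 2)

def orthogonal_update(w, theta_left, theta_right):
    """Hypothetical multiplicative update: R(theta_left) @ W @ R(theta_right)."""
    return matmul(matmul(rotation(theta_left), w), rotation(theta_right))

w = [[3.0, 1.0], [0.5, 2.0]]          # illustrative weight matrix
w_new = orthogonal_update(w, 0.3, -0.7)

before = singular_values(w)
after = singular_values(w_new)
print("before:", before)
print("after: ", after)
```

The entries of the matrix change while its singular values, and hence its spectral norm, stay the same up to rounding. An additive update of the form W + ΔW would in general change them.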

Preprint, 2023, arXiv

AutoML Two-Sample Test

Two-sample tests are important in statistics and machine learning, both as tools for scientific discovery and for detecting distribution shifts. This led to the development of many sophisticated test procedures going beyond the standard supervised learning frameworks, whose usage can require specialized knowledge about two-sample testing. We use a simple test that takes the mean discrepancy of a witness function as the test statistic and prove that minimizing a squared loss leads to a witness with optimal testing power. This allows us to leverage recent advancements in AutoML. Without any user input about the problems at hand, and using the same method for all our experiments, our AutoML two-sample test achieves competitive performance on a diverse distribution shift benchmark as well as on challenging two-sample testing problems. We provide an implementation of the AutoML two-sample test in the Python package autotst.
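The witness-based test described in this abstract can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the autotst implementation and not its API: a closed-form one-dimensional linear regression stands in for the AutoML regressor, the p-value comes from a permutation test on held-out data, and all function names and parameters are hypothetical.

```python
import random
import statistics

def fit_linear_witness(xs, ys):
    """Squared-loss fit of labels (+1 for xs, -1 for ys) with h(z) = a*z + b.

    Assumption: a linear model replaces the AutoML regressor of the paper.
    """
    inputs = xs + ys
    labels = [1.0] * len(xs) + [-1.0] * len(ys)
    mean_z = statistics.fmean(inputs)
    mean_l = statistics.fmean(labels)
    var_z = sum((z - mean_z) ** 2 for z in inputs)
    cov = sum((z - mean_z) * (l - mean_l) for z, l in zip(inputs, labels))
    a = cov / var_z if var_z > 0 else 0.0
    b = mean_l - a * mean_z
    return lambda z: a * z + b

def witness_test(xs, ys, n_perm=500, seed=0):
    """Return (statistic, p_value) for H0: xs and ys share one distribution."""
    rng = random.Random(seed)
    half_x, half_y = len(xs) // 2, len(ys) // 2
    # Fit the witness on the first halves, evaluate it on the held-out halves.
    witness = fit_linear_witness(xs[:half_x], ys[:half_y])
    hx = [witness(z) for z in xs[half_x:]]
    hy = [witness(z) for z in ys[half_y:]]
    observed = statistics.fmean(hx) - statistics.fmean(hy)
    # Permutation null: shuffle the held-out witness values between samples.
    pooled = hx + hy
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = (statistics.fmean(pooled[:len(hx)])
                - statistics.fmean(pooled[len(hx):]))
        if stat >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

data_rng = random.Random(1)
sample_p = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
sample_q = [data_rng.gauss(1.0, 1.0) for _ in range(200)]  # mean shifted by 1
stat_shift, p_shift = witness_test(sample_p, sample_q)
print(f"statistic={stat_shift:.3f}, p-value={p_shift:.3f}")
```

Splitting the data keeps the permutation test valid, because the witness is fixed before the held-out values are compared. A more flexible regressor would only change `fit_linear_witness`.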

Preprint, 2022, arXiv

Function Classes for Identifiable Nonlinear Independent Component Analysis

Unsupervised learning of latent variable models (LVMs) is widely used to represent data in machine learning. When such models reflect the ground truth factors and the mechanisms mapping them to observations, there is reason to expect that they allow generalization in downstream tasks. It is, however, well known that such identifiability guarantees are typically not achievable without putting constraints on the model class. This is notably the case for nonlinear Independent Component Analysis, in which the LVM maps statistically independent variables to observations via a deterministic nonlinear function. In generic settings, one can construct several families of spurious solutions that fit the data perfectly but do not correspond to the ground truth factors. However, recent work suggests that constraining the function class of such models may promote identifiability. Specifically, function classes with constraints on their partial derivatives, gathered in the Jacobian matrix, have been proposed, such as orthogonal coordinate transformations (OCT), which impose orthogonality of the Jacobian columns. In the present work, we prove that a subclass of these transformations, conformal maps, is identifiable and provide novel theoretical results suggesting that OCTs have properties that prevent families of spurious solutions from spoiling identifiability in a generic setting.
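The difference between an OCT and a conformal map can be checked numerically on two textbook maps in the plane. The sketch below is an illustration only and is not taken from the paper: it uses the polar-coordinate map as an example of an OCT (orthogonal Jacobian columns with different norms) and the complex squaring map as an example of a conformal map (orthogonal columns of equal norm). The function names, the evaluation point and the finite-difference step are assumptions.

```python
import math

def jacobian_columns(f, point, eps=1e-6):
    """Central-difference Jacobian columns of f: R^2 -> R^2 at `point`."""
    columns = []
    for i in range(2):
        hi, lo = list(point), list(point)
        hi[i] += eps
        lo[i] -= eps
        f_hi, f_lo = f(*hi), f(*lo)
        columns.append([(a - b) / (2 * eps) for a, b in zip(f_hi, f_lo)])
    return columns

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def polar(r, theta):
    """(r, theta) -> (x, y): orthogonal Jacobian columns, norms 1 and r."""
    return (r * math.cos(theta), r * math.sin(theta))

def complex_square(x, y):
    """z -> z^2 written in real coordinates: conformal away from the origin."""
    return (x * x - y * y, 2 * x * y)

point = (2.0, 0.5)
results = {}
for name, f in [("polar", polar), ("complex_square", complex_square)]:
    c1, c2 = jacobian_columns(f, point)
    results[name] = (dot(c1, c2), norm(c1), norm(c2))
    print(name, "inner product:", round(results[name][0], 6),
          "column norms:", round(results[name][1], 6), round(results[name][2], 6))
```

Both maps have orthogonal Jacobian columns at the chosen point, but only the squaring map also has columns of equal length, which is the extra condition that separates conformal maps from general OCTs in two dimensions.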

Preprint, 2021, arXiv

Assaying Large-scale Testing Models to Interpret COVID-19 Case Numbers

Large-scale testing is considered key to assessing the state of the current COVID-19 pandemic. Yet, the link between the reported case numbers and the true state of the pandemic remains elusive. We develop mathematical models based on competing hypotheses regarding this link, thereby providing different prevalence estimates based on case numbers, and validate them by predicting SARS-CoV-2-attributed death rate trajectories. Assuming that individuals were tested based solely on a predefined risk of being infectious implies that the absolute case numbers reflect the prevalence, but this assumption turned out to be a poor predictor, consistently overestimating growth rates at the beginning of two COVID-19 epidemic waves. In contrast, assuming that testing capacity is fully exploited performs better. This leads to using the percent-positive rate as a more robust indicator of epidemic dynamics; however, we find that it is subject to a saturation phenomenon that needs to be accounted for as the number of tests becomes larger.

Preprint, 2013, arXiv

Multivariate Central Limit Theorem in Quantum Dynamics

We consider the time evolution of $N$ bosons in the mean field regime for factorized initial data. In the limit of large $N$, the many body evolution can be approximated by the non-linear Hartree equation. In this paper we are interested in the fluctuations around the Hartree dynamics. We choose $k$ self-adjoint one-particle operators $O_1, \dots, O_k$ on $L^2 (\R^3)$, and we average their action over the $N$ particles. We show that, for every fixed $t \in \R$, expectations of products of functions of the averaged observables approach, as $N \to \infty$, expectations with respect to a complex Gaussian measure, whose covariance matrix can be expressed in terms of a Bogoliubov transformation describing the dynamics of quantum fluctuations around the mean field Hartree evolution. If the operators $O_1, \dots, O_k$ commute, the Gaussian measure is real and positive, and we recover a "classical" multivariate central limit theorem. All our results give explicit bounds on the rate of the convergence (we therefore obtain Berry-Esseen type central limit theorems).