Researcher profile

Michel Besserve

Michel Besserve contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2023arXiv

Exploring the Latent Space of Autoencoders with Interventional Assays

Autoencoders exhibit impressive abilities to embed the data manifold into a low-dimensional latent space, making them a staple of representation learning methods. However, without explicit supervision, which is often unavailable, the representation is usually uninterpretable, making analysis and principled progress challenging. We propose a framework, called latent responses, which exploits the locally contractive behavior exhibited by variational autoencoders to explore the learned manifold. More specifically, we develop tools to probe the representation using interventions in the latent space to quantify the relationships between latent variables. We extend the notion of disentanglement to take the learned generative process into account and consequently avoid the limitations of existing metrics that may rely on spurious correlations. Our analyses underscore the importance of studying the causal structure of the representation to improve performance on downstream tasks such as generation, interpolation, and inference of the factors of variation.

preprint2022arXiv

Bayesian Information Criterion for Event-based Multi-trial Ensemble data

Transient recurring phenomena are ubiquitous in many scientific fields like neuroscience and meteorology. Time inhomogenous Vector Autoregressive Models (VAR) may be used to characterize peri-event system dynamics associated with such phenomena, and can be learned by exploiting multi-dimensional data gathering samples of the evolution of the system in multiple time windows comprising, each associated with one occurrence of the transient phenomenon, that we will call "trial". However, optimal VAR model order selection methods, commonly relying on the Akaike or Bayesian Information Criteria (AIC/BIC), are typically not designed for multi-trial data. Here we derive the BIC methods for multi-trial ensemble data which are gathered after the detection of the events. We show using simulated bivariate AR models that the multi-trial BIC is able to recover the real model order. We also demonstrate with simulated transient events and real data that the multi-trial BIC is able to estimate a sufficiently small model order for dynamic system modeling.

preprint2022arXiv

Function Classes for Identifiable Nonlinear Independent Component Analysis

Unsupervised learning of latent variable models (LVMs) is widely used to represent data in machine learning. When such models reflect the ground truth factors and the mechanisms mapping them to observations, there is reason to expect that they allow generalization in downstream tasks. It is however well known that such identifiability guaranties are typically not achievable without putting constraints on the model class. This is notably the case for nonlinear Independent Component Analysis, in which the LVM maps statistically independent variables to observations via a deterministic nonlinear function. Several families of spurious solutions fitting perfectly the data, but that do not correspond to the ground truth factors can be constructed in generic settings. However, recent work suggests that constraining the function class of such models may promote identifiability. Specifically, function classes with constraints on their partial derivatives, gathered in the Jacobian matrix, have been proposed, such as orthogonal coordinate transformations (OCT), which impose orthogonality of the Jacobian columns. In the present work, we prove that a subclass of these transformations, conformal maps, is identifiable and provide novel theoretical results suggesting that OCTs have properties that prevent families of spurious solutions to spoil identifiability in a generic setting.

preprint2022arXiv

Independent mechanism analysis, a new concept?

Independent component analysis provides a principled framework for unsupervised representation learning, with solid theory on the identifiability of the latent code that generated the data, given only observations of mixtures thereof. Unfortunately, when the mixing is nonlinear, the model is provably nonidentifiable, since statistical independence alone does not sufficiently constrain the problem. Identifiability can be recovered in settings where additional, typically observed variables are included in the generative process. We investigate an alternative path and consider instead including assumptions reflecting the principle of independent causal mechanisms exploited in the field of causality. Specifically, our approach is motivated by thinking of each source as independently influencing the mixing process. This gives rise to a framework which we term independent mechanism analysis. We provide theoretical and empirical evidence that our approach circumvents a number of nonidentifiability issues arising in nonlinear blind source separation.

preprint2022arXiv

On Pitfalls of Identifiability in Unsupervised Learning. A Note on: "Desiderata for Representation Learning: A Causal Perspective"

Model identifiability is a desirable property in the context of unsupervised representation learning. In absence thereof, different models may be observationally indistinguishable while yielding representations that are nontrivially related to one another, thus making the recovery of a ground truth generative model fundamentally impossible, as often shown through suitably constructed counterexamples. In this note, we discuss one such construction, illustrating a potential failure case of an identifiability result presented in "Desiderata for Representation Learning: A Causal Perspective" by Wang & Jordan (2021). The construction is based on the theory of nonlinear independent component analysis. We comment on implications of this and other counterexamples for identifiable representation learning.

preprint2022arXiv

Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style

Self-supervised representation learning has shown remarkable success in a number of domains. A common practice is to perform data augmentation via hand-crafted transformations intended to leave the semantics of the data invariant. We seek to understand the empirical success of this approach from a theoretical perspective. We formulate the augmentation process as a latent variable model by postulating a partition of the latent representation into a content component, which is assumed invariant to augmentation, and a style component, which is allowed to change. Unlike prior work on disentanglement and independent component analysis, we allow for both nontrivial statistical and causal dependencies in the latent space. We study the identifiability of the latent representation based on pairs of views of the observations and prove sufficient conditions that allow us to identify the invariant content partition up to an invertible mapping in both generative and discriminative settings. We find numerical simulations with dependent latent variables are consistent with our theory. Lastly, we introduce Causal3DIdent, a dataset of high-dimensional, visually complex images with rich causal dependencies, which we use to study the effect of data augmentations performed in practice.

preprint2021arXiv

A theory of independent mechanisms for extrapolation in generative models

Generative models can be trained to emulate complex empirical data, but are they useful to make predictions in the context of previously unobserved environments? An intuitive idea to promote such extrapolation capabilities is to have the architecture of such model reflect a causal graph of the true data generating process, such that one can intervene on each node independently of the others. However, the nodes of this graph are usually unobserved, leading to overparameterization and lack of identifiability of the causal structure. We develop a theoretical framework to address this challenging situation by defining a weaker form of identifiability, based on the principle of independence of mechanisms. We demonstrate on toy examples that classical stochastic gradient descent can hinder the model's extrapolation capabilities, suggesting independence of mechanisms should be enforced explicitly during training. Experiments on deep generative models trained on real world data support these insights and illustrate how the extrapolation capabilities of such models can be leveraged.

preprint2021arXiv

Assaying Large-scale Testing Models to Interpret COVID-19 Case Numbers

Large-scale testing is considered key to assess the state of the current COVID-19 pandemic. Yet, the link between the reported case numbers and the true state of the pandemic remains elusive. We develop mathematical models based on competing hypotheses regarding this link, thereby providing different prevalence estimates based on case numbers, and validate them by predicting SARS-CoV-2-attributed death rate trajectories. Assuming that individuals were tested based solely on a predefined risk of being infectious implies the absolute case numbers reflect the prevalence, but turned out to be a poor predictor, consistently overestimating growth rates at the beginning of two COVID-19 epidemic waves. In contrast, assuming that testing capacity is fully exploited performs better. This leads to using the percent-positive rate as a more robust indicator of epidemic dynamics, however we find it is subject to a saturation phenomenon that needs to be accounted for as the number of tests becomes larger.

preprint2020arXiv

From univariate to multivariate coupling between continuous signals and point processes: a mathematical framework

Time series datasets often contain heterogeneous signals, composed of both continuously changing quantities and discretely occurring events. The coupling between these measurements may provide insights into key underlying mechanisms of the systems under study. To better extract this information, we investigate the asymptotic statistical properties of coupling measures between continuous signals and point processes. We first introduce martingale stochastic integration theory as a mathematical model for a family of statistical quantities that include the Phase Locking Value, a classical coupling measure to characterize complex dynamics. Based on the martingale Central Limit Theorem, we can then derive the asymptotic Gaussian distribution of estimates of such coupling measure, that can be exploited for statistical testing. Second, based on multivariate extensions of this result and Random Matrix Theory, we establish a principled way to analyze the low rank coupling between a large number of point processes and continuous signals. For a null hypothesis of no coupling, we establish sufficient conditions for the empirical distribution of squared singular values of the matrix to converge, as the number of measured signals increases, to the well-known Marchenko-Pastur (MP) law, and the largest squared singular value converges to the upper end of the MPs support. This justifies a simple thresholding approach to assess the significance of multivariate coupling. Finally, we illustrate with simulations the relevance of our univariate and multivariate results in the context of neural time series, addressing how to reliably quantify the interplay between multi channel Local Field Potential signals and the spiking activity of a large population of neurons.

preprint2020arXiv

Orthogonal Structure Search for Efficient Causal Discovery from Observational Data

The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. Recent work exploits stability of regression coefficients or invariance properties of models across different experimental conditions for reconstructing the full causal graph. These approaches generally do not scale well with the number of the explanatory variables and are difficult to extend to nonlinear relationships. Contrary to existing work, we propose an approach which even works for observational data alone, while still offering theoretical guarantees including the case of partially nonlinear relationships. Our algorithm requires only one estimation for each variable and in our experiments we apply our causal discovery algorithm even to large graphs, demonstrating significant improvements compared to well established approaches.