Researcher profile

Patrick Forré

Patrick Forré contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, unclaimed author
9 works
0 followers
6 topics
4 close collaborators

Published work

9 published items

Preprint, 2022, arXiv

Normalizing Flows for Hierarchical Bayesian Analysis: A Gravitational Wave Population Study

We propose parameterizing the population distribution of the gravitational wave population modeling framework (Hierarchical Bayesian Analysis) with a normalizing flow. We first demonstrate the merit of this method on illustrative experiments and then analyze four parameters of the latest LIGO/Virgo data release: primary mass, secondary mass, redshift, and effective spin. Our results show that despite the small and notoriously noisy dataset, the posterior predictive distributions (assuming a prior over the parameters of the flow) of the observed gravitational wave population recover structure that agrees with robust previous phenomenological modeling results while being less susceptible to biases introduced by less flexible models. Therefore, the method forms a promising flexible, reliable replacement for population inference distributions, even when data is highly noisy.

Preprint, 2022, arXiv

Self-Supervised Inference in State-Space Models

We perform approximate inference in state-space models with nonlinear state transitions. Without parameterizing a generative model, we apply Bayesian update formulas using a local linearity approximation parameterized by neural networks. This is accompanied by a maximum likelihood objective that requires no supervision via uncorrupted observations or ground truth latent states. The optimization backpropagates through a recursion similar to the classical Kalman filter and smoother. Additionally, using an approximate conditional independence, we can perform smoothing without having to parameterize a separate model. In scientific applications, domain knowledge can give a linear approximation of the latent transition maps, which we can easily incorporate into our model. Usage of such domain knowledge is reflected in excellent results (despite our model's simplicity) on the chaotic Lorenz system compared to fully supervised and variational inference methods. Finally, we show competitive results on an audio denoising experiment.
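
For orientation, the classical recursion that the abstract refers to can be sketched for a scalar linear-Gaussian model. This is the textbook Kalman filter only, not the paper's self-supervised method; the function name `kalman_filter_1d` and the parameters `a`, `q`, `r`, `mean0`, `var0` are illustrative assumptions that do not come from the paper.

```python
def kalman_filter_1d(observations, a=1.0, q=0.01, r=1.0, mean0=0.0, var0=1.0):
    """Textbook scalar Kalman filter (illustrative sketch).

    Assumed model (hypothetical parameter names):
        x_t = a * x_{t-1} + process noise with variance q
        y_t = x_t + observation noise with variance r
    Returns a list of (filtered_mean, filtered_variance) pairs.
    """
    mean, var = mean0, var0
    filtered = []
    for y in observations:
        # Predict step: push the previous belief through the linear transition.
        pred_mean = a * mean
        pred_var = a * a * var + q
        # Update step: Bayesian update of the prediction with the new observation.
        gain = pred_var / (pred_var + r)
        mean = pred_mean + gain * (y - pred_mean)
        var = (1.0 - gain) * pred_var
        filtered.append((mean, var))
    return filtered
```

Each loop iteration is one predict/update pair; the paper's approach, as described in the abstract, backpropagates through a recursion of a similar shape while replacing the fixed linear transition with a learned local linearization.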

Preprint, 2020, arXiv

FlipOut: Uncovering Redundant Weights via Sign Flipping

Modern neural networks, although achieving state-of-the-art results on many tasks, tend to have a large number of parameters, which increases training time and resource usage. This problem can be alleviated by pruning. Existing methods, however, often require extensive parameter tuning or multiple cycles of pruning and retraining to convergence in order to obtain a favorable accuracy-sparsity trade-off. To address these issues, we propose a novel pruning method which uses the oscillations around $0$ (i.e. sign flips) that a weight has undergone during training in order to determine its saliency. Our method can perform pruning before the network has converged, requires little tuning effort due to having good default values for its hyperparameters, and can directly target the level of sparsity desired by the user. Our experiments, performed on a variety of object classification architectures, show that it is competitive with existing methods and achieves state-of-the-art performance for levels of sparsity of $99.6\%$ and above for most of the architectures tested. For reproducibility, we release our code publicly at https://github.com/AndreiXYZ/flipout.
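
The idea of using sign flips as a pruning signal can be illustrated with a small sketch. It assumes a recorded history of weight vectors, one per training step, and uses a hypothetical saliency score (magnitude divided by one plus the flip count); that score and the names `count_sign_flips` and `prune_mask` are assumptions made for illustration and may differ from the criterion in the paper or the released code.

```python
def count_sign_flips(weight_history):
    """Count, per weight, how often its sign changed between consecutive steps.

    weight_history: list of equally long lists of floats, one per training step.
    A transition through exactly zero is not counted as a flip.
    """
    n = len(weight_history[0])
    flips = [0] * n
    for prev, curr in zip(weight_history, weight_history[1:]):
        for j in range(n):
            if prev[j] * curr[j] < 0:
                flips[j] += 1
    return flips


def prune_mask(weights, flips, sparsity):
    """Return a keep-mask that prunes the `sparsity` fraction of lowest-scoring weights.

    The score |w| / (1 + flips) is an assumed, illustrative saliency measure:
    small weights that oscillate around zero score lowest and are pruned first.
    """
    scores = [abs(w) / (1 + f) for w, f in zip(weights, flips)]
    n_prune = int(sparsity * len(weights))
    order = sorted(range(len(weights)), key=lambda j: scores[j])
    pruned = set(order[:n_prune])
    return [j not in pruned for j in range(len(weights))]
```

Because the mask is derived from a target fraction, the sketch targets a chosen sparsity level directly, which mirrors the property the abstract highlights.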

Preprint, 2020, arXiv

Improving Fair Predictions Using Variational Inference In Causal Models

The importance of algorithmic fairness grows with the increasing impact machine learning has on people's lives. Recent work on fairness metrics shows the need for causal reasoning in fairness constraints. In this work, a practical method named FairTrade is proposed for creating flexible prediction models which integrate fairness constraints on sensitive causal paths. The method uses recent advances in variational inference in order to account for unobserved confounders. Further, a method outline is proposed which uses the causal mechanism estimates to audit black box models. Experiments are conducted on simulated data and on a real dataset in the context of detecting unlawful social welfare. This research aims to contribute to machine learning techniques which honour our ethical and legal boundaries.

Preprint, 2020, arXiv

Learning Robust Representations via Multi-View Information Bottleneck

The information bottleneck principle provides an information-theoretic method for representation learning, by training an encoder to retain all information which is relevant for predicting the label while minimizing the amount of other, excess information in the representation. The original formulation, however, requires labeled data to identify the superfluous information. In this work, we extend this ability to the multi-view unsupervised setting, where two views of the same underlying entity are provided but the label is unknown. This enables us to identify superfluous information as that not shared by both views. A theoretical analysis leads to the definition of a new multi-view model that produces state-of-the-art results on the Sketchy dataset and label-limited versions of the MIR-Flickr dataset. We also extend our theory to the single-view setting by taking advantage of standard data augmentation techniques, empirically showing better generalization capabilities when compared to common unsupervised approaches for representation learning.

Preprint, 2020, arXiv

Neural Ordinary Differential Equations on Manifolds

Normalizing flows are a powerful technique for obtaining reparameterizable samples from complex multimodal distributions. Unfortunately, current approaches fall short when the underlying space has a non-trivial topology, and are only available for the most basic geometries. Recently, normalizing flows in Euclidean space based on Neural ODEs have shown great promise, yet they suffer from the same limitations. Using ideas from differential geometry and geometric control theory, we describe how neural ODEs can be extended to smooth manifolds. We show how vector fields provide a general framework for parameterizing a flexible class of invertible mappings on these spaces and we illustrate how gradient-based learning can be performed. As a result, we define a general methodology for building normalizing flows on manifolds.

Preprint, 2020, arXiv

Pruning via Iterative Ranking of Sensitivity Statistics

With the introduction of SNIP [arXiv:1810.02340v2], it has been demonstrated that modern neural networks can effectively be pruned before training. Yet, its sensitivity criterion has since been criticized for not propagating training signal properly or even disconnecting layers. As a remedy, GraSP [arXiv:2002.07376v1] was introduced, compromising on simplicity. However, in this work we show that by applying the sensitivity criterion iteratively in smaller steps, still before training, we can improve its performance without complicating the implementation. As such, we introduce 'SNIP-it'. We then demonstrate how it can be applied for both structured and unstructured pruning, before and/or during training, thereby achieving state-of-the-art sparsity-performance trade-offs while already providing the computational benefits of pruning from the start of the training process. Furthermore, we evaluate our methods on robustness to overfitting, disconnection and adversarial attacks.

Preprint, 2019, arXiv

Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias

We prove the main rules of causal calculus (also called do-calculus) for i/o structural causal models (ioSCMs), a generalization of a recently proposed general class of non-/linear structural causal models that allow for cycles, latent confounders and arbitrary probability distributions. We also generalize adjustment criteria and formulas from the acyclic setting to the general one (i.e. ioSCMs). Such criteria then allow one to estimate (conditional) causal effects from observational data that was (partially) gathered under selection bias and cycles. This generalizes the backdoor criterion, the selection-backdoor criterion and extensions of these to arbitrary ioSCMs. Together, our results thus enable causal reasoning in the presence of cycles, latent confounders and selection bias. Finally, we extend the ID algorithm for the identification of causal effects to ioSCMs.
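
As background for the adjustment criteria mentioned in the abstract, the classical backdoor adjustment formula for the acyclic, discrete case, P(y | do(x)) = sum over z of P(y | x, z) P(z), can be sketched as follows. This covers only the classical setting that the paper generalizes, not ioSCMs, cycles or selection bias; the function name `backdoor_adjust` and the joint-table representation are assumptions made for illustration.

```python
from collections import defaultdict


def backdoor_adjust(joint, x, y):
    """Classical backdoor adjustment on a discrete joint table (illustrative sketch).

    joint: dict mapping (x_value, y_value, z_value) -> probability.
    Assumes Z satisfies the backdoor criterion relative to (X, Y) in an
    acyclic model; that assumption is not checked here.
    Returns P(Y = y | do(X = x)) = sum_z P(y | x, z) * P(z).
    """
    p_z = defaultdict(float)   # marginal P(z)
    p_xz = defaultdict(float)  # marginal P(x, z)
    for (xi, yi, zi), p in joint.items():
        p_z[zi] += p
        p_xz[(xi, zi)] += p

    total = 0.0
    for zi, pz in p_z.items():
        pxz = p_xz.get((x, zi), 0.0)
        if pxz == 0.0:
            # P(y | x, z) is undefined when (x, z) never occurs.
            raise ValueError("positivity violated: P(X=x, Z=z) = 0")
        total += joint.get((x, y, zi), 0.0) / pxz * pz
    return total
```

When Z influences both X and Y, the adjusted quantity generally differs from the plain conditional P(y | x), which is the gap that adjustment formulas are meant to close.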

Preprint, 2018, arXiv

Constraint-based Causal Discovery for Non-Linear Structural Causal Models with Cycles and Latent Confounders

We address the problem of causal discovery from data, making use of the recently proposed causal modeling framework of modular structural causal models (mSCM) to handle cycles, latent confounders and non-linearities. We introduce σ-connection graphs (σ-CG), a new class of mixed graphs (containing undirected, bidirected and directed edges) with additional structure, and extend the concept of σ-separation, the appropriate generalization of the well-known notion of d-separation in this setting, to apply to σ-CGs. We prove the closedness of σ-separation under marginalisation and conditioning and exploit this to implement a test of σ-separation on a σ-CG. This then leads us to the first causal discovery algorithm that can handle non-linear functional relations, latent confounders, cyclic causal relationships, and data from different (stochastic) perfect interventions. As a proof of concept, we show on synthetic data how well the algorithm recovers features of the causal graph of modular structural causal models.