Source author record

Marc Mézard

Marc Mézard appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

cond-mat.stat-mech cond-mat.dis-nn Information Theory math.IT Machine Learning math.PR Computational Complexity math.ST Statistics Theory physics.soc-ph Populations and Evolution Social and Information Networks cond-mat Genomics hep-th math.CO Neurons and Cognition Numerical Analysis Quantitative Methods

Catalog footprint

What is connected

23works

19topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2025arXiv

Dynamical Learning in Deep Asymmetric Recurrent Neural Networks

We investigate recurrent neural networks with asymmetric interactions and demonstrate that the inclusion of self-couplings or sparse excitatory inter-module connections leads to the emergence of a densely connected manifold of dynamically accessible stable configurations. This representation manifold is exponentially large in system size and is reachable through simple local dynamics, despite constituting a subdominant subset of the global configuration space. We further show that learning can be implemented directly on this structure via a fully local, gradient-free mechanism that selectively stabilizes a single task-relevant network configuration. Unlike error-driven or contrastive learning schemes, this approach does not require explicit comparisons between network states obtained with and without output supervision. Instead, transient supervisory signals bias the dynamics toward the representation manifold, after which local plasticity consolidates the attained configuration, effectively shaping the latent representation space. Numerical evaluations on standard image classification benchmarks indicate performance comparable to that of multilayer perceptrons trained using backpropagation. More generally, these results suggest that the dynamical accessibility of fixed points and the stabilization of internal network dynamics constitute viable alternative principles for learning in recurrent systems, with conceptual links to statistical physics and potential implications for biologically motivated and neuromorphic computing architectures.

preprint2022arXiv

Perturbative construction of mean-field equations in extensive-rank matrix factorization and denoising

Factorization of matrices where the rank of the two factors diverges linearly with their sizes has many applications in diverse areas such as unsupervised representation learning, dictionary learning or sparse coding. We consider a setting where the two factors are generated from known component-wise independent prior distributions, and the statistician observes a (possibly noisy) component-wise function of their matrix product. In the limit where the dimensions of the matrices tend to infinity, but their ratios remain fixed, we expect to be able to derive closed form expressions for the optimal mean squared error on the estimation of the two factors. However, this remains a very involved mathematical and algorithmic problem. A related, but simpler, problem is extensive-rank matrix denoising, where one aims to reconstruct a matrix with extensive but usually small rank from noisy measurements. In this paper, we approach both these problems using high-temperature expansions at fixed order parameters. This allows to clarify how previous attempts at solving these problems failed at finding an asymptotically exact solution. We provide a systematic way to derive the corrections to these existing approximations, taking into account the structure of correlations particular to the problem. Finally, we illustrate our approach in detail on the case of extensive-rank matrix denoising. We compare our results with known optimal rotationally-invariant estimators, and show how exact asymptotic calculations of the minimal error can be performed using extensive-rank matrix integrals.

preprint2021arXiv

The Gaussian equivalence of generative models for learning with shallow neural networks

Understanding the impact of data structure on the computational tractability of learning is a key challenge for the theory of neural networks. Many theoretical works do not explicitly model training data, or assume that inputs are drawn component-wise independently from some simple probability distribution. Here, we go beyond this simple paradigm by studying the performance of neural networks trained on data drawn from pre-trained generative models. This is possible due to a Gaussian equivalence stating that the key metrics of interest, such as the training and test errors, can be fully captured by an appropriately chosen Gaussian model. We provide three strands of rigorous, analytical and numerical evidence corroborating this equivalence. First, we establish rigorous conditions for the Gaussian equivalence to hold in the case of single-layer generative models, as well as deterministic rates for convergence in distribution. Second, we leverage this equivalence to derive a closed set of equations describing the generalisation performance of two widely studied machine learning problems: two-layer neural networks trained using one-pass stochastic gradient descent, and full-batch pre-learned features or kernel methods. Finally, we perform experiments demonstrating how our theory applies to deep, pre-trained generative models. These results open a viable path to the theoretical study of machine learning models with realistic data.

preprint2020arXiv

Generalisation error in learning with random features and the hidden manifold model

We study generalised linear regression and classification for a synthetically generated dataset encompassing different problems of interest, such as learning with random features, neural networks in the lazy training regime, and the hidden manifold model. We consider the high-dimensional regime and using the replica method from statistical physics, we provide a closed-form expression for the asymptotic generalisation performance in these problems, valid in both the under- and over-parametrised regimes and for a broad choice of generalised linear model loss functions. In particular, we show how to obtain analytically the so-called double descent behaviour for logistic regression with a peak at the interpolation threshold, we illustrate the superiority of orthogonal against random Gaussian projections in learning with random features, and discuss the role played by correlations in the data generated by the hidden manifold model. Beyond the interest in these particular problems, the theoretical formalism introduced in this manuscript provides a path to further extensions to more complex tasks.

preprint2020arXiv

High-temperature Expansions and Message Passing Algorithms

Improved mean-field technics are a central theme of statistical physics methods applied to inference and learning. We revisit here some of these methods using high-temperature expansions for disordered systems initiated by Plefka, Georges and Yedidia. We derive the Gibbs free entropy and the subsequent self-consistent equations for a generic class of statistical models with correlated matrices and show in particular that many classical approximation schemes, such as adaptive TAP, Expectation-Consistency, or the approximations behind the Vector Approximate Message Passing algorithm all rely on the same assumptions, that are also at the heart of high-temperature expansions. We focus on the case of rotationally invariant random coupling matrices in the `high-dimensional' limit in which the number of samples and the dimension are both large, but with a fixed ratio. This encapsulates many widely studied models, such as Restricted Boltzmann Machines or Generalized Linear Models with correlated data matrices. In this general setting, we show that all the approximation schemes described before are equivalent, and we conjecture that they are exact in the thermodynamic limit in the replica symmetric phases. We achieve this conclusion by resummation of the infinite perturbation series, which generalizes a seminal result of Parisi and Potters. A rigorous derivation of this conjecture is an interesting mathematical challenge. On the way to these conclusions, we uncover several diagrammatical results in connection with free probability and random matrix theory, that are interesting independently of the rest of our work.

preprint2017arXiv

Multi-Layer Generalized Linear Estimation

We consider the problem of reconstructing a signal from multi-layered (possibly) non-linear measurements. Using non-rigorous but standard methods from statistical physics we present the Multi-Layer Approximate Message Passing (ML-AMP) algorithm for computing marginal probabilities of the corresponding estimation problem and derive the associated state evolution equations to analyze its performance. We also give the expression of the asymptotic free energy and the minimal information-theoretically achievable reconstruction error. Finally, we present some applications of this measurement model for compressed sensing and perceptron learning with structured matrices/patterns, and for a simple model of estimation of latent variables in an auto-encoder.

preprint2016arXiv

Phase transitions and sample complexity in Bayes-optimal matrix factorization

We analyse the matrix factorization problem. Given a noisy measurement of a product of two matrices, the problem is to estimate back the original matrices. It arises in many applications such as dictionary learning, blind matrix calibration, sparse principal component analysis, blind source separation, low rank matrix completion, robust principal component analysis or factor analysis. It is also important in machine learning: unsupervised representation learning can often be studied through matrix factorization. We use the tools of statistical mechanics - the cavity and replica methods - to analyze the achievability and computational tractability of the inference problems in the setting of Bayes-optimal inference, which amounts to assuming that the two matrices have random independent elements generated from some known distribution, and this information is available to the inference algorithm. In this setting, we compute the minimal mean-squared-error achievable in principle in any computational time, and the error that can be achieved by an efficient approximate message passing algorithm. The computation is based on the asymptotic state-evolution analysis of the algorithm. The performance that our analysis predicts, both in terms of the achieved mean-squared-error, and in terms of sample complexity, is extremely promising and motivating for a further development of the algorithm.

preprint2015arXiv

Dynamic message-passing equations for models with unidirectional dynamics

Understanding and quantifying the dynamics of disordered out-of-equilibrium models is an important problem in many branches of science. Using the dynamic cavity method on time trajectories, we construct a general procedure for deriving the dynamic message-passing equations for a large class of models with unidirectional dynamics, which includes the zero-temperature random field Ising model, the susceptible-infected-recovered model, and rumor spreading models. We show that unidirectionality of the dynamics is the key ingredient that makes the problem solvable. These equations are applicable to single instances of the corresponding problems with arbitrary initial conditions, and are asymptotically exact for problems defined on locally tree-like graphs. When applied to real-world networks, they generically provide a good analytic approximation of the real dynamics.

preprint2014arXiv

Cavity Method: Message Passing from a Physics Perspective

In this three-sections lecture cavity method is introduced as heuristic framework from a Physics perspective to solve probabilistic graphical models and it is presented both at the replica symmetric (RS) and 1-step replica symmetry breaking (1RSB) level. This technique has been applied with success on a wide range of models and problems such as spin glasses, random constrain satisfaction problems (rCSP), error correcting codes etc. Firstly, the RS cavity solution for Sherrington-Kirkpatrick model---a fully connected spin glass model---is derived and its equivalence to the RS solution obtained using replicas is discussed. Then, the general cavity method for diluted graphs is illustrated both at RS and 1RSB level. The latter was a significant breakthrough in the last decade and has direct applications to rCSP. Finally, as example of an actual problem, K-SAT is investigated using belief and survey propagation.

preprint2014arXiv

Inferring the origin of an epidemic with a dynamic message-passing algorithm

We study the problem of estimating the origin of an epidemic outbreak -- given a contact network and a snapshot of epidemic spread at a certain time, determine the infection source. Finding the source is important in different contexts of computer or social networks. We assume that the epidemic spread follows the most commonly used susceptible-infected-recovered model. We introduce an inference algorithm based on dynamic message-passing equations, and we show that it leads to significant improvement of performance compared to existing approaches. Importantly, this algorithm remains efficient in the case where one knows the state of only a fraction of nodes.

preprint2014arXiv

Reweighted belief propagation and quiet planting for random K-SAT

We study the random K-satisfiability problem using a partition function where each solution is reweighted according to the number of variables that satisfy every clause. We apply belief propagation and the related cavity method to the reweighted partition function. This allows us to obtain several new results on the properties of random K-satisfiability problem. In particular the reweighting allows to introduce a planted ensemble that generates instances that are, in some region of parameters, equivalent to random instances. We are hence able to generate at the same time a typical random SAT instance and one of its solutions. We study the relation between clustering and belief propagation fixed points and we give a direct evidence for the existence of purely entropic (rather than energetic) barriers between clusters in some region of parameters in the random K-satisfiability problem. We exhibit, in some large planted instances, solutions with a non-trivial whitening core; such solutions were known to exist but were so far never found on very large instances. Finally, we discuss algorithmic hardness of such planted instances and we determine a region of parameters in which planting leads to satisfiable benchmarks that, up to our knowledge, are the hardest known.

preprint2013arXiv

Compressed Sensing under Matrix Uncertainty: Optimum Thresholds and Robust Approximate Message Passing

In compressed sensing one measures sparse signals directly in a compressed form via a linear transform and then reconstructs the original signal. However, it is often the case that the linear transform itself is known only approximately, a situation called matrix uncertainty, and that the measurement process is noisy. Here we present two contributions to this problem: first, we use the replica method to determine the mean-squared error of the Bayes-optimal reconstruction of sparse signals under matrix uncertainty. Second, we consider a robust variant of the approximate message passing algorithm and demonstrate numerically that in the limit of large systems, this algorithm matches the optimal performance in a large region of parameters.

preprint2013arXiv

Non-adaptive pooling strategies for detection of rare faulty items

We study non-adaptive pooling strategies for detection of rare faulty items. Given a binary sparse N-dimensional signal x, how to construct a sparse binary MxN pooling matrix F such that the signal can be reconstructed from the smallest possible number M of measurements y=Fx? We show that a very low number of measurements is possible for random spatially coupled design of pools F. Our design might find application in genetic screening or compressed genotyping. We show that our results are robust with respect to the uncertainty in the matrix F when some elements are mistaken.

preprint2013arXiv

Phase Diagram and Approximate Message Passing for Blind Calibration and Dictionary Learning

We consider dictionary learning and blind calibration for signals and matrices created from a random ensemble. We study the mean-squared error in the limit of large signal dimension using the replica method and unveil the appearance of phase transitions delimiting impossible, possible-but-hard and possible inference regions. We also introduce an approximate message passing algorithm that asymptotically matches the theoretical performance, and show through numerical tests that it performs very well, for the calibration problem, for tractable system sizes.

preprint2012arXiv

Emergence of clones in sexual populations

In sexual population, recombination reshuffles genetic variation and produces novel combinations of existing alleles, while selection amplifies the fittest genotypes in the population. If recombination is more rapid than selection, populations consist of a diverse mixture of many genotypes, as is observed in many populations. In the opposite regime, which is realized for example in the facultatively sexual populations that outcross in only a fraction of reproductive cycles, selection can amplify individual genotypes into large clones. Such clones emerge when the fitness advantage of some of the genotypes is large enough that they grow to a significant fraction of the population despite being broken down by recombination. The occurrence of this "clonal condensation" depends, in addition to the outcrossing rate, on the heritability of fitness. Clonal condensation leads to a strong genetic heterogeneity of the population which is not adequately described by traditional population genetics measures, such as Linkage Disequilibrium. Here we point out the similarity between clonal condensation and the freezing transition in the Random Energy Model of spin glasses. Guided by this analogy we explicitly calculate the probability, Y, that two individuals are genetically identical as a function of the key parameters of the model. While Y is the analog of the spin-glass order parameter, it is also closely related to rate of coalescence in population genetics: Two individuals that are part of the same clone have a recent common ancestor.

preprint2012arXiv

Probabilistic Reconstruction in Compressed Sensing: Algorithms, Phase Diagrams, and Threshold Achieving Matrices

Compressed sensing is a signal processing method that acquires data directly in a compressed form. This allows one to make less measurements than what was considered necessary to record a signal, enabling faster or more precise measurement protocols in a wide range of applications. Using an interdisciplinary approach, we have recently proposed in [arXiv:1109.4424] a strategy that allows compressed sensing to be performed at acquisition rates approaching to the theoretical optimal limits. In this paper, we give a more thorough presentation of our approach, and introduce many new results. We present the probabilistic approach to reconstruction and discuss its optimality and robustness. We detail the derivation of the message passing algorithm for reconstruction and expectation max- imization learning of signal-model parameters. We further develop the asymptotic analysis of the corresponding phase diagrams with and without measurement noise, for different distribution of signals, and discuss the best possible reconstruction performances regardless of the algorithm. We also present new efficient seeding matrices, test them on synthetic data and analyze their performance asymptotically.

preprint2012arXiv

Statistical physics-based reconstruction in compressed sensing

Compressed sensing is triggering a major evolution in signal acquisition. It consists in sampling a sparse signal at low rate and later using computational power for its exact reconstruction, so that only the necessary information is measured. Currently used reconstruction techniques are, however, limited to acquisition rates larger than the true density of the signal. We design a new procedure which is able to reconstruct exactly the signal with a number of measurements that approaches the theoretical limit in the limit of large systems. It is based on the joint use of three essential ingredients: a probabilistic approach to signal reconstruction, a message-passing algorithm adapted from belief propagation, and a careful design of the measurement matrix inspired from the theory of crystal nucleation. The performance of this new algorithm is analyzed by statistical physics methods. The obtained improvement is confirmed by numerical studies of several cases.

preprint2011arXiv

Linearization effect in multifractal analysis: Insights from the Random Energy Model

The analysis of the linearization effect in multifractal analysis, and hence of the estimation of moments for multifractal processes, is revisited borrowing concepts from the statistical physics of disordered systems, notably from the analysis of the so-called Random Energy Model. Considering a standard multifractal process (compound Poisson motion), chosen as a simple representative example, we show: i) the existence of a critical order $q^*$ beyond which moments, though finite, cannot be estimated through empirical averages, irrespective of the sample size of the observation; ii) that multifractal exponents necessarily behave linearly in $q$, for $q > q^*$. Tayloring the analysis conducted for the Random Energy Model to that of Compound Poisson motion, we provide explicative and quantitative predictions for the values of $q^*$ and for the slope controlling the linear behavior of the multifractal exponents. These quantities are shown to be related only to the definition of the multifractal process and not to depend on the sample size of the observation. Monte-Carlo simulations, conducted over a large number of large sample size realizations of compound Poisson motion, comfort and extend these analyses.

preprint2009arXiv

Decimation flows in constraint satisfaction problems

We study hard constraint satisfaction problems with a decimation approach based on message passing algorithms. Decimation induces a renormalization flow in the space of problems, and we exploit the fact that this flow transforms some of the constraints into linear constraints over GF(2). In particular, when the flow hits the subspace of linear problems, one can stop decimation and use Gaussian elimination. We introduce a new decimation algorithm which uses this linear structure and shows a strongly improved performance with respect to the usual decimation methods on some of the hardest locked occupation problems.

preprint2009arXiv

Susceptibility Propagation for Constraint Satisfaction Problems

We study the susceptibility propagation, a message-passing algorithm to compute correlation functions. It is applied to constraint satisfaction problems and its accuracy is examined. As a heuristic method to find a satisfying assignment, we propose susceptibility-guided decimation where correlations among the variables play an important role. We apply this novel decimation to locked occupation problems, a class of hard constraint satisfaction problems exhibited recently. It is shown that the present method performs better than the standard belief-guided decimation.

preprint2006arXiv

Geometrical organization of solutions to random linear Boolean equations

The random XORSAT problem deals with large random linear systems of Boolean variables. The difficulty of such problems is controlled by the ratio of number of equations to number of variables. It is known that in some range of values of this parameter, the space of solutions breaks into many disconnected clusters. Here we study precisely the corresponding geometrical organization. In particular, the distribution of distances between these clusters is computed by the cavity method. This allows to study the `x-satisfiability' threshold, the critical density of equations where there exist two solutions at a given distance.

preprint2006arXiv

The number of matchings in random graphs

We study matchings on sparse random graphs by means of the cavity method. We first show how the method reproduces several known results about maximum and perfect matchings in regular and Erdos-Renyi random graphs. Our main new result is the computation of the entropy, i.e. the leading order of the logarithm of the number of solutions, of matchings with a given size. We derive both an algorithm to compute this entropy for an arbitrary graph with a girth that diverges in the large size limit, and an analytic result for the entropy in regular and Erdos-Renyi random graph ensembles.

preprint1995arXiv

Mode-Coupling Approximations, Glass Theory and Disordered Systems

We discuss the general link between mode-coupling like equations (which serve as the basis of some recent theories of supercooled liquids) and the dynamical equations governing mean-field spin-glass models, or the dynamics of a particle in a random potential. The physical consequences of this interrelation are underlined. It suggests to extend the mode-coupling approximation to temperatures well below the freezing temperature, in which aging effects become important. In this regime we suggest some new experiments in order to test a non-trivial prediction of the Mode-Coupling picture, which is a generalized relation between the short ($β$) and long ($α$) time regimes.

Marc Mézard

What is connected

Connect this record

See the researcher in context

Building this map preview

23 published item(s)

Dynamical Learning in Deep Asymmetric Recurrent Neural Networks

Perturbative construction of mean-field equations in extensive-rank matrix factorization and denoising

The Gaussian equivalence of generative models for learning with shallow neural networks

Generalisation error in learning with random features and the hidden manifold model

High-temperature Expansions and Message Passing Algorithms

Multi-Layer Generalized Linear Estimation

Phase transitions and sample complexity in Bayes-optimal matrix factorization

Dynamic message-passing equations for models with unidirectional dynamics

Cavity Method: Message Passing from a Physics Perspective

Inferring the origin of an epidemic with a dynamic message-passing algorithm

Reweighted belief propagation and quiet planting for random K-SAT

Compressed Sensing under Matrix Uncertainty: Optimum Thresholds and Robust Approximate Message Passing

Non-adaptive pooling strategies for detection of rare faulty items

Phase Diagram and Approximate Message Passing for Blind Calibration and Dictionary Learning

Emergence of clones in sexual populations

Probabilistic Reconstruction in Compressed Sensing: Algorithms, Phase Diagrams, and Threshold Achieving Matrices

Statistical physics-based reconstruction in compressed sensing

Linearization effect in multifractal analysis: Insights from the Random Energy Model

Decimation flows in constraint satisfaction problems

Susceptibility Propagation for Constraint Satisfaction Problems

Geometrical organization of solutions to random linear Boolean equations

The number of matchings in random graphs

Mode-Coupling Approximations, Glass Theory and Disordered Systems