Source author record

Grant M. Rotskoff

Grant M. Rotskoff appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

cond-mat.stat-mech cond-mat.dis-nn Machine Learning math.OC math.PR physics.data-an

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators

Preprint, 2022, arXiv

A Dynamical Central Limit Theorem for Shallow Neural Networks

Recent theoretical works have characterized the dynamics of wide shallow neural networks trained via gradient descent in an asymptotic mean-field limit when the width tends towards infinity. At initialization, the random sampling of the parameters leads to deviations from the mean-field limit dictated by the classical Central Limit Theorem (CLT). However, since gradient descent induces correlations among the parameters, it is of interest to analyze how these fluctuations evolve. Here, we use a dynamical CLT to prove that the asymptotic fluctuations around the mean limit remain bounded in mean square throughout training. The upper bound is given by a Monte Carlo resampling error, with a variance that depends on the 2-norm of the underlying measure, which also controls the generalization error. This motivates the use of this 2-norm as a regularization term during training. Furthermore, if the mean-field dynamics converges to a measure that interpolates the training data, we prove that the asymptotic deviation eventually vanishes in the CLT scaling. We also complement these results with numerical experiments.
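The classical CLT scaling at initialization mentioned in this abstract can be illustrated numerically. The sketch below is not the paper's experiment; it assumes a hypothetical one-dimensional network f_n(x) = (1/n) Σ c_i tanh(a_i x) with independent standard normal parameters, for which the mean-field output at initialization is zero. If fluctuations follow the CLT, the variance of f_n(x) across random initializations, multiplied by the width n, should be roughly independent of n.

```python
import numpy as np

def init_outputs(n_width, n_trials, x=1.0, seed=0):
    """Outputs f_n(x) of n_trials independently initialized networks.

    Assumed (hypothetical) model: f_n(x) = mean_i c_i * tanh(a_i * x),
    with a_i and c_i drawn i.i.d. from a standard normal distribution.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n_trials, n_width))
    c = rng.normal(size=(n_trials, n_width))
    return np.mean(c * np.tanh(a * x), axis=1)

def scaled_variance(n_width, n_trials=2000, seed=0):
    """Width times the variance of f_n(x) across initializations."""
    return float(n_width * np.var(init_outputs(n_width, n_trials, seed=seed)))

if __name__ == "__main__":
    for n in (100, 400, 1600):
        # Under CLT scaling these values should be close to one another.
        print(n, scaled_variance(n, seed=n))
```

This only checks the scaling at initialization. How the fluctuations evolve under gradient descent is the subject of the paper and is not reproduced here.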

Preprint, 2022, arXiv

Adaptive Monte Carlo augmented with normalizing flows

Many problems in the physical sciences, machine learning, and statistical inference necessitate sampling from a high-dimensional, multi-modal probability distribution. Markov Chain Monte Carlo (MCMC) algorithms, the ubiquitous tool for this task, typically rely on random local updates to propagate configurations of a given system in a way that ensures that generated configurations will be distributed according to a target probability distribution asymptotically. In high-dimensional settings with multiple relevant metastable basins, local approaches require either immense computational effort or intricately designed importance sampling strategies to capture information about, for example, the relative populations of such basins. Here we analyze an adaptive MCMC that augments MCMC sampling with nonlocal transition kernels parameterized with generative models known as normalizing flows. We focus on a setting where there is no preexisting data, as is commonly the case for problems in which MCMC is used. Our method uses: (i) an MCMC strategy that blends local moves obtained from any standard transition kernel with those from a generative model to accelerate the sampling and (ii) the data generated this way to adapt the generative model and improve its efficacy in the MCMC algorithm. We provide a theoretical analysis of the convergence properties of this algorithm, and investigate numerically its efficiency, in particular in terms of its propensity to equilibrate fast between metastable modes whose rough location is known a priori but whose respective probability weight is not. We show that our algorithm can sample effectively across large free energy barriers, providing dramatic accelerations relative to traditional MCMC algorithms.
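The two ingredients (i) and (ii) in this abstract can be sketched on a toy problem. The code below is an illustration under strong assumptions, not the authors' implementation. The target is a hypothetical one-dimensional two-mode mixture. The normalizing flow is replaced by a two-component Gaussian mixture proposal whose single weight `w` is refit from the chain's own samples. The mode locations (-4 and 4) are assumed known in advance while their relative weight is not. All function and parameter names are invented for this example.

```python
import numpy as np

def log_target(x):
    # Unnormalized toy target: 0.3 * N(-4, 1) + 0.7 * N(4, 1).
    return np.logaddexp(np.log(0.3) - 0.5 * (x + 4.0) ** 2,
                        np.log(0.7) - 0.5 * (x - 4.0) ** 2)

def log_proposal(x, w, s=1.5):
    # Nonlocal proposal w * N(-4, s^2) + (1 - w) * N(4, s^2), up to a
    # constant shared by both components (it cancels in the MH ratio).
    return np.logaddexp(np.log(w) - 0.5 * ((x + 4.0) / s) ** 2,
                        np.log(1.0 - w) - 0.5 * ((x - 4.0) / s) ** 2)

def sample_proposal(rng, w, s=1.5):
    center = -4.0 if rng.random() < w else 4.0
    return center + s * rng.normal()

def adaptive_mcmc(n_steps=20000, p_nonlocal=0.1, adapt_every=500, seed=0):
    rng = np.random.default_rng(seed)
    x, w = -4.0, 0.5  # start in the left mode with an uninformed weight
    samples = np.empty(n_steps)
    for t in range(n_steps):
        if rng.random() < p_nonlocal:
            # (i) nonlocal move: independence Metropolis-Hastings step.
            y = sample_proposal(rng, w)
            log_alpha = (log_target(y) - log_target(x)
                         + log_proposal(x, w) - log_proposal(y, w))
        else:
            # (i) local move: symmetric random-walk Metropolis step.
            y = x + 0.5 * rng.normal()
            log_alpha = log_target(y) - log_target(x)
        if np.log(rng.random()) < log_alpha:
            x = y
        samples[t] = x
        if (t + 1) % adapt_every == 0:
            # (ii) adapt the proposal from the samples generated so far;
            # clamping keeps both modes reachable by the proposal.
            frac_left = float(np.mean(samples[: t + 1] < 0.0))
            w = min(max(frac_left, 0.05), 0.95)
    return samples, w

if __name__ == "__main__":
    samples, w = adaptive_mcmc()
    print("estimated right-mode weight:", np.mean(samples > 0.0))
    print("adapted proposal weight for left mode:", w)
```

Because every move is accepted or rejected with a Metropolis-Hastings test against the target, a poorly adapted proposal lowers efficiency rather than biasing each individual step. The convergence of the adaptive scheme as a whole is what the paper analyzes, and this sketch makes no claim about it.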

Preprint, 2022, arXiv

Learning nonequilibrium control forces to characterize dynamical phase transitions

Sampling the collective, dynamical fluctuations that lead to nonequilibrium pattern formation requires probing rare regions of trajectory space. Recent approaches to this problem based on importance sampling, cloning, and spectral approximations have yielded significant insight into nonequilibrium systems, but tend to scale poorly with the size of the system, especially near dynamical phase transitions. Here we propose a machine learning algorithm that samples rare trajectories and estimates the associated large deviation functions using a many-body control force by leveraging the flexible function representation provided by deep neural networks, importance sampling in trajectory space, and stochastic optimal control theory. We show that this approach scales to hundreds of interacting particles and remains robust at dynamical phase transitions.

Preprint, 2016, arXiv

Near-optimal protocols in complex nonequilibrium transformations

The development of sophisticated experimental means to control nanoscale systems has motivated efforts to design driving protocols which minimize the energy dissipated to the environment. Computational models are a crucial tool in this practical challenge. We describe a general method for sampling an ensemble of finite-time, nonequilibrium protocols biased towards a low average dissipation. We show that this scheme can be carried out very efficiently in several limiting cases. As an application, we sample the ensemble of low-dissipation protocols that invert the magnetization of a 2D Ising model and explore how the diversity of the protocols varies in response to constraints on the average dissipation. In this example, we find that there is a large set of protocols with average dissipation close to the optimal value, which we argue is a general phenomenon.

Preprint, 2014, arXiv

Efficiency and Large Deviations in Time-Asymmetric Stochastic Heat Engines

In a stochastic heat engine driven by a cyclic non-equilibrium protocol, fluctuations in work and heat give rise to a fluctuating efficiency. Using computer simulations and tools from large deviation theory, we have examined these fluctuations in detail for a model two-state engine. We find in general that the form of efficiency probability distributions is similar to those described by Verley et al. [2014 Nat Comm, 5 4721], in particular featuring a local minimum in the long-time limit. In contrast to the time-symmetric engine protocols studied previously, however, this minimum need not occur at the value characteristic of a reversible Carnot engine. Furthermore, while the local minimum may reside at the global minimum of a large deviation rate function, it does not generally correspond to the least likely efficiency measured over finite time. We introduce a general approximation for the finite-time efficiency distribution, $P(η)$, based on large deviation statistics of work and heat, that remains very accurate even when $P(η)$ deviates significantly from its large deviation form.
