Source author record

Maxime Gazeau

Maxime Gazeau appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, math.AP, math.NA

Catalog footprint

What is connected

3 works

3 topics

4 close collaborators

Preprint, 2021, arXiv

Higher Order Generalization Error for First Order Discretization of Langevin Diffusion

We propose a novel approach to analyze the generalization error of discretizations of Langevin diffusion, such as stochastic gradient Langevin dynamics (SGLD). For an $ε$ tolerance of expected generalization error, it is known that a first order discretization can reach this target if we run $Ω(ε^{-1} \log (ε^{-1}) )$ iterations with $Ω(ε^{-1})$ samples. In this article, we show that with additional smoothness assumptions, even first order methods can achieve arbitrarily good runtime complexity. More precisely, for each $N>0$, we provide a sufficient smoothness condition on the loss function such that a first order discretization can reach $ε$ expected generalization error given $Ω( ε^{-1/N} \log (ε^{-1}) )$ iterations with $Ω(ε^{-1})$ samples.
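For readers unfamiliar with SGLD, the following is a minimal sketch of the standard first order (Euler-Maruyama) update with minibatch gradients. It is not taken from the paper: the function name `sgld`, the toy quadratic loss, the inverse temperature `beta`, and all numeric settings are illustrative assumptions.

```python
import numpy as np

def sgld(grad_fn, data, theta0, step, beta, n_iters, batch_size, rng):
    """Standard SGLD: a first order discretization of Langevin diffusion.

    Update: theta <- theta - step * grad + sqrt(2 * step / beta) * N(0, I),
    where grad is estimated on a random minibatch.
    """
    theta = np.array(theta0, dtype=float)
    n = len(data)
    for _ in range(n_iters):
        # Draw a minibatch without replacement and estimate the gradient on it.
        batch = data[rng.choice(n, size=batch_size, replace=False)]
        grad = grad_fn(theta, batch)
        # Inject isotropic Gaussian noise scaled by the step size and temperature.
        noise = rng.standard_normal(theta.shape)
        theta = theta - step * grad + np.sqrt(2.0 * step / beta) * noise
    return theta

# Hypothetical toy loss: f(theta) = mean over x of 0.5 * ||theta - x||^2,
# whose minibatch gradient is theta minus the minibatch mean.
def grad_fn(theta, batch):
    return theta - batch.mean(axis=0)

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=1.0, size=(1000, 2))
theta = sgld(grad_fn, data, theta0=np.zeros(2), step=0.05, beta=1e4,
             n_iters=2000, batch_size=32, rng=rng)
```

With a large `beta` the iterate should settle near the empirical mean of `data`; smaller values of `beta` spread the iterates more widely around it.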

Preprint, 2020, arXiv

An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise

The choice of batch size in a stochastic optimization algorithm plays a substantial role in both optimization and generalization. Increasing the batch size typically improves optimization but degrades generalization. To address the problem of improving generalization while maintaining optimal convergence in large-batch training, we propose to add covariance noise to the gradients. We demonstrate that the learning performance of our method is more accurately captured by the structure of the covariance matrix of the noise than by the variance of the gradients. Moreover, in the convex-quadratic setting, we prove that it can be characterized by the Frobenius norm of the noise matrix. Our empirical studies with standard deep learning model architectures and datasets show that our method not only improves generalization performance in large-batch training, but does so while keeping optimization performance desirable and without lengthening training.

Preprint, 2013, arXiv

Strong order of convergence of a semidiscrete scheme for the stochastic Manakov equation

It is well accepted by physicists that the Manakov PMD equation is a good model for the evolution of nonlinear electric fields in optical fibers with randomly varying birefringence. In the regime of the diffusion approximation theory, an effective asymptotic dynamics has recently been obtained to describe this evolution; the resulting equation is called the stochastic Manakov equation. In this article, we propose a semidiscrete version of a Crank-Nicolson scheme for this limit equation and analyze the strong error. Assuming sufficient regularity of the initial data, we prove that the numerical scheme has strong order 1/2.