Source author record

Zhengdao Chen

Zhengdao Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning math.OC math.PR math.NA Numerical Analysis

Catalog footprint

What is connected

5works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

A Dynamical Central Limit Theorem for Shallow Neural Networks

Recent theoretical works have characterized the dynamics of wide shallow neural networks trained via gradient descent in an asymptotic mean-field limit when the width tends towards infinity. At initialization, the random sampling of the parameters leads to deviations from the mean-field limit dictated by the classical Central Limit Theorem (CLT). However, since gradient descent induces correlations among the parameters, it is of interest to analyze how these fluctuations evolve. Here, we use a dynamical CLT to prove that the asymptotic fluctuations around the mean limit remain bounded in mean square throughout training. The upper bound is given by a Monte-Carlo resampling error, with a variance that that depends on the 2-norm of the underlying measure, which also controls the generalization error. This motivates the use of this 2-norm as a regularization term during training. Furthermore, if the mean-field dynamics converges to a measure that interpolates the training data, we prove that the asymptotic deviation eventually vanishes in the CLT scaling. We also complement these results with numerical experiments.

preprint2022arXiv

On Feature Learning in Neural Networks with Global Convergence Guarantees

We study the optimization of wide neural networks (NNs) via gradient flow (GF) in setups that allow feature learning while admitting non-asymptotic global convergence guarantees. First, for wide shallow NNs under the mean-field scaling and with a general class of activation functions, we prove that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF. Building upon this analysis, we study a model of wide multi-layer NNs whose second-to-last layer is trained via GF, for which we also prove a linear-rate convergence of the training loss to zero, but regardless of the input dimension. We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.

preprint2020arXiv

Structure-preserving numerical integrators for Hodgkin-Huxley-type systems

Motivated by the Hodgkin-Huxley model of neuronal dynamics, we study explicit numerical integrators for "conditionally linear" systems of ordinary differential equations. We show that splitting and composition methods, when applied to the Van der Pol oscillator and to the Hodgkin-Huxley model, do a better job of preserving limit cycles of these systems for large time steps, compared with the "Euler-type" methods (including Euler's method, exponential Euler, and semi-implicit Euler) commonly used in computational neuroscience, with no increase in computational cost. These limit cycles are important to preserve, due to their role in neuronal spiking. Splitting methods even compare favorably to the explicit exponential midpoint method, which is twice as expensive per step. The second-order Strang splitting method is seen to perform especially well across a range of non-stiff and stiff dynamics.

preprint2020arXiv

Supervised Community Detection with Line Graph Neural Networks

Traditionally, community detection in graphs can be solved using spectral methods or posterior inference under probabilistic graphical models. Focusing on random graph families such as the stochastic block model, recent research has unified both approaches and identified both statistical and computational detection thresholds in terms of the signal-to-noise ratio. By recasting community detection as a node-wise classification problem on graphs, we can also study it from a learning perspective. We present a novel family of Graph Neural Networks (GNNs) for solving community detection problems in a supervised learning setting. We show that, in a data-driven manner and without access to the underlying generative models, they can match or even surpass the performance of the belief propagation algorithm on binary and multi-class stochastic block models, which is believed to reach the computational threshold. In particular, we propose to augment GNNs with the non-backtracking operator defined on the line graph of edge adjacencies. Our models also achieve good performance on real-world datasets. In addition, we perform the first analysis of the optimization landscape of training linear GNNs for community detection problems, demonstrating that under certain simplifications and assumptions, the loss values at local and global minima are not far apart.

preprint2020arXiv

Symplectic Recurrent Neural Networks

We propose Symplectic Recurrent Neural Networks (SRNNs) as learning algorithms that capture the dynamics of physical systems from observed trajectories. An SRNN models the Hamiltonian function of the system by a neural network and furthermore leverages symplectic integration, multiple-step training and initial state optimization to address the challenging numerical issues associated with Hamiltonian systems. We show that SRNNs succeed reliably on complex and noisy Hamiltonian systems. We also show how to augment the SRNN integration scheme in order to handle stiff dynamical systems such as bouncing billiards.

Zhengdao Chen

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

A Dynamical Central Limit Theorem for Shallow Neural Networks

On Feature Learning in Neural Networks with Global Convergence Guarantees

Structure-preserving numerical integrators for Hodgkin-Huxley-type systems

Supervised Community Detection with Line Graph Neural Networks

Symplectic Recurrent Neural Networks