Source author record

Yuanhan Hu

Yuanhan Hu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning math.OC math.ST physics.app-ph physics.optics Statistics Theory

Catalog footprint

What is connected

3works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Demonstration of broadband topological slow light

Slow-light devices are able to significantly enhance light-matter interaction due to the reduced group velocity of light, but a very low group velocity is usually achieved in a narrow bandwidth, accompanied by extreme sensitivity to imperfections that causes increased disorder-induced attenuation. Recent theories have suggested an ideal solution to this problem - unidirectional chiral photonic states, previously discovered in structures known as photonic topological insulators, not only resist backscattering from imperfections but can also be slowed down in the entire topological bandgap with multiple windings in the Brillouin zone. Here, we report on the experimental demonstration of broadband topological slow light in a photonic topological insulator. When coupled with periodic resonators that form flat bands, the chiral photonic states can wind many times around the Brillouin zone, achieving an ultra-low group velocity in the entire topological bandgap. This demonstration extends the scope of topological photonics into slow light engineering and opens a unique avenue in the dispersion manipulation of chiral photonic states.

preprint2021arXiv

Fractional moment-preserving initialization schemes for training deep neural networks

A traditional approach to initialization in deep neural networks (DNNs) is to sample the network weights randomly for preserving the variance of pre-activations. On the other hand, several studies show that during the training process, the distribution of stochastic gradients can be heavy-tailed especially for small batch sizes. In this case, weights and therefore pre-activations can be modeled with a heavy-tailed distribution that has an infinite variance but has a finite (non-integer) fractional moment of order $s$ with $s<2$. Motivated by this fact, we develop initialization schemes for fully connected feed-forward networks that can provably preserve any given moment of order $s \in (0, 2]$ over the layers for a class of activations including ReLU, Leaky ReLU, Randomized Leaky ReLU, and linear activations. These generalized schemes recover traditional initialization schemes in the limit $s \to 2$ and serve as part of a principled theory for initialization. For all these schemes, we show that the network output admits a finite almost sure limit as the number of layers grows, and the limit is heavy-tailed in some settings. This sheds further light into the origins of heavy tail during signal propagation in DNNs. We prove that the logarithm of the norm of the network outputs, if properly scaled, will converge to a Gaussian distribution with an explicit mean and variance we can compute depending on the activation used, the value of s chosen and the network width. We also prove that our initialization scheme avoids small network output values more frequently compared to traditional approaches. Furthermore, the proposed initialization strategy does not have an extra cost during the training procedure. We show through numerical experiments that our initialization can improve the training and test performance.

preprint2020arXiv

Non-Convex Optimization via Non-Reversible Stochastic Gradient Langevin Dynamics

Stochastic Gradient Langevin Dynamics (SGLD) is a powerful algorithm for optimizing a non-convex objective, where a controlled and properly scaled Gaussian noise is added to the stochastic gradients to steer the iterates towards a global minimum. SGLD is based on the overdamped Langevin diffusion which is reversible in time. By adding an anti-symmetric matrix to the drift term of the overdamped Langevin diffusion, one gets a non-reversible diffusion that converges to the same stationary distribution with a faster convergence rate. In this paper, we study the non reversible Stochastic Gradient Langevin Dynamics (NSGLD) which is based on discretization of the non-reversible Langevin diffusion. We provide finite-time performance bounds for the global convergence of NSGLD for solving stochastic non-convex optimization problems. Our results lead to non-asymptotic guarantees for both population and empirical risk minimization problems. Numerical experiments for Bayesian independent component analysis and neural network models show that NSGLD can outperform SGLD with proper choices of the anti-symmetric matrix.