Source author record

Jiaoyang Huang

Jiaoyang Huang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.PR, math-ph, math.CO, math.MP, Machine Learning, math.NA, math.OC, Numerical Analysis, Artificial Intelligence, Computer Vision, math.RT, Neural and Evolutionary Computing

Catalog footprint

What is connected

12 works

12 topics

4 close collaborators

preprint, 2022, arXiv

Efficient Derivative-free Bayesian Inference for Large-Scale Inverse Problems

We consider Bayesian inference for large-scale inverse problems, where computational challenges arise from the need for repeated evaluations of an expensive forward model. This renders most Markov chain Monte Carlo approaches infeasible, since they typically require $O(10^4)$ model runs, or more. Moreover, the forward model is often given as a black box or is impractical to differentiate. Therefore, derivative-free algorithms are highly desirable. We propose a framework, which is built on Kalman methodology, to efficiently perform Bayesian inference in such inverse problems. The basic method is based on an approximation of the filtering distribution of a novel mean-field dynamical system into which the inverse problem is embedded as an observation operator. Theoretical properties of the mean-field model are established for linear inverse problems, demonstrating that the desired Bayesian posterior is given by the steady state of the law of the filtering distribution of the mean-field dynamical system, and proving exponential convergence to it. This suggests that, for nonlinear problems which are close to Gaussian, sequentially computing this law provides the basis for efficient iterative methods to approximate the Bayesian posterior. Ensemble methods are applied to obtain interacting particle system approximations of the filtering distribution of the mean-field model; and practical strategies to further reduce the computational and memory cost of the methodology are presented, including low-rank approximation and a bi-fidelity approach. The effectiveness of the framework is demonstrated in several numerical experiments, including proof-of-concept linear/nonlinear examples and two large-scale applications: learning of permeability parameters in subsurface flow; and learning subgrid-scale parameters in a global climate model from time-averaged statistics.
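As background for the Kalman methodology mentioned in the abstract above, the following minimal sketch shows the standard fact that, for a scalar linear inverse problem with Gaussian prior and Gaussian noise, a single Kalman update reproduces the Bayesian posterior exactly. It is not the mean-field dynamical system or the ensemble method of the paper; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: scalar linear-Gaussian inverse problem.
# This is NOT the paper's mean-field scheme; all names and values are hypothetical.

def kalman_update(m0, c0, g, gamma, y):
    """One Kalman update of the prior N(m0, c0) given y = g*theta + noise,
    where noise ~ N(0, gamma). No derivative of the forward map is needed
    beyond the linear coefficient g itself."""
    gain = c0 * g / (g * c0 * g + gamma)   # Kalman gain
    mean = m0 + gain * (y - g * m0)        # correct the mean with the data misfit
    cov = (1.0 - gain * g) * c0            # shrink the variance
    return mean, cov

def bayes_posterior(m0, c0, g, gamma, y):
    """Closed-form Gaussian posterior obtained by completing the square."""
    precision = 1.0 / c0 + g * g / gamma
    cov = 1.0 / precision
    mean = cov * (m0 / c0 + g * y / gamma)
    return mean, cov

m_k, c_k = kalman_update(m0=0.0, c0=4.0, g=2.0, gamma=0.5, y=3.0)
m_b, c_b = bayes_posterior(m0=0.0, c0=4.0, g=2.0, gamma=0.5, y=3.0)
```

In this linear setting the two (mean, variance) pairs agree up to rounding, which is the elementary reason Kalman-type updates are a natural building block for approximating posteriors that are close to Gaussian.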

preprint, 2022, arXiv

Robustness Implies Generalization via Data-Dependent Generalization Bounds

This paper proves that robustness implies generalization via data-dependent generalization bounds. As a result, robustness and generalization are shown to be connected closely in a data-dependent manner. Our bounds improve previous bounds in two directions, to solve an open problem that has seen little development since 2010. The first is to reduce the dependence on the covering number. The second is to remove the dependence on the hypothesis space. We present several examples, including ones for lasso and deep learning, in which our bounds are provably preferable. The experiments on real-world data and theoretical models demonstrate near-exponential improvements in various situations. To achieve these improvements, we do not require additional assumptions on the unknown distribution; instead, we only incorporate an observable and computable property of the training samples. A key technical innovation is an improved concentration bound for multinomial random variables that is of independent interest beyond robustness and generalization.

preprint, 2021, arXiv

$β$-Nonintersecting Poisson Random Walks: Law of Large Numbers and Central Limit Theorems

We study the $β$ analogue of the nonintersecting Poisson random walks. We derive a stochastic differential equation of the Stieltjes transform of the empirical measure process, which can be viewed as a dynamical version of Nekrasov's equation in [7, Section 4]. We find that the empirical measure process converges weakly in the space of càdlàg measure-valued processes to a deterministic process, characterized by the quantized free convolution, as introduced in [11]. For suitable initial data, we prove that the rescaled empirical measure process converges weakly in the space of distributions acting on analytic test functions to a Gaussian process. The means and the covariances are universal, and coincide with those of $β$-Dyson Brownian motions with the initial data constructed by the Markov-Krein correspondence. In particular, the covariance structure can be described in terms of the Gaussian Free Field. Our proof relies on integrable features of the generators of the $β$-nonintersecting Poisson random walks, the method of characteristics, and a coupling technique for Poisson random walks.

preprint, 2021, arXiv

Improve Unscented Kalman Inversion With Low-Rank Approximation and Reduced-Order Model

The unscented Kalman inversion (UKI) presented in [1] is a general derivative-free approach to solving the inverse problem. UKI is particularly suitable for inverse problems where the forward model is given as a black box and may not be differentiable. The regularization strategy and convergence property of the UKI are thoroughly studied, and the method is shown to handle noisy observation data and solve chaotic inverse problems effectively. In this paper, we aim to make the UKI more efficient in terms of computational and memory costs for large-scale inverse problems. We take advantage of the low-rank covariance structure to reduce the number of forward problem evaluations and the memory cost related to the need to propagate large covariance matrices. We also leverage reduced-order model techniques to further speed up these forward evaluations. The effectiveness of the enhanced UKI is demonstrated on a barotropic model inverse problem with $O(10^5)$ unknown parameters and a 3D generalized circulation model (GCM) inverse problem, where each iteration is as efficient as that of gradient-based optimization methods.

preprint, 2020, arXiv

Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes

In this paper, we theoretically prove that gradient descent can find a global minimum of non-convex optimization of all layers for nonlinear deep neural networks of sizes commonly encountered in practice. Unlike previous theories, the theory developed in this paper requires only practical degrees of over-parameterization: the number of trainable parameters needs to increase only linearly as the number of training samples increases. This allows the size of the deep neural networks to be consistent with practice and to be several orders of magnitude smaller than that required by the previous theories. Moreover, we prove that the linear increase of the size of the network is the optimal rate and that it cannot be improved, except by a logarithmic factor. Furthermore, deep neural networks with the trainability guarantee are shown to generalize well to unseen test samples with a natural dataset but not a random dataset.

preprint, 2016, arXiv

Local Kesten--McKay law for random regular graphs

We study the adjacency matrices of random $d$-regular graphs with large but fixed degree $d$. In the bulk of the spectrum $[-2\sqrt{d-1}+\varepsilon, 2\sqrt{d-1}-\varepsilon]$ down to the optimal spectral scale, we prove that the Green's functions can be approximated by those of certain infinite tree-like (few cycles) graphs that depend only on the local structure of the original graphs. This result implies that the Kesten--McKay law holds for the spectral density down to the smallest scale and the complete delocalization of bulk eigenvectors. Our method is based on estimating the Green's function of the adjacency matrices and a resampling of the boundary edges of large balls in the graphs.
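For orientation, the Kesten--McKay law referred to above has the well-known density $ρ_d(x) = \frac{d\sqrt{4(d-1)-x^2}}{2π(d^2-x^2)}$ on $[-2\sqrt{d-1}, 2\sqrt{d-1}]$. The sketch below only evaluates this limiting density and checks two of its moments numerically; it does not touch the local law or the Green's function estimates of the paper. The helper names, the midpoint rule, and the choice $d=3$ (small for illustration, whereas the abstract concerns large fixed $d$) are assumptions.

```python
import math

def kesten_mckay_density(x, d):
    """Density of the Kesten--McKay law for degree d, zero outside the bulk."""
    edge = 2.0 * math.sqrt(d - 1)
    if abs(x) >= edge:
        return 0.0
    return d * math.sqrt(4.0 * (d - 1) - x * x) / (2.0 * math.pi * (d * d - x * x))

def moment(k, d, n=200_000):
    """k-th moment of the density, approximated with the midpoint rule."""
    edge = 2.0 * math.sqrt(d - 1)
    h = 2.0 * edge / n
    total = 0.0
    for i in range(n):
        x = -edge + (i + 0.5) * h
        total += (x ** k) * kesten_mckay_density(x, d)
    return total * h

d = 3  # illustrative degree only
mass = moment(0, d)    # should be close to 1 (probability density)
second = moment(2, d)  # should be close to d (closed walks of length 2 on the d-regular tree)
```

The second moment equals $d$ because it counts closed walks of length two from the root of the infinite $d$-regular tree, which is the tree-like local structure the abstract alludes to.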

preprint, 2015, arXiv

Bulk universality of sparse random matrices

We consider the adjacency matrix of the ensemble of Erdős-Rényi random graphs which consists of graphs on $N$ vertices in which each edge occurs independently with probability $p$. We prove that in the regime $pN \gg 1$ these matrices exhibit bulk universality in the sense that both the averaged $n$-point correlation functions and distribution of a single eigenvalue gap coincide with those of the GOE. Our methods extend to a class of random matrices which includes sparse ensembles whose entries have different variances.

preprint, 2015, arXiv

Mesoscopic Perturbations of Large Random Matrices

We consider the eigenvalues and eigenvectors of small rank perturbations of random $N\times N$ matrices. We allow the rank $M$ of the perturbation to increase with $N$, and the only assumption is $M=o(N)$. In both additive and multiplicative perturbation models, we prove rigidity results for the outliers of the perturbed random matrices. Based on the rigidity results, we derive the empirical distribution of outliers of the perturbed random matrices. We also compute the appropriate projection of eigenvectors corresponding to the outliers of the perturbed random matrices, which are approximate eigenvectors of the perturbing matrix. Our results can be regarded as the extension of the finite rank perturbation case to the full generality up to $M=o(N)$.

preprint, 2015, arXiv

Spectral statistics of sparse Erdős-Rényi graph Laplacians

We consider the bulk eigenvalue statistics of Laplacian matrices of large Erdős-Rényi random graphs in the regime $p \geq N^δ/N$ for any fixed $δ>0$. We prove a local law down to the optimal scale $η\gtrsim N^{-1}$ which implies that the eigenvectors are delocalized. We consider the local eigenvalue statistics and prove that both the gap statistics and averaged correlation functions coincide with the GOE in the bulk.

preprint, 2014, arXiv

Asymptotic Expansion of Spherical Integral

We consider the spherical integral of real symmetric or Hermitian matrices when the rank of one matrix is one. We prove the existence of the full asymptotic expansions of these spherical integrals and derive the first and second terms in the asymptotic expansion. Using the asymptotic expression for the spherical integral, we derive the asymptotic freeness of Wigner matrices with (deterministic) Hermitian matrices.

preprint, 2014, arXiv

Partition Statistics Equidistributed with the Number of Hook Difference One Cells

Let $λ$ be a partition, viewed as a Young diagram. We define the hook difference of a cell of $λ$ to be the difference of its leg and arm lengths. Define $h_{1,1}(λ)$ to be the number of cells of $λ$ with hook difference one. In the paper of Buryak and Feigin (arXiv:1206.5640), algebraic geometry is used to prove a generating function identity which implies that $h_{1,1}$ is equidistributed with $a_2$, the largest part of a partition that appears at least twice, over the partitions of a given size. In this paper, we propose a refinement of the theorem of Buryak and Feigin and prove some partial results using combinatorial methods. We also obtain a new formula for the $q$-Catalan numbers which naturally leads us to define a new $q,t$-Catalan number with a simple combinatorial interpretation.
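The equidistribution statement above can be checked by brute force for small sizes. The sketch below is only a numerical sanity check, not the combinatorial argument of the paper. It assumes the hook difference is taken as leg minus arm (by conjugation, the opposite sign convention gives the same distribution over all partitions of a given size) and that $a_2$ is $0$ when no part repeats; all function names are hypothetical.

```python
from collections import Counter

def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def hook_difference_one_cells(lam):
    """Count cells of the Young diagram with leg - arm == 1 (assumed convention)."""
    count = 0
    for i, row_len in enumerate(lam):
        for j in range(row_len):
            arm = row_len - j - 1                        # cells to the right
            leg = sum(1 for r in lam[i + 1:] if r > j)   # cells below
            if leg - arm == 1:
                count += 1
    return count

def a2(lam):
    """Largest part appearing at least twice, taken as 0 if there is none."""
    repeated = [part for part, mult in Counter(lam).items() if mult >= 2]
    return max(repeated, default=0)

def distributions(n):
    """Histograms of both statistics over all partitions of n."""
    h = Counter(hook_difference_one_cells(lam) for lam in partitions(n))
    a = Counter(a2(lam) for lam in partitions(n))
    return h, a

h4, a4 = distributions(4)
```

For size 4, both histograms assign two partitions to the value 0, two to the value 1, and one to the value 2.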

preprint, 2013, arXiv

Laurent Phenomenon Sequences

In this paper, we undertake a systematic study of recurrences $x_{m+n}x_{m} = P(x_{m+1}, \ldots, x_{m+n-1})$ which exhibit the Laurent phenomenon. Some of the most famous among these sequences come from the Somos and the Gale-Robinson recurrences. Our approach is based on finding period 1 seeds of Laurent phenomenon algebras of Lam-Pylyavskyy. We completely classify polynomials $P$ that generate period 1 seeds in the cases of $n=2,3$ and of mutual binomial seeds. We also find several other interesting families of polynomials $P$ whose generated sequences exhibit the Laurent phenomenon. Our classification for binomial seeds is a direct generalization of a result by Fordy and Marsh, that employs a new combinatorial gadget we call a double quiver.
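A concrete instance of such a recurrence is Somos-4, where $n=4$ and $P(a,b,c) = ac + b^2$. The sketch below iterates the recurrence in exact rational arithmetic and checks that, starting from all ones, every term is an integer, which is what the Laurent phenomenon predicts for this seed. It is a small illustration, not the classification method of the paper; the helper names are hypothetical.

```python
from fractions import Fraction

def laurent_sequence(P, seed, length):
    """Iterate x_{m+n} * x_m = P(x_{m+1}, ..., x_{m+n-1}) exactly, with n = len(seed)."""
    xs = [Fraction(v) for v in seed]
    n = len(seed)
    while len(xs) < length:
        m = len(xs) - n
        # Divide by x_m; Fraction keeps the result exact, so any
        # non-integer term would show up as a denominator > 1.
        xs.append(P(*xs[m + 1:m + n]) / xs[m])
    return xs

def somos4(a, b, c):
    """Somos-4 polynomial: P(x_{m+1}, x_{m+2}, x_{m+3}) = x_{m+1} x_{m+3} + x_{m+2}^2."""
    return a * c + b * b

terms = laurent_sequence(somos4, [1, 1, 1, 1], 12)
all_integers = all(t.denominator == 1 for t in terms)
```

Here the first twelve terms come out as 1, 1, 1, 1, 2, 3, 7, 23, 59, 314, 1529, 8209, even though each step divides by an earlier term.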
