Researcher profile

Jiaoyang Huang

Jiaoyang Huang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
12 topics
4 close collaborators

Published work

7 published items

Preprint, 2022, arXiv

Efficient Derivative-free Bayesian Inference for Large-Scale Inverse Problems

We consider Bayesian inference for large-scale inverse problems, where computational challenges arise from the need for repeated evaluations of an expensive forward model. This renders most Markov chain Monte Carlo approaches infeasible, since they typically require $O(10^4)$ model runs or more. Moreover, the forward model is often given as a black box or is impractical to differentiate. Therefore, derivative-free algorithms are highly desirable. We propose a framework, built on Kalman methodology, to efficiently perform Bayesian inference in such inverse problems. The basic method is based on an approximation of the filtering distribution of a novel mean-field dynamical system into which the inverse problem is embedded as an observation operator. Theoretical properties of the mean-field model are established for linear inverse problems, demonstrating that the desired Bayesian posterior is given by the steady state of the law of the filtering distribution of the mean-field dynamical system, and proving exponential convergence to it. This suggests that, for nonlinear problems which are close to Gaussian, sequentially computing this law provides the basis for efficient iterative methods to approximate the Bayesian posterior. Ensemble methods are applied to obtain interacting particle system approximations of the filtering distribution of the mean-field model, and practical strategies to further reduce the computational and memory cost of the methodology are presented, including low-rank approximation and a bi-fidelity approach. The effectiveness of the framework is demonstrated in several numerical experiments, including proof-of-concept linear/nonlinear examples and two large-scale applications: learning of permeability parameters in subsurface flow, and learning subgrid-scale parameters in a global climate model from time-averaged statistics.

Preprint, 2022, arXiv

Robustness Implies Generalization via Data-Dependent Generalization Bounds

This paper proves that robustness implies generalization via data-dependent generalization bounds. As a result, robustness and generalization are shown to be connected closely in a data-dependent manner. Our bounds improve previous bounds in two directions, to solve an open problem that has seen little development since 2010. The first is to reduce the dependence on the covering number. The second is to remove the dependence on the hypothesis space. We present several examples, including ones for lasso and deep learning, in which our bounds are provably preferable. The experiments on real-world data and theoretical models demonstrate near-exponential improvements in various situations. To achieve these improvements, we do not require additional assumptions on the unknown distribution; instead, we only incorporate an observable and computable property of the training samples. A key technical innovation is an improved concentration bound for multinomial random variables that is of independent interest beyond robustness and generalization.

Preprint, 2021, arXiv

$β$-Nonintersecting Poisson Random Walks: Law of Large Numbers and Central Limit Theorems

We study the $β$ analogue of the nonintersecting Poisson random walks. We derive a stochastic differential equation for the Stieltjes transform of the empirical measure process, which can be viewed as a dynamical version of Nekrasov's equation in [7, Section 4]. We find that the empirical measure process converges weakly in the space of càdlàg measure-valued processes to a deterministic process, characterized by the quantized free convolution, as introduced in [11]. For suitable initial data, we prove that the rescaled empirical measure process converges weakly in the space of distributions acting on analytic test functions to a Gaussian process. The means and the covariances are universal, and coincide with those of $β$-Dyson Brownian motions with the initial data constructed by the Markov-Krein correspondence. In particular, the covariance structure can be described in terms of the Gaussian Free Field. Our proof relies on integrable features of the generators of the $β$-nonintersecting Poisson random walks, the method of characteristics, and a coupling technique for Poisson random walks.

Preprint, 2021, arXiv

Improve Unscented Kalman Inversion With Low-Rank Approximation and Reduced-Order Model

The unscented Kalman inversion (UKI) presented in [1] is a general derivative-free approach to solving inverse problems. UKI is particularly suitable for inverse problems where the forward model is given as a black box and may not be differentiable. The regularization strategy and convergence properties of the UKI have been thoroughly studied, and the method has been shown to handle noisy observation data and solve chaotic inverse problems effectively. In this paper, we aim to make the UKI more efficient in terms of computational and memory costs for large-scale inverse problems. We take advantage of the low-rank covariance structure to reduce the number of forward problem evaluations and the memory cost associated with propagating large covariance matrices. We also leverage reduced-order model techniques to further speed up these forward evaluations. The effectiveness of the enhanced UKI is demonstrated on a barotropic model inverse problem with $O(10^5)$ unknown parameters and a 3D general circulation model (GCM) inverse problem, where each iteration is as efficient as that of gradient-based optimization methods.

Preprint, 2020, arXiv

Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes

In this paper, we theoretically prove that gradient descent can find a global minimum of non-convex optimization of all layers for nonlinear deep neural networks of sizes commonly encountered in practice. Unlike previous theories, the theory developed in this paper requires only practical degrees of over-parameterization: the number of trainable parameters need only increase linearly as the number of training samples increases. This allows the size of the deep neural networks to be consistent with practice and to be several orders of magnitude smaller than that required by the previous theories. Moreover, we prove that the linear increase of the size of the network is the optimal rate and that it cannot be improved, except by a logarithmic factor. Furthermore, deep neural networks with the trainability guarantee are shown to generalize well to unseen test samples with a natural dataset but not a random dataset.

Preprint, 2016, arXiv

Local Kesten--McKay law for random regular graphs

We study the adjacency matrices of random $d$-regular graphs with large but fixed degree $d$. In the bulk of the spectrum $[-2\sqrt{d-1}+\varepsilon, 2\sqrt{d-1}-\varepsilon]$ down to the optimal spectral scale, we prove that the Green's functions can be approximated by those of certain infinite tree-like (few cycles) graphs that depend only on the local structure of the original graphs. This result implies that the Kesten--McKay law holds for the spectral density down to the smallest scale and the complete delocalization of bulk eigenvectors. Our method is based on estimating the Green's function of the adjacency matrices and a resampling of the boundary edges of large balls in the graphs.
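For reference, the abstract names the Kesten--McKay law without stating it. The standard form of its density for degree $d$ is sketched below; the symbol $\rho_d$ is notation chosen here and does not come from the abstract.

```latex
\rho_d(x) = \frac{d\,\sqrt{4(d-1) - x^2}}{2\pi\,(d^2 - x^2)},
\qquad |x| \le 2\sqrt{d-1},
```

with $\rho_d(x) = 0$ outside this interval, which matches the bulk interval $[-2\sqrt{d-1}, 2\sqrt{d-1}]$ quoted in the abstract.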

Preprint, 2013, arXiv

Laurent Phenomenon Sequences

In this paper, we undertake a systematic study of recurrences $x_{m+n}x_{m} = P(x_{m+1}, \ldots, x_{m+n-1})$ which exhibit the Laurent phenomenon. Some of the most famous among these sequences come from the Somos and the Gale-Robinson recurrences. Our approach is based on finding period 1 seeds of Laurent phenomenon algebras of Lam-Pylyavskyy. We completely classify polynomials $P$ that generate period 1 seeds in the cases of $n=2,3$ and of mutual binomial seeds. We also find several other interesting families of polynomials $P$ whose generated sequences exhibit the Laurent phenomenon. Our classification for binomial seeds is a direct generalization of a result by Fordy and Marsh, and employs a new combinatorial gadget we call a double quiver.
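As a small illustration of the kind of recurrence the abstract describes, the sketch below iterates the well-known Somos-4 recurrence, which has the stated form with $n = 4$ and $P = x_{m+1}x_{m+3} + x_{m+2}^2$. It is not taken from the paper; the function name `somos4` and the choice of all-ones initial values are assumptions made for this example. Because each term is a Laurent polynomial in the initial values, starting from ones yields integers even though every step divides.

```python
from fractions import Fraction


def somos4(num_terms):
    """Return the first num_terms of the Somos-4 sequence as exact Fractions.

    Recurrence: x[m+4] * x[m] = x[m+1] * x[m+3] + x[m+2] ** 2,
    with the (assumed) initial values x[0] = x[1] = x[2] = x[3] = 1.
    """
    x = [Fraction(1)] * 4
    while len(x) < num_terms:
        m = len(x) - 4
        # Exact rational division, so any non-integer term would be visible.
        x.append((x[m + 1] * x[m + 3] + x[m + 2] ** 2) / x[m])
    return x[:num_terms]


terms = somos4(12)
# Every denominator is 1, so all computed terms are integers.
all_integers = all(t.denominator == 1 for t in terms)
print([int(t) for t in terms], all_integers)
```

Using `Fraction` rather than integer division is deliberate: it checks integrality instead of silently forcing it.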