Source author record

Thomas O'Leary-Roseberry

Thomas O'Leary-Roseberry appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

math.OC, Machine Learning, math.NA, Neural and Evolutionary Computing, Numerical Analysis

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

Preprint, 2022, arXiv

Large-scale Bayesian optimal experimental design with derivative-informed projected neural network

We address the solution of large-scale Bayesian optimal experimental design (OED) problems governed by partial differential equations (PDEs) with infinite-dimensional parameter fields. The OED problem seeks to find sensor locations that maximize the expected information gain (EIG) in the solution of the underlying Bayesian inverse problem. Computation of the EIG is usually prohibitive for PDE-based OED problems. To make the evaluation of the EIG tractable, we approximate the (PDE-based) parameter-to-observable map with a derivative-informed projected neural network (DIPNet) surrogate, which exploits the geometry, smoothness, and intrinsic low-dimensionality of the map using a small and dimension-independent number of PDE solves. The surrogate is then deployed within a greedy algorithm-based solution of the OED problem such that no further PDE solves are required. We analyze the EIG approximation error in terms of the generalization error of the DIPNet and show they are of the same order. Finally, the efficiency and accuracy of the method are demonstrated via numerical experiments on OED problems governed by inverse scattering and inverse reactive transport with up to 16,641 uncertain parameters and 100 experimental design variables, where we observe up to three orders of magnitude speedup relative to a reference double loop Monte Carlo method.
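The greedy selection step described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's method: it assumes a linear Gaussian model (prior covariance equal to the identity, independent noise of variance `noise_var`), for which the EIG has a closed form and no surrogate or Monte Carlo estimate is needed. The function name `greedy_sensor_selection` and its arguments are hypothetical.

```python
import math

def greedy_sensor_selection(rows, noise_var, budget):
    """Greedily pick `budget` sensors for a toy linear Gaussian model.

    Assumed toy model (not the PDE setting of the paper):
    y_i = rows[i] . m + noise, prior m ~ N(0, I), noise ~ N(0, noise_var).
    Here the EIG of a design is 0.5 * logdet(I + A A^T / noise_var), and
    adding sensor a to a design with posterior covariance C raises it by
    0.5 * log(1 + a^T C a / noise_var).
    """
    if budget > len(rows):
        raise ValueError("budget exceeds the number of candidate sensors")
    n = len(rows[0])
    # The posterior covariance starts at the prior covariance (identity).
    cov = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    chosen, total_eig = [], 0.0
    for _ in range(budget):
        best = None
        for idx, a in enumerate(rows):
            if idx in chosen:
                continue
            # ca = C a, reused for both the gain and the covariance update.
            ca = [sum(cov[i][j] * a[j] for j in range(n)) for i in range(n)]
            quad = sum(a[i] * ca[i] for i in range(n))
            gain = 0.5 * math.log1p(quad / noise_var)
            if best is None or gain > best[0]:
                best = (gain, idx, ca, quad)
        gain, idx, ca, quad = best
        # Sherman-Morrison update of C after observing the chosen sensor.
        denom = noise_var + quad
        cov = [[cov[i][j] - ca[i] * ca[j] / denom for j in range(n)]
               for i in range(n)]
        chosen.append(idx)
        total_eig += gain
    return chosen, total_eig
```

In a nonlinear, PDE-governed setting such as the one in the abstract, the closed-form gain would presumably be replaced by an EIG estimate evaluated through the surrogate; only the outer greedy loop carries over from this sketch.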

Preprint, 2020, arXiv

Ill-Posedness and Optimization Geometry for Nonlinear Neural Network Training

In this work we analyze the role nonlinear activation functions play at stationary points of dense neural network training problems. We consider a generic least squares loss function training formulation. We show that the nonlinear activation functions used in the network construction play a critical role in classifying stationary points of the loss landscape. We show that for shallow dense networks, the nonlinear activation function determines the Hessian nullspace in the vicinity of global minima (if they exist), and therefore determines the ill-posedness of the training problem. Furthermore, for shallow nonlinear networks we show that the zeros of the activation function and its derivatives can lead to spurious local minima, and discuss conditions for strict saddle points. We extend these results to deep dense neural networks, showing that the last activation function plays an important role in classifying stationary points, due to how it shows up in the gradient from the chain rule.
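As a small worked illustration of how the activation function enters the stationarity conditions, consider a shallow network without biases. The notation below ($W_0$, $W_1$, $\sigma$, $r_i$) is assumed for this sketch and is not taken from the paper.

```latex
% Assumed setup: shallow dense network, least squares loss, data (x_i, y_i).
f(x) = W_1\,\sigma(W_0 x), \qquad
L(W_0, W_1) = \frac{1}{2}\sum_{i=1}^{N} \lVert f(x_i) - y_i \rVert^2, \qquad
r_i = f(x_i) - y_i .

% Gradients by the chain rule (\odot is the elementwise product):
\nabla_{W_1} L = \sum_{i=1}^{N} r_i\,\sigma(W_0 x_i)^{\top}, \qquad
\nabla_{W_0} L = \sum_{i=1}^{N}
  \Bigl[\bigl(W_1^{\top} r_i\bigr) \odot \sigma'(W_0 x_i)\Bigr]\, x_i^{\top} .
```

Both gradients vanish whenever $\sigma(W_0 x_i) = 0$ and $\sigma'(W_0 x_i) = 0$ for every $i$, even if the residuals $r_i$ are nonzero. With ReLU this happens when all pre-activations are negative, which gives one concrete way that zeros of the activation and its derivative can produce stationary points that are not global minima.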

Preprint, 2020, arXiv

Projected Stein Variational Newton: A Fast and Scalable Bayesian Inference Method in High Dimensions

We propose a fast and scalable variational method for Bayesian inference in high-dimensional parameter space, which we call the projected Stein variational Newton (pSVN) method. We exploit the intrinsic low-dimensional geometric structure of the posterior distribution in the high-dimensional parameter space via its Hessian (of the log posterior) operator and perform a parallel update of the parameter samples projected into a low-dimensional subspace by an SVN method. The subspace is adaptively constructed using the eigenvectors of the averaged Hessian at the current samples. We demonstrate fast convergence of the proposed method and its scalability with respect to the number of parameters, samples, and processor cores.
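The subspace construction mentioned in the abstract can be written down schematically. The formulas below are a simplified sketch under stated assumptions: a standard (unweighted) eigenproblem with orthonormal eigenvectors, and symbols $m_n$, $\bar{H}$, $\psi_i$, $r$ introduced here for illustration. The sign convention for the Hessian and any prior weighting are assumptions, and the paper's actual formulation may differ.

```latex
% Averaged Hessian over the N current samples m_1, ..., m_N:
\bar{H} = \frac{1}{N}\sum_{n=1}^{N} H(m_n), \qquad
\bar{H}\,\psi_i = \lambda_i\,\psi_i, \qquad
|\lambda_1| \ge |\lambda_2| \ge \dots

% Projection of a sample onto the r dominant eigenvectors,
% taken about the sample mean \bar{m}:
m^{r} = \bar{m} + \sum_{i=1}^{r} \psi_i\,\psi_i^{\top}\,(m - \bar{m}), \qquad
\bar{m} = \frac{1}{N}\sum_{n=1}^{N} m_n .
```

Under these assumptions, the sample update would act only on the $r$ coefficients $\psi_i^{\top}(m - \bar{m})$, which is what makes the cost independent of the full parameter dimension when $r$ is small.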
