Source author record

Bethany Lusch

Bethany Lusch appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning physics.comp-ph physics.flu-dyn Artificial Intelligence Data Structures and Algorithms Discrete Mathematics Methodology Neurons and Cognition Quantitative Methods stat.OT

Catalog footprint

What is connected

8works

10topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification

Deep neural networks are powerful predictors for a variety of tasks. However, they do not capture uncertainty directly. Using neural network ensembles to quantify uncertainty is competitive with approaches based on Bayesian neural networks while benefiting from better computational scalability. However, building ensembles of neural networks is a challenging task because, in addition to choosing the right neural architecture or hyperparameters for each member of the ensemble, there is an added cost of training each model. We propose AutoDEUQ, an automated approach for generating an ensemble of deep neural networks. Our approach leverages joint neural architecture and hyperparameter search to generate ensembles. We use the law of total variance to decompose the predictive variance of deep ensembles into aleatoric (data) and epistemic (model) uncertainties. We show that AutoDEUQ outperforms probabilistic backpropagation, Monte Carlo dropout, deep ensemble, distribution-free ensembles, and hyper ensemble methods on a number of regression benchmarks.

preprint2020arXiv

Non-autoregressive time-series methods for stable parametric reduced-order models

Advection-dominated dynamical systems, characterized by partial differential equations, are found in applications ranging from weather forecasting to engineering design where accuracy and robustness are crucial. There has been significant interest in the use of techniques borrowed from machine learning to reduce the computational expense and/or improve the accuracy of predictions for these systems. These rely on the identification of a basis that reduces the dimensionality of the problem and the subsequent use of time series and sequential learning methods to forecast the evolution of the reduced state. Often, however, machine-learned predictions after reduced-basis projection are plagued by issues of stability stemming from incomplete capture of multiscale processes as well as due to error growth for long forecast durations. To address these issues, we have developed a \emph{non-autoregressive} time series approach for predicting linear reduced-basis time histories of forward models. In particular, we demonstrate that non-autoregressive counterparts of sequential learning methods such as long short-term memory (LSTM) considerably improve the stability of machine-learned reduced-order models. We evaluate our approach on the inviscid shallow water equations and show that a non-autoregressive variant of the standard LSTM approach that is bidirectional in the PCA components obtains the best accuracy for recreating the nonlinear dynamics of partial observations. Moreover---and critical for many applications of these surrogates---inference times are reduced by three orders of magnitude using our approach, compared with both the equation-based Galerkin projection method and the standard LSTM approach.

preprint2020arXiv

Recurrent Neural Network Architecture Search for Geophysical Emulation

Developing surrogate geophysical models from data is a key research topic in atmospheric and oceanic modeling because of the large computational costs associated with numerical simulation methods. Researchers have started applying a wide range of machine learning models, in particular neural networks, to geophysical data for forecasting without these constraints. Constructing neural networks for forecasting such data is nontrivial, however, and often requires trial and error. To address these limitations, we focus on developing proper-orthogonal-decomposition-based long short-term memory networks (POD-LSTMs). We develop a scalable neural architecture search for generating stacked LSTMs to forecast temperature in the NOAA Optimum Interpolation Sea-Surface Temperature data set. Our approach identifies POD-LSTMs that are superior to manually designed variants and baseline time-series prediction methods. We also assess the scalability of different architecture search strategies on up to 512 Intel Knights Landing nodes of the Theta supercomputer at the Argonne Leadership Computing Facility.

preprint2019arXiv

Data-driven discovery of coordinates and governing equations

The discovery of governing equations from scientific data has the potential to transform data-rich fields that lack well-characterized quantitative descriptions. Advances in sparse regression are currently enabling the tractable identification of both the structure and parameters of a nonlinear dynamical system from data. The resulting models have the fewest terms necessary to describe the dynamics, balancing model complexity with descriptive ability, and thus promoting interpretability and generalizability. This provides an algorithmic approach to Occam's razor for model discovery. However, this approach fundamentally relies on an effective coordinate system in which the dynamics have a simple representation. In this work, we design a custom autoencoder to discover a coordinate transformation into a reduced space where the dynamics may be sparsely represented. Thus, we simultaneously learn the governing equations and the associated coordinate system. We demonstrate this approach on several example high-dimensional dynamical systems with low-dimensional behavior. The resulting modeling framework combines the strengths of deep neural networks for flexible representation and sparse identification of nonlinear dynamics (SINDy) for parsimonious models. It is the first method of its kind to place the discovery of coordinates and models on an equal footing.

preprint2019arXiv

Time-series learning of latent-space dynamics for reduced-order model closure

We study the performance of long short-term memory networks (LSTMs) and neural ordinary differential equations (NODEs) in learning latent-space representations of dynamical equations for an advection-dominated problem given by the viscous Burgers equation. Our formulation is devised in a non-intrusive manner with an equation-free evolution of dynamics in a reduced space with the latter being obtained through a proper orthogonal decomposition. In addition, we leverage the sequential nature of learning for both LSTMs and NODEs to demonstrate their capability for closure in systems which are not completely resolved in the reduced space. We assess our hypothesis for two advection-dominated problems given by the viscous Burgers equation. It is observed that both LSTMs and NODEs are able to reproduce the effects of the absent scales for our test cases more effectively than intrusive dynamics evolution through a Galerkin projection. This result empirically suggests that time-series learning techniques implicitly leverage a memory kernel for coarse-grained system closure as is suggested through the Mori-Zwanzig formalism.

preprint2016arXiv

Modeling cognitive deficits following neurodegenerative diseases and traumatic brain injuries with deep convolutional neural networks

The accurate diagnosis and assessment of neurodegenerative disease and traumatic brain injuries (TBI) remain open challenges. Both cause cognitive and functional deficits due to focal axonal swellings (FAS), but it is difficult to deliver a prognosis due to our limited ability to assess damaged neurons at a cellular level in vivo. We simulate the effects of neurodegenerative disease and TBI using convolutional neural networks (CNNs) as our model of cognition. We utilize biophysically relevant statistical data on FAS to damage the connections in CNNs in a functionally relevant way. We incorporate energy constraints on the brain by pruning the CNNs to be less over-engineered. Qualitatively, we demonstrate that damage leads to human-like mistakes. Our experiments also provide quantitative assessments of how accuracy is affected by various types and levels of damage. The deficit resulting from a fixed amount of damage greatly depends on which connections are randomly injured, providing intuition for why it is difficult to predict impairments. There is a large degree of subjectivity when it comes to interpreting cognitive deficits from complex systems such as the human brain. However, we provide important insight and a quantitative framework for disorders in which FAS are implicated.

preprint2016arXiv

Shape Constrained Tensor Decompositions using Sparse Representations in Over-Complete Libraries

We consider $N$-way data arrays and low-rank tensor factorizations where the time mode is coded as a sparse linear combination of temporal elements from an over-complete library. Our method, Shape Constrained Tensor Decomposition (SCTD) is based upon the CANDECOMP/PARAFAC (CP) decomposition which produces $r$-rank approximations of data tensors via outer products of vectors in each dimension of the data. By constraining the vector in the temporal dimension to known analytic forms which are selected from a large set of candidate functions, more readily interpretable decompositions are achieved and analytic time dependencies discovered. The SCTD method circumvents traditional {\em flattening} techniques where an $N$-way array is reshaped into a matrix in order to perform a singular value decomposition. A clear advantage of the SCTD algorithm is its ability to extract transient and intermittent phenomena which is often difficult for SVD-based methods. We motivate the SCTD method using several intuitively appealing results before applying it on a number of high-dimensional, real-world data sets in order to illustrate the efficiency of the algorithm in extracting interpretable spatio-temporal modes. With the rise of data-driven discovery methods, the decomposition proposed provides a viable technique for analyzing multitudes of data in a more comprehensible fashion.

preprint2015arXiv

Submodular Hamming Metrics

We show that there is a largely unexplored class of functions (positive polymatroids) that can define proper discrete metrics over pairs of binary vectors and that are fairly tractable to optimize over. By exploiting submodularity, we are able to give hardness results and approximation algorithms for optimizing over such metrics. Additionally, we demonstrate empirically the effectiveness of these metrics and associated algorithms on both a metric minimization task (a form of clustering) and also a metric maximization task (generating diverse k-best lists).

Bethany Lusch

What is connected

Connect this record

See the researcher in context

Building this map preview

8 published item(s)

AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification

Non-autoregressive time-series methods for stable parametric reduced-order models

Recurrent Neural Network Architecture Search for Geophysical Emulation

Data-driven discovery of coordinates and governing equations

Time-series learning of latent-space dynamics for reduced-order model closure

Modeling cognitive deficits following neurodegenerative diseases and traumatic brain injuries with deep convolutional neural networks

Shape Constrained Tensor Decompositions using Sparse Representations in Over-Complete Libraries

Submodular Hamming Metrics