Source author record

Douglas W. Nychka

Douglas W. Nychka appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Applications Computation Machine Learning physics.ao-ph

Catalog footprint

What is connected

3works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

Fast covariance parameter estimation of spatial Gaussian process models using neural networks

Gaussian processes (GPs) are a popular model for spatially referenced data and allow descriptive statements, predictions at new locations, and simulation of new fields. Often a few parameters are sufficient to parameterize the covariance function, and maximum likelihood (ML) methods can be used to estimate these parameters from data. ML methods, however, are computationally demanding. For example, in the case of local likelihood estimation, even fitting covariance models on modest size windows can overwhelm typical computational resources for data analysis. This limitation motivates the idea of using neural network (NN) methods to approximate ML estimates. We train NNs to take moderate size spatial fields or variograms as input and return the range and noise-to-signal covariance parameters. Once trained, the NNs provide estimates with a similar accuracy compared to ML estimation and at a speedup by a factor of 100 or more. Although we focus on a specific covariance estimation problem motivated by a climate science application, this work can be easily extended to other, more complex, spatial problems and provides a proof-of-concept for this use of machine learning in computational statistics.

preprint2019arXiv

Parallel cross-validation: a scalable fitting method for Gaussian process models

Gaussian process (GP) models are widely used to analyze spatially referenced data and to predict values at locations without observations. In contrast to many algorithmic procedures, GP models are based on a statistical framework, which enables uncertainty quantification of the model structure and predictions. Both the evaluation of the likelihood and the prediction involve solving linear systems. Hence, the computational costs are large and limit the amount of data that can be handled. While there are many approximation strategies that lower the computational cost of GP models, they often provide only sub-optimal support for the parallel computing capabilities of current (high-performance) computing environments. We aim at bridging this gap with a parameter estimation and prediction method that is designed to be parallelizable. More precisely, we divide the spatial domain into overlapping subsets and use cross-validation (CV) to estimate the covariance parameters in parallel. We present simulation studies, which assess the accuracy of the parameter estimates and predictions. Moreover, we show that our implementation has good weak and strong parallel scaling properties. For illustration, we fit an exponential covariance model to a scientifically relevant canopy height dataset with 5 million observations. Using 512 processor cores in parallel brings the evaluation time of one covariance parameter configuration to less than 1.5 minutes. The parallel CV method can be easily extended to include approximate likelihood methods, multivariate and spatio-temporal data, as well as non-stationary covariance models.

preprint2007arXiv

A Multiresolution Census Algorithm for Calculating Vortex Statistics in Turbulent Flows

The fundamental equations that model turbulent flow do not provide much insight into the size and shape of observed turbulent structures. We investigate the efficient and accurate representation of structures in two-dimensional turbulence by applying statistical models directly to the simulated vorticity field. Rather than extract the coherent portion of the image from the background variation, as in the classical signal-plus-noise model, we present a model for individual vortices using the non-decimated discrete wavelet transform. A template image, supplied by the user, provides the features to be extracted from the vorticity field. By transforming the vortex template into the wavelet domain, specific characteristics present in the template, such as size and symmetry, are broken down into components associated with spatial frequencies. Multivariate multiple linear regression is used to fit the vortex template to the vorticity field in the wavelet domain. Since all levels of the template decomposition may be used to model each level in the field decomposition, the resulting model need not be identical to the template. Application to a vortex census algorithm that records quantities of interest (such as size, peak amplitude, circulation, etc.) as the vorticity field evolves is given. The multiresolution census algorithm extracts coherent structures of all shapes and sizes in simulated vorticity fields and is able to reproduce known physical scaling laws when processing a set of voriticity fields that evolve over time.