Researcher profile

Jonathan Rougier

Jonathan Rougier contributes to research discovery and scholarly infrastructure.

Researcher; Affiliation not imported; Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified; Verification L1; Unclaimed author
4 works
0 followers
5 topics
4 close collaborators

Published work

4 published item(s)

Preprint, 2020, arXiv

Multi-Scale Process Modelling and Distributed Computation for Spatial Data

Recent years have seen a huge development in spatial modelling and prediction methodology, driven by the increased availability of remote-sensing data and the reduced cost of distributed-processing technology. It is well known that modelling and prediction using infinite-dimensional process models is not possible with large data sets, and that both approximate models and, often, approximate-inference methods, are needed. The problem of fitting simple global spatial models to large data sets has been solved through the likes of multi-resolution approximations and nearest-neighbour techniques. Here we tackle the next challenge, that of fitting complex, nonstationary, multi-scale models to large data sets. We propose doing this through the use of superpositions of spatial processes with increasing spatial scale and increasing degrees of nonstationarity. Computation is facilitated through the use of Gaussian Markov random fields and parallel Markov chain Monte Carlo based on graph colouring. The resulting model allows for both distributed computing and distributed data. Importantly, it provides opportunities for genuine model and data scaleability and yet is still able to borrow strength across large spatial scales. We illustrate a two-scale version on a data set of sea-surface temperature containing on the order of one million observations, and compare our approach to state-of-the-art spatial modelling and prediction methods.
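The abstract mentions parallel Markov chain Monte Carlo based on graph colouring for Gaussian Markov random fields. As a rough illustration only, and not the authors' implementation, the sketch below runs a "chromatic" Gibbs sampler on a small zero-mean GMRF defined on a chain graph. The precision matrix, the two-colour (even/odd) colouring, and every function name here are assumptions made for this example. The idea is that nodes sharing a colour have no edges between them, so they are conditionally independent given the rest and can be updated in a single vectorised (or, in principle, parallel) step.

```python
import numpy as np

def chain_precision(n, kappa=0.5):
    """Hypothetical precision matrix: chain-graph Laplacian plus kappa * I."""
    Q = np.zeros((n, n))
    for i in range(n - 1):
        Q[i, i] += 1.0
        Q[i + 1, i + 1] += 1.0
        Q[i, i + 1] -= 1.0
        Q[i + 1, i] -= 1.0
    return Q + kappa * np.eye(n)

def is_proper_colouring(Q, colours):
    """True if no two neighbouring nodes (nonzero Q[i, j]) share a colour."""
    n = Q.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            if Q[i, j] != 0.0 and colours[i] == colours[j]:
                return False
    return True

def chromatic_gibbs(Q, colours, n_sweeps, rng):
    """Gibbs sampling for x ~ N(0, Q^{-1}), updating one colour class at a time."""
    n = Q.shape[0]
    diag = np.diag(Q)
    x = np.zeros(n)
    samples = np.empty((n_sweeps, n))
    for s in range(n_sweeps):
        for c in np.unique(colours):
            idx = np.where(colours == c)[0]
            # Sum of Q[i, j] * x[j] over j != i. Same-colour off-diagonal
            # entries are zero under a proper colouring, so removing the
            # diagonal term leaves only the neighbours' contribution.
            off = Q[idx] @ x - diag[idx] * x[idx]
            cond_mean = -off / diag[idx]
            cond_sd = 1.0 / np.sqrt(diag[idx])
            # All nodes of this colour are updated together.
            x[idx] = cond_mean + cond_sd * rng.standard_normal(idx.size)
        samples[s] = x
    return samples

Q = chain_precision(6)
colours = np.arange(6) % 2          # even/odd colouring of the chain
draws = chromatic_gibbs(Q, colours, n_sweeps=200, rng=np.random.default_rng(0))
```

On a regular lattice the same pattern applies with a checkerboard colouring. The multi-scale superposition and distributed-data aspects described in the abstract are not covered by this sketch.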

preprint2020arXiv

The exact form of the 'Ockham factor' in model selection

We explore the arguments for maximizing the 'evidence' as an algorithm for model selection. We show, using a new definition of model complexity which we term 'flexibility', that maximizing the evidence should appeal to both Bayesian and Frequentist statisticians. This is due to flexibility's unique position in the exact decomposition of log-evidence into log-fit minus flexibility. In the Gaussian linear model, flexibility is asymptotically equal to the Bayesian Information Criterion (BIC) penalty, but we caution against using BIC in place of flexibility for model selection.
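The abstract does not state how flexibility is defined, so the following is only a sketch of one exact "log-fit minus penalty" decomposition, not necessarily the paper's. It assumes the penalty is the log ratio of posterior to prior density evaluated at the maximum-likelihood estimate. By Bayes' theorem, log p(y) = log p(y | θ) − log[p(θ | y) / p(θ)] holds for any θ, so the identity is exact under that assumption. The conjugate normal-mean model and all function names are hypothetical choices for illustration.

```python
import numpy as np

def normal_logpdf(x, mean, var):
    """Log density of N(mean, var) at x (elementwise)."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def log_evidence(y, m0, s0sq, sigsq):
    """Log marginal likelihood for y_i ~ N(theta, sigsq), theta ~ N(m0, s0sq).

    Marginally, y ~ N(m0 * 1, sigsq * I + s0sq * 1 1^T).
    """
    n = len(y)
    cov = sigsq * np.eye(n) + s0sq * np.ones((n, n))
    resid = y - m0
    _, logdet = np.linalg.slogdet(cov)
    quad = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

def fit_and_penalty(y, m0, s0sq, sigsq):
    """Return (log-fit at the MLE, assumed penalty term).

    The penalty here is log[posterior density / prior density] at the MLE;
    this is an assumption for illustration, not the paper's stated definition.
    """
    n = len(y)
    theta_hat = y.mean()                           # MLE of the mean
    log_fit = normal_logpdf(y, theta_hat, sigsq).sum()
    post_prec = 1.0 / s0sq + n / sigsq             # conjugate posterior precision
    post_mean = (m0 / s0sq + y.sum() / sigsq) / post_prec
    log_density_ratio_penalty = (
        normal_logpdf(theta_hat, post_mean, 1.0 / post_prec)
        - normal_logpdf(theta_hat, m0, s0sq)
    )
    return log_fit, log_density_ratio_penalty

rng = np.random.default_rng(0)
y = rng.normal(1.0, 1.0, size=50)
log_fit, penalty = fit_and_penalty(y, m0=0.0, s0sq=4.0, sigsq=1.0)
lhs = log_evidence(y, m0=0.0, s0sq=4.0, sigsq=1.0)
print(np.isclose(lhs, log_fit - penalty))
```

For comparison, the BIC penalty on the same log scale is (k/2) log n for k free parameters and n observations. The abstract notes that flexibility matches it only asymptotically in the Gaussian linear model.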

Preprint, 2014, arXiv

Computation and Visualisation for large-scale Gaussian updates

In geostatistics, and also in other applications in science and engineering, we are now performing updates on Gaussian process models with many thousands or even millions of components. These large-scale inferences involve computational challenges, because the updating equations cannot be solved as written, owing to the size and cost of the matrix operations. They also involve representational challenges, to account for judgements of heterogeneity concerning the underlying fields, and diverse sources of observations. Diagnostics are particularly valuable in this situation. We present a diagnostic and visualisation tool for large-scale Gaussian updates, the 'medal plot'. This shows the updated uncertainty for each observation, and also summarises the sharing of information across observations, as a proxy for the sharing of information across the state vector. It allows us to 'sanity-check' the code implementing the update, but it can also reveal unexpected features in our modelling. We discuss computational issues for large-scale updates, and we illustrate with an application to assess mass trends in the Antarctic Ice Sheet.
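For orientation, the standard Gaussian update "as written" can be sketched as follows. It assumes a linear observation model y = Hx + e with prior x ~ N(m, V) and noise e ~ N(0, R). The symbols and the function name are choices made for this example rather than notation taken from the paper. The dense solve used here is exactly the kind of operation that becomes infeasible at the scale the abstract describes, so this shows only the textbook equations, not the authors' computational approach or the medal plot.

```python
import numpy as np

def gaussian_update(m, V, H, R, y):
    """Textbook Gaussian conditioning for x ~ N(m, V), y = H x + e, e ~ N(0, R).

    Returns the updated mean and covariance of x given y.
    """
    S = H @ V @ H.T + R                    # covariance of the observations
    VHt = V @ H.T                          # prior cross-covariance of x and y
    # Solve S A = H V instead of forming S^{-1} explicitly.
    A = np.linalg.solve(S, VHt.T)
    m_post = m + A.T @ (y - H @ m)         # m + V H^T S^{-1} (y - H m)
    V_post = V - VHt @ A                   # V - V H^T S^{-1} H V
    return m_post, V_post

# Tiny example: two-component state, only the first component is observed.
m = np.zeros(2)
V = np.array([[1.0, 0.5],
              [0.5, 1.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
y = np.array([1.0])
m_post, V_post = gaussian_update(m, V, H, R, y)
```

Even in this small case the unobserved second component is updated through its prior correlation with the first, which is the kind of information sharing across the state vector that the abstract refers to.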