Source author record

Jonathan Rougier

Jonathan Rougier appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Topics: Applications, Computation, math.ST, Statistics Theory, Methodology, physics.soc-ph

Catalog footprint

What is connected

7 works

6 topics

4 close collaborators

Preprint, 2020, arXiv

Multi-Scale Process Modelling and Distributed Computation for Spatial Data

Recent years have seen a huge development in spatial modelling and prediction methodology, driven by the increased availability of remote-sensing data and the reduced cost of distributed-processing technology. It is well known that modelling and prediction using infinite-dimensional process models is not possible with large data sets, and that both approximate models and, often, approximate-inference methods are needed. The problem of fitting simple global spatial models to large data sets has been solved through the likes of multi-resolution approximations and nearest-neighbour techniques. Here we tackle the next challenge, that of fitting complex, nonstationary, multi-scale models to large data sets. We propose doing this through the use of superpositions of spatial processes with increasing spatial scale and increasing degrees of nonstationarity. Computation is facilitated through the use of Gaussian Markov random fields and parallel Markov chain Monte Carlo based on graph colouring. The resulting model allows for both distributed computing and distributed data. Importantly, it provides opportunities for genuine model and data scalability and yet is still able to borrow strength across large spatial scales. We illustrate a two-scale version on a data set of sea-surface temperature containing on the order of one million observations, and compare our approach to state-of-the-art spatial modelling and prediction methods.
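The abstract names parallel MCMC based on graph colouring without giving details. The sketch below is only an illustration of the general idea, not the authors' implementation: nodes of a Gaussian Markov random field that share a colour have no edges between them, so their full conditionals can be sampled independently. The 4-neighbour lattice, the precision structure (diagonal `kappa` plus node degree, off-diagonals of -1), and all function names are assumptions made for this example.

```python
import random

def grid_neighbours(n_rows, n_cols):
    """Adjacency lists for a 4-neighbour lattice, keyed by (row, col)."""
    nbrs = {}
    for i in range(n_rows):
        for j in range(n_cols):
            cand = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
            nbrs[(i, j)] = [(a, b) for a, b in cand
                            if 0 <= a < n_rows and 0 <= b < n_cols]
    return nbrs

def greedy_colouring(nbrs):
    """Give each node the smallest colour unused by its already-coloured neighbours."""
    colour = {}
    for node in nbrs:
        used = {colour[m] for m in nbrs[node] if m in colour}
        c = 0
        while c in used:
            c += 1
        colour[node] = c
    return colour

def colour_classes(colour):
    """Group nodes by colour; each group is a set of mutually non-adjacent nodes."""
    classes = {}
    for node, c in colour.items():
        classes.setdefault(c, []).append(node)
    return [classes[c] for c in sorted(classes)]

def gibbs_sweep(x, nbrs, classes, kappa, rng):
    """One Gibbs sweep for a zero-mean GMRF with Q_ii = kappa + deg(i), Q_ij = -1."""
    for nodes in classes:
        # Nodes in one class share no edges, so these draws are conditionally
        # independent and could be sent to separate workers. Here they run serially.
        for node in nodes:
            prec = kappa + len(nbrs[node])
            mean = sum(x[m] for m in nbrs[node]) / prec
            x[node] = rng.gauss(mean, prec ** -0.5)
    return x

nbrs = grid_neighbours(4, 4)
colour = greedy_colouring(nbrs)
classes = colour_classes(colour)

rng = random.Random(0)
x = {node: 0.0 for node in nbrs}
for _ in range(100):
    gibbs_sweep(x, nbrs, classes, kappa=0.5, rng=rng)
```

On a regular lattice visited in row-major order the greedy rule recovers the chequerboard pattern, so each sweep needs only two parallel stages regardless of grid size.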

Preprint, 2020, arXiv

The exact form of the 'Ockham factor' in model selection

We explore the arguments for maximizing the 'evidence' as an algorithm for model selection. We show, using a new definition of model complexity which we term 'flexibility', that maximizing the evidence should appeal to both Bayesian and Frequentist statisticians. This is due to flexibility's unique position in the exact decomposition of log-evidence into log-fit minus flexibility. In the Gaussian linear model, flexibility is asymptotically equal to the Bayesian Information Criterion (BIC) penalty, but we caution against using BIC in place of flexibility for model selection.
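The abstract does not define log-fit or flexibility. One well-known exact identity with the same shape follows from Bayes' theorem, and is written out below for orientation. Whether it coincides with the authors' definitions is an assumption, as is the notation: $p(y \mid \theta)$ for the likelihood, $\pi(\theta)$ for the prior, $\pi(\theta \mid y)$ for the posterior.

```latex
% Bayes' theorem, rearranged, holds for every \theta with \pi(\theta \mid y) > 0:
\log p(y) = \log p(y \mid \theta) + \log \pi(\theta) - \log \pi(\theta \mid y).

% Averaging both sides over the posterior gives an exact decomposition:
\log p(y)
  = \underbrace{\mathbb{E}_{\theta \mid y}\bigl[\log p(y \mid \theta)\bigr]}_{\text{a fit term}}
  \;-\;
  \underbrace{\mathrm{KL}\bigl(\pi(\cdot \mid y)\,\big\|\,\pi(\cdot)\bigr)}_{\text{a complexity penalty}} .
```

Under this reading the penalty is non-negative and grows as the data move the posterior further from the prior.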

Preprint, 2015, arXiv

Exchangeability, the 'Histogram Theorem', and population inference

Some practical results are derived for population inference based on a sample, under the two qualitative conditions of 'ignorability' and exchangeability. These are the 'Histogram Theorem', for predicting the outcome of a non-sampled member of the population, and its application to inference about the population, both without and with groups. There are discussions of parametric versus non-parametric models, and different approaches to marginalisation. An Appendix gives a self-contained proof of the Representation Theorem for finite exchangeable sequences.

Preprint, 2015, arXiv

Rapidly bounding the exceedance probabilities of high aggregate losses

We consider the task of assessing the right-hand tail of an insurer's loss distribution for some specified period, such as a year. We present and analyse six different approaches: four upper bounds, and two approximations. We examine these approaches under a variety of conditions, using a large event loss table for US hurricanes. For its combination of tightness and computational speed, we favour the Moment bound. We also consider the appropriate size of Monte Carlo simulations, and the imposition of a cap on single event losses. We strongly favour the Gamma distribution as a flexible model for single event losses, for its tractable form in all of the methods we analyse, its generalisability, and because of the ease with which a cap on losses can be incorporated.
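The abstract does not spell out how the Moment bound is built. The sketch below takes it to mean Markov's inequality applied to powers of the aggregate loss S, minimised over integer orders k: P(S >= s) <= min_k E[S^k] / s^k. That reading, the single Poisson event rate, the single Gamma severity, and every function and parameter name are assumptions made for illustration. The paper works from an event loss table for US hurricanes that is not reproduced here.

```python
import math

def gamma_raw_moment(n, shape, scale):
    """E[X^n] for X ~ Gamma(shape, scale): scale^n * Gamma(shape + n) / Gamma(shape)."""
    return scale ** n * math.exp(math.lgamma(shape + n) - math.lgamma(shape))

def compound_poisson_moments(rate, shape, scale, max_order):
    """Raw moments E[S^0..S^max_order] of a compound Poisson sum with Gamma severities."""
    # Cumulants of a compound Poisson sum: kappa_n = rate * E[X^n].
    kappa = [0.0] + [rate * gamma_raw_moment(n, shape, scale)
                     for n in range(1, max_order + 1)]
    # Standard cumulant-to-moment recursion:
    # m_n = sum_{k=0}^{n-1} C(n-1, k) * kappa_{k+1} * m_{n-1-k}.
    m = [1.0]
    for n in range(1, max_order + 1):
        m.append(sum(math.comb(n - 1, k) * kappa[k + 1] * m[n - 1 - k]
                     for k in range(n)))
    return m

def moment_bound(s, moments):
    """Smallest Markov-type bound E[S^k] / s^k over the available orders k >= 1."""
    best = min(m / s ** k for k, m in enumerate(moments) if k >= 1)
    return min(1.0, best)

# Hypothetical portfolio: on average 2 events per year, each loss ~ Gamma(2, 1).
moments = compound_poisson_moments(rate=2.0, shape=2.0, scale=1.0, max_order=20)
threshold = 20.0
bound = moment_bound(threshold, moments)
markov = moments[1] / threshold  # the k = 1 case, for comparison
print(f"Markov bound: {markov:.4f}, moment bound: {bound:.6f}")
```

Because the k = 1 term is Markov's inequality itself, the minimised bound can never be looser than it. The closed-form Gamma moments are what keep the calculation cheap in this simplified setting.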

Preprint, 2014, arXiv

Computation and Visualisation for large-scale Gaussian updates

In geostatistics, and also in other applications in science and engineering, we are now performing updates on Gaussian process models with many thousands or even millions of components. These large-scale inferences involve computational challenges, because the updating equations cannot be solved as written, owing to the size and cost of the matrix operations. They also involve representational challenges, to account for judgements of heterogeneity concerning the underlying fields, and diverse sources of observations. Diagnostics are particularly valuable in this situation. We present a diagnostic and visualisation tool for large-scale Gaussian updates, the 'medal plot'. This shows the updated uncertainty for each observation, and also summarises the sharing of information across observations, as a proxy for the sharing of information across the state vector. It allows us to 'sanity-check' the code implementing the update, but it can also reveal unexpected features in our modelling. We discuss computational issues for large-scale updates, and we illustrate with an application to assess mass trends in the Antarctic Ice Sheet.

Preprint, 2014, arXiv

Uncertainty in climate science and climate policy

This essay, written by a statistician and a climate scientist, describes our view of the gap that exists between current practice in mainstream climate science, and the practical needs of policymakers charged with exploring possible interventions in the context of climate change. By 'mainstream' we mean the type of climate science that dominates in universities and research centres, which we will term 'academic' climate science, in contrast to 'policy' climate science; aspects of this distinction will become clearer in what follows. In a nutshell, we do not think that academic climate science equips climate scientists to be as helpful as they might be, when involved in climate policy assessment. Partly, we attribute this to an over-investment in high resolution climate simulators, and partly to a culture that is uncomfortable with the inherently subjective nature of climate uncertainty.

Preprint, 2011, arXiv

Discussion of: A statistical analysis of multiple temperature proxies: Are reconstructions of surface temperatures over the last 1000 years reliable?

Discussion of "A statistical analysis of multiple temperature proxies: Are reconstructions of surface temperatures over the last 1000 years reliable?" by B.B. McShane and A.J. Wyner [arXiv:1104.4002]