Source author record

Håvard Rue

Håvard Rue appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Methodology · Computation · Applications · math.ST · Statistics Theory · Distributed, Parallel, and Cluster Computing · Machine Learning

Catalog footprint

What is connected

34 works

7 topics

4 close collaborators

preprint 2022 arXiv

An Extended Simplified Laplace strategy for Approximate Bayesian inference of Latent Gaussian Models using R-INLA

Various computational challenges arise when applying Bayesian inference approaches to complex hierarchical models. Sampling-based inference methods, such as Markov Chain Monte Carlo strategies, are renowned for providing accurate results but with high computational costs and slow or questionable convergence. In contrast, approximate methods like the Integrated Nested Laplace Approximation (INLA) construct a deterministic approximation to the univariate posteriors through nested Laplace Approximations. This method enables fast inference performance in Latent Gaussian Models, which encode a large class of hierarchical models. R-INLA software mainly consists of three strategies to compute all the required posterior approximations depending on the accuracy requirements. The Simplified Laplace approximation (SLA) is the most attractive because of its speed, since it is based on a Taylor expansion up to order three of a full Laplace Approximation. Here we enhance the methodology by simplifying the computations necessary for the skewness and modal configuration. Then we propose an expansion up to order four and use the Extended Skew Normal distribution as a new parametric fit. The resulting approximations to the marginal posterior densities are more accurate than those calculated with the SLA, with essentially no additional cost.

preprint 2022 arXiv

Joint Modeling and Prediction of Massive Spatio-Temporal Wildfire Count and Burnt Area Data with the INLA-SPDE Approach

This paper describes the methodology used by the team RedSea in the data competition organized for the EVA 2021 conference. We develop a novel two-part model to jointly describe the wildfire count data and burnt area data provided by the competition organizers with covariates. Our proposed methodology relies on the integrated nested Laplace approximation combined with the stochastic partial differential equation (INLA-SPDE) approach. In the first part, a binary non-stationary spatio-temporal model is used to describe the underlying process that determines whether or not there is wildfire at a specific time and location. In the second part, we consider a non-stationary model that is based on log-Gaussian Cox processes for positive wildfire count data, and a non-stationary log-Gaussian model for positive burnt area data. Dependence between the positive count data and positive burnt area data is captured by a shared spatio-temporal random effect. Our two-part modeling approach performs well in terms of the prediction score criterion chosen by the data competition organizers. Moreover, our model results show that surface pressure is the most influential driver for the occurrence of a wildfire, whilst surface net solar radiation and surface pressure are the key drivers for large numbers of wildfires, and temperature and evaporation are the key drivers of large burnt areas.

preprint 2022 arXiv

Joint Quantile Disease Mapping with Application to Malaria and G6PD Deficiency

Statistical analysis based on quantile regression methods is more comprehensive, flexible, and less sensitive to outliers when compared to mean regression methods. When the link between different diseases is of interest, joint disease mapping is useful for measuring directional correlation between them. Most studies examine this link through multiple correlated mean regressions. In this paper we propose a joint quantile regression framework for multiple diseases where different quantile levels can be considered. We are motivated by the theorized link between the presence of Malaria and the gene deficiency G6PD, where medical scientists have anecdotally discovered a possible link between high levels of G6PD and lower than expected levels of Malaria, initially pointing towards the occurrence of G6PD inhibiting the occurrence of Malaria. This link cannot be investigated with mean regressions, hence the need for flexible joint quantile regression in a disease mapping framework. Our joint quantile disease mapping model can be used for linear and non-linear effects of covariates by stochastic splines, since we define it as a latent Gaussian model. We perform Bayesian inference of this model using the INLA framework embedded in the R software package INLA. Finally, we illustrate the applicability of the model by analyzing the malaria and G6PD deficiency incidences in 21 African countries using linked quantiles of different levels.

preprint 2022 arXiv

Parallelized integrated nested Laplace approximations for fast Bayesian inference

There is a growing demand for performing larger-scale Bayesian inference tasks, arising from greater data availability and higher-dimensional model parameter spaces. In this work we present parallelization strategies for the methodology of integrated nested Laplace approximations (INLA), a popular framework for performing approximate Bayesian inference on the class of Latent Gaussian models. Our approach makes use of nested OpenMP parallelism, a parallel line search procedure using robust regression in INLA's optimization phase and the state-of-the-art sparse linear solver PARDISO. We leverage mutually independent function evaluations in the algorithm as well as advanced sparse linear algebra techniques. This way we can flexibly utilize the power of today's multi-core architectures. We demonstrate the performance of our new parallelization scheme on a number of different real-world applications. The introduction of parallelism leads to speedups of a factor of 10 or more for all larger models. Our work is already integrated in the current version of the open-source R-INLA package, making its improved performance conveniently available to all users.

preprint 2022 arXiv

Practical strategies for GEV-based regression models for extremes

The generalised extreme value (GEV) distribution is a three-parameter family that describes the asymptotic behaviour of properly renormalised maxima of a sequence of independent and identically distributed random variables. If the shape parameter $ξ$ is zero, the GEV distribution has unbounded support, whereas if $ξ$ is positive, the limiting distribution is heavy-tailed with infinite upper endpoint but finite lower endpoint. In practical applications, we assume that the GEV family is a reasonable approximation for the distribution of maxima over blocks, and we fit it accordingly. This implies that GEV properties, such as finite lower endpoint in the case $ξ>0$, are inherited by the finite-sample maxima, which might not have bounded support. This is particularly problematic when predicting extreme observations based on multiple and interacting covariates. To tackle this usually overlooked issue, we propose a blended GEV distribution, which smoothly combines the left tail of a Gumbel distribution (GEV with $ξ=0$) with the right tail of a Fréchet distribution (GEV with $ξ>0$) and, therefore, has unbounded support. Using a Bayesian framework, we reparametrise the GEV distribution to offer a more natural interpretation of the (possibly covariate-dependent) model parameters. Independent priors over the new location and spread parameters induce a joint prior distribution for the original location and scale parameters. We introduce the concept of property-preserving penalised complexity (P$^3$C) priors and apply it to the shape parameter to preserve first and second moments. We illustrate our methods with an application to NO$_2$ pollution levels in California, which reveals the robustness of the bGEV distribution, as well as the suitability of the new parametrisation and the P$^3$C prior framework.
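The support issue described in this abstract can be illustrated with the standard GEV distribution function. The sketch below is not the blended GEV proposed in the paper; it only evaluates the classical CDF in its usual location/scale/shape form, and the function names `gev_cdf` and `gev_lower_endpoint` are hypothetical.

```python
import math

def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """Standard GEV CDF with location mu, scale sigma > 0 and shape xi."""
    z = (x - mu) / sigma
    if xi == 0.0:
        log_t = -z  # Gumbel limit: support is the whole real line
    else:
        s = 1.0 + xi * z
        if s <= 0.0:
            # Outside the support: below the lower endpoint when xi > 0,
            # above the upper endpoint when xi < 0.
            return 0.0 if xi > 0 else 1.0
        log_t = -math.log(s) / xi
    if log_t > 700.0:
        # exp(log_t) would overflow; the CDF is numerically zero here.
        return 0.0
    return math.exp(-math.exp(log_t))

def gev_lower_endpoint(mu, sigma, xi):
    """Finite lower endpoint mu - sigma/xi when xi > 0, otherwise -infinity."""
    return mu - sigma / xi if xi > 0 else -math.inf

# With xi = 0.5 the support stops at -2, so x = -2.5 has probability zero,
# while the Gumbel case (xi = 0) still assigns it positive probability.
print(gev_lower_endpoint(0.0, 1.0, 0.5))
print(gev_cdf(-2.5, xi=0.5), gev_cdf(-2.5, xi=0.0))
```

A fitted model with $ξ>0$ therefore treats any observation below the lower endpoint as impossible, which is the behaviour the abstract describes as problematic for prediction.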

preprint 2022 arXiv

Variance partitioning in spatio-temporal disease mapping models

Bayesian disease mapping, although undeniably useful for describing variation in risk over time and space, comes with the hurdle of prior elicitation on hard-to-interpret random effect precision parameters. We introduce a reparametrized version of the popular spatio-temporal interaction models, based on Kronecker product intrinsic Gaussian Markov Random Fields, that we name the variance partitioning (VP) model. The VP model includes a mixing parameter that balances the contribution of the main and interaction effects to the total (generalized) variance and enhances interpretability. The use of a penalized complexity prior on the mixing parameter aids in coding prior information in an intuitive way. We illustrate the advantages of the VP model using two case studies.

preprint 2021 arXiv

Fast Bayesian inference of Block Nearest Neighbor Gaussian process for large data

This paper presents the development of a spatial block-Nearest Neighbor Gaussian process (block-NNGP) for location-referenced large spatial data. The key idea behind this approach is to divide the spatial domain into several blocks which are dependent under some constraints. The cross-blocks capture the large-scale spatial dependence, while each block captures the small-scale spatial dependence. The resulting block-NNGP enjoys Markov properties reflected in its sparse precision matrix. It is embedded as a prior within the class of latent Gaussian models, so Bayesian inference is obtained using the integrated nested Laplace approximation (INLA). The performance of the block-NNGP is illustrated on simulated examples and massive real data for locations on the order of $10^4$.

preprint 2021 arXiv

Importance Sampling with the Integrated Nested Laplace Approximation

The Integrated Nested Laplace Approximation (INLA) is a deterministic approach to Bayesian inference on latent Gaussian models (LGMs) and focuses on fast and accurate approximation of posterior marginals for the parameters in the models. Recently, methods have been developed to extend this class of models to those that can be expressed as conditional LGMs by fixing some of the parameters in the models to descriptive values. These methods differ in the manner descriptive values are chosen. This paper proposes to combine importance sampling with INLA (IS-INLA), and extends this approach with the more robust adaptive multiple importance sampling algorithm combined with INLA (AMIS-INLA). This paper gives a comparison between these approaches and existing methods on a series of applications with simulated and observed datasets and evaluates their performance based on accuracy, efficiency, and robustness. The approaches are validated by exact posteriors in a simple bivariate linear model; then, they are applied to a Bayesian lasso model, a Bayesian imputation of missing covariate values, and lastly, in parametric Bayesian quantile regression. The applications show that the AMIS-INLA approach, in general, outperforms the other methods, but the IS-INLA algorithm could be considered for faster inference when good proposals are available.

preprint 2020 arXiv

Efficient Quantile Tracking Using an Oracle

For incremental quantile estimators the step size and possibly other tuning parameters must be carefully set. However, little attention has been given to how to set these values in an online manner. In this article we suggest two novel procedures that address this issue. The core part of the procedures is to estimate the current tracking mean squared error (MSE). The MSE is decomposed into tracking variance and bias, and novel and efficient procedures to estimate these quantities are presented. It is shown that estimation bias can be tracked by associating it with the portion of observations below the quantile estimates. The first procedure runs an ensemble of $L$ quantile estimators for a wide range of values of the tuning parameters, typically around $L = 100$. In each iteration an oracle selects the best estimate by the guidance of the estimated MSEs. The second method only runs an ensemble of $L = 3$ estimators, and thus the values of the tuning parameters need to be adjusted from time to time for the running estimators. The procedures have a low memory footprint of $8L$ and a computational complexity of $8L$ per iteration. The experiments show that the procedures are highly efficient and track quantiles with an error close to the theoretical optimum. The Oracle approach performs best, but comes with higher computational cost. The procedures were further applied to a massive real-life data stream of tweets, which demonstrated their real-world applicability.
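To make the role of the step size concrete, here is a minimal sketch of a textbook incremental quantile estimator with a fixed step. It is an assumption-laden illustration, not the oracle or ensemble procedure from the paper: the update rule is the generic stochastic-approximation one, and the name `track_quantile` and its parameters are hypothetical. The function also reports the share of observations that fell at or below the running estimate, the kind of quantity the abstract associates with estimation bias.

```python
import random

def track_quantile(stream, q, step, init=0.0):
    """Track the q-quantile of a stream with a fixed-step incremental update.

    Returns the final estimate and the fraction of observations that fell
    at or below the running estimate at the time they arrived.
    """
    est = init
    below = 0
    n = 0
    for x in stream:
        n += 1
        if x <= est:
            below += 1
            est -= step * (1.0 - q)  # observation below estimate: move down
        else:
            est += step * q          # observation above estimate: move up
    return est, (below / n if n else float("nan"))

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(20000)]
est, frac_below = track_quantile(data, q=0.9, step=0.01)
print(est, frac_below)  # estimate should sit near the N(0,1) 0.9-quantile (about 1.28)
```

A larger step makes the estimate react faster to a changing distribution but increases its variance, and a smaller step does the opposite; that is the tuning trade-off that motivates choosing the step size online.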

preprint 2020 arXiv

Estimating Tukey Depth Using Incremental Quantile Estimators

The concept of depth represents methods to measure how deep an arbitrary point is positioned in a dataset and can be seen as the opposite of outlyingness. It has proved very useful and a wide range of methods have been developed based on the concept. To address the well-known computational challenges associated with the depth concept, we suggest estimating Tukey depth contours using recently developed incremental quantile estimators. The suggested algorithm can estimate depth contours when the dataset is known in advance, but also recursively update and even track Tukey depth contours for dynamically varying data stream distributions. Tracking was demonstrated in a real-life data example where changes in human activity were detected in real-time from accelerometer observations.

preprint 2016 arXiv

An intuitive Bayesian spatial model for disease mapping that accounts for scaling

In recent years, disease mapping studies have become a routine application within geographical epidemiology and are typically analysed within a Bayesian hierarchical model formulation. A variety of model formulations for the latent level have been proposed but all come with inherent issues. In the classical BYM model, the spatially structured component cannot be seen independently from the unstructured component. This makes prior definitions for the hyperparameters of the two random effects challenging. There are alternative model formulations that address this confounding; however, the issue of how to choose interpretable hyperpriors is still unsolved. Here, we discuss a recently proposed parameterisation of the BYM model that leads to improved parameter control as the hyperparameters can be seen independently from each other. Furthermore, the need for a scaled spatial component is addressed, which facilitates assignment of interpretable hyperpriors and makes these transferable between spatial applications with different graph structures. We provide implementation details for the new model formulation which preserve sparsity properties, and we investigate systematically the model performance and compare it to existing parameterisations. Through a simulation study, we show that the new model performs well, both showing good learning abilities and good shrinkage behaviour. In terms of model choice criteria, the proposed model performs at least equally well as existing parameterisations, but only the new formulation offers parameters that are interpretable and hyperpriors that have a clear meaning.

preprint 2016 arXiv

Bayesian Computing with INLA: A Review

The key operation in Bayesian inference is to compute high-dimensional integrals. An old approximate technique is the Laplace method or approximation, which dates back to Pierre-Simon Laplace (1774). This simple idea approximates the integrand with a second order Taylor expansion around the mode and computes the integral analytically. By developing a nested version of this classical idea, combined with modern numerical techniques for sparse matrices, we obtain the approach of Integrated Nested Laplace Approximations (INLA) to do approximate Bayesian inference for latent Gaussian models (LGMs). LGMs represent an important model-abstraction for Bayesian inference and include a large proportion of the statistical models used today. In this review, we will discuss the reasons for the success of the INLA-approach, the R-INLA package, why it is so accurate, why the approximations are very quick to compute and why LGMs make such a useful concept for Bayesian computing.
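The classical Laplace method mentioned in this abstract can be sketched in one dimension: expand the log-integrand to second order around its mode and integrate the resulting Gaussian analytically. The code below is only that classical one-dimensional idea, not the nested scheme used by INLA; the function name `laplace_approx` and the finite-difference step `h` are assumptions made for the illustration.

```python
import math

def laplace_approx(log_f, mode, h=1e-4):
    """Approximate the integral of exp(log_f(x)) dx by the Laplace method.

    log_f : log of a positive, unimodal integrand
    mode  : location of its maximum (assumed known here)
    h     : step for the central finite-difference second derivative
    """
    d2 = (log_f(mode + h) - 2.0 * log_f(mode) + log_f(mode - h)) / h**2
    if d2 >= 0.0:
        raise ValueError("log_f must be concave at the supplied mode")
    # Integral of exp(log_f(mode) + 0.5 * d2 * (x - mode)^2) over the real line.
    return math.exp(log_f(mode)) * math.sqrt(2.0 * math.pi / -d2)

# Example: Gamma(a + 1) is the integral of x^a * exp(-x) over x > 0.
# The log-integrand a*log(x) - x has its mode at x = a, and the Laplace
# approximation reproduces Stirling's formula.
a = 10.0
approx = laplace_approx(lambda x: a * math.log(x) - x, mode=a)
exact = math.gamma(a + 1.0)
print(approx / exact)  # slightly below 1
```

The approximation is exact when the integrand is itself Gaussian, which gives some intuition for why models with a Gaussian latent layer are a natural setting for it.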

preprint 2016 arXiv

Fractional Gaussian noise: Prior specification and model comparison

Fractional Gaussian noise (fGn) is a self-similar stochastic process used to model anti-persistent or persistent dependency structures in observed time series. Properties of the autocovariance function of fGn are characterised by the Hurst exponent (H), which in Bayesian contexts typically has been assigned a uniform prior on the unit interval. This paper argues why a uniform prior is unreasonable and introduces the use of a penalised complexity (PC) prior for H. The PC prior is computed to penalise divergence from the special case of white noise, and is invariant to reparameterisations. An immediate advantage is that the exact same prior can be used for the autocorrelation coefficient of a first-order autoregressive process AR(1), as this model also reflects a flexible version of white noise. Within the general setting of latent Gaussian models, this allows us to compare an fGn model component with AR(1) using Bayes factors, avoiding confounding effects of prior choices for the hyperparameters. Among other things, this is useful in climate regression models where inference for underlying linear or smooth trends depends heavily on the assumed noise model.
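The way the Hurst exponent controls the dependence structure can be seen from the standard fGn autocovariance function, $γ(k) = \frac{σ^2}{2}\left(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\right)$. The sketch below evaluates this textbook formula; it does not implement the PC prior from the paper, and the function name `fgn_autocov` is hypothetical.

```python
def fgn_autocov(k, hurst, sigma2=1.0):
    """Autocovariance of fractional Gaussian noise at integer lag k."""
    k = abs(k)
    p = 2.0 * hurst
    return 0.5 * sigma2 * ((k + 1) ** p - 2.0 * k ** p + abs(k - 1) ** p)

# H = 0.5 is white noise (zero covariance at all non-zero lags),
# H > 0.5 gives persistence (positive lag-1 covariance),
# H < 0.5 gives anti-persistence (negative lag-1 covariance).
for h in (0.3, 0.5, 0.8):
    print(h, [round(fgn_autocov(k, h), 4) for k in range(4)])
```

White noise at H = 0.5 is the base model from which the abstract's PC prior penalises divergence.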

preprint 2016 arXiv

Penalised complexity priors for stationary autoregressive processes

The autoregressive process of order $p$ (AR($p$)) is a central model in time series analysis. A Bayesian approach requires the user to define a prior distribution for the coefficients of the AR($p$) model. Although it is easy to write down some prior, it is not at all obvious how to understand and interpret the prior, to ensure that it behaves according to the user's prior knowledge. In this paper, we approach this problem using the recently developed ideas of penalised complexity (PC) priors. These priors have important properties like robustness and invariance to reparameterisations, as well as a clear interpretation. A PC prior is computed based on specific principles, where model component complexity is penalised in terms of deviation from simple base model formulations. In the AR(1) case, we discuss two natural base model choices, corresponding to either independence in time or no change in time. The latter case is illustrated in a survival model with possible time-dependent frailty. For higher-order processes, we propose a sequential approach, where the base model for AR($p$) is the corresponding AR($p-1$) model expressed using the partial autocorrelations. The properties of the new prior are compared with the reference prior in a simulation study.

preprint 2015 arXiv

Bayesian bivariate meta-analysis of diagnostic test studies with interpretable priors

In a bivariate meta-analysis the number of diagnostic studies involved is often very low so that frequentist methods may result in problems. Bayesian inference is attractive as informative priors that add a small amount of information can stabilise the analysis without overwhelming the data. However, Bayesian analysis is often computationally demanding and the selection of the prior for the covariance matrix of the bivariate structure is crucial with little data. The integrated nested Laplace approximations (INLA) method provides an efficient solution to the computational issues by avoiding any sampling, but the important question of priors remains. We explore the penalised complexity (PC) prior framework for specifying informative priors for the variance parameters and the correlation parameter. PC priors facilitate model interpretation and hyperparameter specification as expert knowledge can be incorporated intuitively. We conduct a simulation study to compare the properties and behaviour of differently defined PC priors to currently used priors in the field. The simulation study shows that the use of PC priors results in more precise estimates when specified in a sensible neighbourhood around the truth. To investigate the usage of PC priors in practice we reanalyse a meta-analysis using the telomerase marker for the diagnosis of bladder cancer.

preprint 2015 arXiv

Beyond the Valley of the Covariance Function

Discussion of "Cross-Covariance Functions for Multivariate Geostatistics" by Genton and Kleiber [arXiv:1507.08017].

preprint 2015 arXiv

Does non-stationary spatial data always require non-stationary random fields?

A stationary spatial model is an idealization and we expect that the true dependence structures of physical phenomena are spatially varying, but how should we handle this non-stationarity in practice? We study the challenges involved in applying a flexible non-stationary model to a dataset of annual precipitation in the conterminous US, where exploratory data analysis shows strong evidence of a non-stationary covariance structure. The aim of this paper is to investigate the modelling pipeline once non-stationarity has been detected in spatial data. We show that there is a real danger of over-fitting the model and that careful modelling is necessary in order to properly account for varying second-order structure. In fact, the example shows that sometimes non-stationary Gaussian random fields are not necessary to model non-stationary spatial data.

preprint 2015 arXiv

Going off grid: Computationally efficient inference for log-Gaussian Cox processes

This paper introduces a new method for performing computational inference on log-Gaussian Cox processes. The likelihood is approximated directly by making novel use of a continuously specified Gaussian random field. We show that for sufficiently smooth Gaussian random field prior distributions, the approximation can converge with arbitrarily high order, while an approximation based on a counting process on a partition of the domain only achieves first-order convergence. The given results improve on the general theory of convergence of the stochastic partial differential equation models, introduced by Lindgren et al. (2011). The new method is demonstrated on a standard point pattern data set and two interesting extensions to the classical log-Gaussian Cox process framework are discussed. The first extension considers variable sampling effort throughout the observation window and implements the method of Chakraborty et al. (2011). The second extension constructs a log-Gaussian Cox process on the world's oceans. The analysis is performed using integrated nested Laplace approximation for fast approximate inference.

preprint 2015 arXiv

Improving the INLA approach for approximate Bayesian inference for latent Gaussian models

We introduce a new copula-based correction for generalized linear mixed models (GLMMs) within the integrated nested Laplace approximation (INLA) approach for approximate Bayesian inference for latent Gaussian models. While INLA is usually very accurate, some (rather extreme) cases of GLMMs with e.g. binomial or Poisson data have been seen to be problematic. Inaccuracies can occur when there is a very low degree of smoothing or "borrowing strength" within the model, and we have therefore developed a correction aiming to push the boundaries of the applicability of INLA. Our new correction has been implemented as part of the R-INLA package, and adds only negligible computational cost. Empirical evaluations on both real and simulated data indicate that the method works well.

preprint 2015 arXiv

Penalising model component complexity: A principled, practical approach to constructing priors

In this paper, we introduce a new concept for constructing prior distributions. We exploit the natural nested structure inherent to many model components, which defines the model component to be a flexible extension of a base model. Proper priors are defined to penalise the complexity induced by deviating from the simpler base model and are formulated after the input of a user-defined scaling parameter for that model component, both in the univariate and the multivariate case. These priors are invariant to reparameterisations, have a natural connection to Jeffreys' priors, are designed to support Occam's razor and seem to have excellent robustness properties, all of which are highly desirable and allow us to use this approach to define default prior distributions. Through examples and theoretical results, we demonstrate the appropriateness of this approach and how it can be applied in various situations.
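As a hedged illustration of how a user-defined scaling enters such a prior, consider a commonly used instance that the abstract itself does not spell out: the PC prior for the precision $τ$ of a Gaussian random effect, which amounts to an exponential prior with rate $λ$ on the standard deviation $σ = τ^{-1/2}$. The scaling is set through a tail statement $P(σ > U) = α$, giving $λ = -\ln(α)/U$. The function names below are hypothetical.

```python
import math

def pc_prec_rate(u, alpha):
    """Rate lambda such that P(sigma > u) = alpha under sigma ~ Exp(lambda)."""
    return -math.log(alpha) / u

def pc_prec_density(tau, lam):
    """Density of the precision tau implied by sigma = tau**-0.5 ~ Exp(lam)."""
    return 0.5 * lam * tau ** -1.5 * math.exp(-lam / math.sqrt(tau))

# Scaling statement: the standard deviation exceeds 1 with probability 0.01.
lam = pc_prec_rate(u=1.0, alpha=0.01)
print(lam)                         # rate of the exponential prior on sigma
print(pc_prec_density(4.0, lam))   # prior density at precision tau = 4
```

The base model here is the absence of the random effect ($σ = 0$), and the exponential decay on $σ$ is what penalises departures from it.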

preprint 2015 arXiv

Spatial Modelling of Temperature and Humidity using Systems of Stochastic Partial Differential Equations

This work is motivated by constructing a weather simulator for precipitation. Temperature and humidity are two of the most important driving forces of precipitation, and the strategy is to have a stochastic model for temperature and humidity, and use a deterministic model to go from these variables to precipitation. Temperature and humidity are empirically positively correlated. Generally speaking, if variables are empirically dependent, then multivariate models should be considered. In this work we model humidity and temperature in southern Norway. We want to construct bivariate Gaussian random fields (GRFs) based on this dataset. The aim of our work is to use the bivariate GRFs to capture both the dependence structure between humidity and temperature as well as their spatial dependencies. One important feature of the dataset is that the humidity and temperature are not necessarily observed at the same locations. Both univariate and bivariate spatial models are fitted and compared. For modeling and inference the SPDE approach for univariate models and the systems of SPDEs approach for multivariate models have been used. To evaluate the difference in performance between the univariate and bivariate models, we compare predictive performance using some commonly used scoring rules: mean absolute error, mean-square error and continuous ranked probability score. The results illustrate that we can capture strong positive correlation between the temperature and the humidity. Furthermore, the results also agree with the physical or empirical knowledge. Finally, we conclude that using the bivariate GRFs to model this dataset is superior to the approach with independent univariate GRFs both when evaluating point predictions and for quantifying prediction uncertainty.
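Of the scoring rules named in this abstract, the continuous ranked probability score (CRPS) is the one that rewards a well-calibrated predictive distribution rather than only a point prediction. For a Gaussian predictive distribution it has a well-known closed form, sketched below. The function names are hypothetical, and the Gaussian form is an assumption made for the illustration rather than a statement about how the paper computed its scores.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def crps_gaussian(y, mu, sigma):
    """CRPS of a N(mu, sigma^2) predictive distribution at observation y.

    Lower is better; as sigma shrinks to zero it approaches |y - mu|.
    """
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * norm_cdf(z) - 1.0)
                    + 2.0 * norm_pdf(z)
                    - 1.0 / math.sqrt(math.pi))

# Same point prediction, different stated uncertainty: an overconfident
# forecast is penalised when the observation lands far from the mean.
print(crps_gaussian(y=2.0, mu=0.0, sigma=0.1))
print(crps_gaussian(y=2.0, mu=0.0, sigma=1.5))
```

Mean absolute error and mean-square error would score these two forecasts identically, which is why a score like CRPS is needed to compare how well models quantify prediction uncertainty.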

preprint 2014 arXiv

Exploring a New Class of Non-stationary Spatial Gaussian Random Fields with Varying Local Anisotropy

Gaussian random fields (GRFs) constitute an important part of spatial modelling, but can be computationally infeasible for general covariance structures. An efficient approach is to specify GRFs via stochastic partial differential equations (SPDEs) and derive Gaussian Markov random field (GMRF) approximations of the solutions. We consider the construction of a class of non-stationary GRFs with varying local anisotropy, where the local anisotropy is introduced by allowing the coefficients in the SPDE to vary with position. This is done by using a form of diffusion equation driven by Gaussian white noise with a spatially varying diffusion matrix. This allows for the introduction of parameters that control the GRF by parametrizing the diffusion matrix. These parameters and the GRF may be considered to be part of a hierarchical model and the parameters estimated in a Bayesian framework. The results show that the use of an SPDE with non-constant coefficients is a promising way of creating non-stationary spatial GMRFs that allow for physical interpretability of the parameters, although there are several remaining challenges that would need to be solved before these models can be put to general practical use.

preprint 2013 arXiv

A toolbox for fitting complex spatial point process models using integrated nested Laplace approximation (INLA)

This paper develops methodology that provides a toolbox for routinely fitting complex models to realistic spatial point pattern data. We consider models that are based on log-Gaussian Cox processes and include local interaction in these by considering constructed covariates. This enables us to use integrated nested Laplace approximation and to considerably speed up the inferential task. In addition, methods for model comparison and model assessment facilitate the modelling process. The performance of the approach is assessed in a simulation study. To demonstrate the versatility of the approach, models are fitted to two rather different examples, a large rainforest data set with covariates and a point pattern with multiple marks.

preprint 2013 arXiv

Bayesian computing with INLA: new features

The INLA approach for approximate Bayesian inference for latent Gaussian models has been shown to give fast and accurate estimates of posterior marginals and also to be a valuable tool in practice via the R-package R-INLA. In this paper we formalize new developments in the R-INLA package and show how these features greatly extend the scope of models that can be analyzed by this interface. We also discuss the current default method in R-INLA to approximate posterior marginals of the hyperparameters using only a modest number of evaluations of the joint posterior distribution of the hyperparameters, without any need for numerical integration.

preprint 2013 arXiv

Extending INLA to a class of near-Gaussian latent models

This work extends the Integrated Nested Laplace Approximation (INLA) method to latent models outside the scope of latent Gaussian models, where independent components of the latent field can have a near-Gaussian distribution. The proposed methodology is an essential component of a bigger project that aims to extend the R package INLA (R-INLA) in order to allow the user to add flexibility and challenge the Gaussian assumptions of some of the model components in a straightforward and intuitive way. Our approach is applied to two examples and the results are compared with those obtained by Markov Chain Monte Carlo (MCMC), showing similar accuracy with only a small fraction of computational time. Implementation of the proposed extension is available in the R-INLA package.

preprint 2013 arXiv

Multivariate Gaussian Random Fields Using Systems of Stochastic Partial Differential Equations

In this paper a new approach for constructing multivariate Gaussian random fields (GRFs) using systems of stochastic partial differential equations (SPDEs) has been introduced and applied to simulated data and real data. By solving a system of SPDEs, we can construct multivariate GRFs. On the theoretical side, the notorious requirement of non-negative definiteness for the covariance matrix of the GRF is satisfied since the constructed covariance matrices with this approach are automatically symmetric positive definite. Using the approximate stochastic weak solutions to the systems of SPDEs, multivariate GRFs are represented by multivariate Gaussian Markov random fields (GMRFs) with sparse precision matrices. Therefore, on the computational side, the sparse structures make it possible to use numerical algorithms for sparse matrices to do fast sampling from the random fields and statistical inference. As a result, the big-n problem can also be partially resolved for these models. These models outperform existing multivariate GRF models on a commonly used real dataset.

preprint2013arXiv

Multivariate Gaussian Random Fields with Oscillating Covariance Functions using Systems of Stochastic Partial Differential Equations

In this paper we propose a new approach for constructing multivariate Gaussian random fields (GRFs) with oscillating covariance functions through systems of stochastic partial differential equations (SPDEs). We discuss how to build systems of SPDEs that introduce oscillation characteristics in the covariance functions of the multivariate GRFs. By choosing different parametrizations of the equations, some GRFs can be made with oscillating covariance functions while other fields can have Matérn covariance functions or close to Matérn covariance functions. The multivariate GRFs constructed by solving the systems of SPDEs automatically fulfill the hard requirement of nonnegative definiteness for the covariance functions. The approximate weak solutions to the systems of SPDEs are used to represent the multivariate GRFs by multivariate Gaussian Markov random fields (GMRFs). Since the multivariate GMRFs have sparse precision matrices (inverses of the covariance matrices), numerical algorithms for sparse matrices can be applied to the precision matrices for sampling and inference. Thus, from a computational point of view, the big-n problem can be partially solved with these types of models. Another advantage of the method is that the oscillation in the covariance function can be controlled directly by the parameters in the system of SPDEs. We show how to use this proposed approach with simulated data and real data examples.

preprint2013arXiv

Non-stationary Spatial Modelling with Applications to Spatial Prediction of Precipitation

A non-stationary spatial Gaussian random field (GRF) is described as the solution of an inhomogeneous stochastic partial differential equation (SPDE), where the covariance structure of the GRF is controlled by the coefficients in the SPDE. This allows for a flexible way to vary the covariance structure, where intuition about the resulting structure can be gained from the local behaviour of the differential equation. Additionally, computations can be done with computationally convenient Gaussian Markov random fields which approximate the true GRFs. The model is applied to a dataset of annual precipitation in the conterminous US. The non-stationary model performs better than a stationary model measured with both CRPS and the logarithmic scoring rule.
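For readers unfamiliar with the two scores named in the abstract, here is a minimal sketch of how they can be evaluated for a single observation. It assumes a Gaussian predictive distribution N(mu, sigma^2), for which the CRPS has a well-known closed form; the abstract does not state which predictive distributions were scored, and the function names are hypothetical. Both scores are negatively oriented, so lower is better.

```python
import math


def _std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def crps_gaussian(y, mu, sigma):
    """Closed-form CRPS of the predictive N(mu, sigma^2) at observation y."""
    z = (y - mu) / sigma
    return sigma * (
        z * (2.0 * _std_normal_cdf(z) - 1.0)
        + 2.0 * _std_normal_pdf(z)
        - 1.0 / math.sqrt(math.pi)
    )


def log_score_gaussian(y, mu, sigma):
    """Logarithmic score: negative log predictive density at y."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2.0 * sigma ** 2)
```

In a model comparison of this kind, the scores would typically be averaged over held-out locations for each model, and the model with the lower average is preferred.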

preprint2013arXiv

Specifying Gaussian Markov Random Fields with Incomplete Orthogonal Factorization using Givens Rotations

In this paper an approach for finding a sparse incomplete Cholesky factor through an incomplete orthogonal factorization with Givens rotations is discussed and applied to Gaussian Markov random fields (GMRFs). The incomplete Cholesky factor obtained from the incomplete orthogonal factorization is usually sparser than the commonly used Cholesky factor obtained through the standard Cholesky factorization. On the computational side, this approach can provide a sparser Cholesky factor, which gives a computationally more efficient representation of GMRFs. On the theoretical side, this approach is stable and robust and always returns a sparse Cholesky factor. Since this approach applies both to square matrices and to rectangular matrices, it works well not only on precision matrices for GMRFs but also when the GMRFs are conditioned on a subset of the variables or on observed data. Some common structures for precision matrices are tested in order to illustrate the usefulness of the approach. One drawback to this approach is that the incomplete orthogonal factorization is usually slower than the standard Cholesky factorization implemented in standard libraries, so building the sparse Cholesky factor can currently be slower.
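The link between an orthogonal factorization and a Cholesky factor can be sketched as follows: if Givens rotations reduce a matrix A to an upper-triangular R, then R^T R = A^T A, so R is a Cholesky factor of A^T A (up to row signs). The code below is the complete, dense variant of that idea, not the paper's incomplete, sparsity-preserving algorithm, which additionally drops fill-in; the function name is hypothetical.

```python
import math


def givens_r_factor(A):
    """Upper-triangular R with R^T R = A^T A, computed by Givens rotations.

    A is a dense m-by-n list of rows with m >= n. Each rotation mixes two
    adjacent rows so that one entry below the diagonal becomes zero.
    """
    R = [row[:] for row in A]
    m, n = len(R), len(R[0])
    for j in range(n):
        for i in range(m - 1, j, -1):          # work up column j from the bottom
            a, b = R[i - 1][j], R[i][j]
            if b == 0.0:
                continue
            r = math.hypot(a, b)
            c, s = a / r, b / r                # rotation maps (a, b) to (r, 0)
            for k in range(j, n):
                top, bot = R[i - 1][k], R[i][k]
                R[i - 1][k] = c * top + s * bot
                R[i][k] = -s * top + c * bot
            R[i][j] = 0.0                      # remove rounding residue
    return [row[:] for row in R[:n]]
```

Because rotations act on A directly, the same routine applies to rectangular matrices, which is the property the abstract highlights for conditioned GMRFs.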

preprint2012arXiv

Bayesian Adaptive Smoothing Spline using Stochastic Differential Equations

The smoothing spline is one of the most popular curve-fitting methods, partly because of empirical evidence supporting its effectiveness and partly because of its elegant mathematical formulation. However, there are two obstacles that restrict the use of smoothing splines in practical statistical work. Firstly, the method becomes computationally prohibitive for large data sets because the number of basis functions roughly equals the sample size. Secondly, its global smoothing parameter can only provide a constant amount of smoothing, which often results in poor performance when estimating inhomogeneous functions. In this work, we introduce a class of adaptive smoothing spline models that is derived by solving certain stochastic differential equations with finite element methods. The solution extends the smoothing parameter to a continuous data-driven function, which is able to capture changes in the smoothness of the underlying process. The new model is Markovian, which makes Bayesian computation fast. A simulation study and a real data example are presented to demonstrate the effectiveness of our method.

preprint2012arXiv

Estimation and extrapolation of time trends in registry data---Borrowing strength from related populations

To analyze and project age-specific mortality or morbidity rates, age-period-cohort (APC) models are very popular. Bayesian approaches facilitate estimation and improve predictions by assigning smoothing priors to age, period and cohort effects. Adjustments for overdispersion are straightforward using additional random effects. When rates are further stratified, for example, by countries, multivariate APC models can be used, where differences of stratum-specific effects are interpretable as log relative risks. Here, we incorporate correlated stratum-specific smoothing priors and correlated overdispersion parameters into the multivariate APC model, and use Markov chain Monte Carlo and integrated nested Laplace approximations for inference. Compared to a model without correlation, the new approach may lead to more precise relative risk estimates, as shown in an application to chronic obstructive pulmonary disease mortality in three regions of England and Wales. Furthermore, the imputation of missing data for one particular stratum may be improved, since the new approach takes advantage of the remaining strata if the corresponding observations are available there. This is shown in an application to female mortality in Denmark, Sweden and Norway from the 20th century, where we treat for each country in turn either the first or second half of the observations as missing and then impute the omitted data. The projections are compared to those obtained from a univariate APC model and an extended Lee–Carter demographic forecasting approach using the proper Dawid–Sebastiani scoring rule.
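The Dawid–Sebastiani score mentioned above depends on a predictive distribution only through its mean and standard deviation, which makes it convenient for comparing projections from different model classes. A minimal sketch, assuming the commonly cited form ((y - mu) / sigma)^2 + 2 log(sigma) and a hypothetical function name; the abstract itself does not spell out the formula or how scores were aggregated.

```python
import math


def dawid_sebastiani_score(y, mu, sigma):
    """Dawid-Sebastiani score for observation y given predictive mean and sd.

    Negatively oriented: smaller values indicate a better prediction.
    """
    return ((y - mu) / sigma) ** 2 + 2.0 * math.log(sigma)


def mean_dss(observations, means, sds):
    """Average the score over a set of predictions (lists of equal length)."""
    scores = [dawid_sebastiani_score(y, mu, sd)
              for y, mu, sd in zip(observations, means, sds)]
    return sum(scores) / len(scores)
```

The second term penalises overly wide predictive distributions, while the first penalises observations far from the predictive mean relative to the stated uncertainty.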

preprint2011arXiv

Approximate simulation-free Bayesian inference for multiple changepoint models with dependence within segments

This paper proposes approaches for the analysis of multiple changepoint models when dependency in the data is modelled through a hierarchical Gaussian Markov random field. Integrated nested Laplace approximations are used to approximate data quantities, and an approximate filtering recursions approach is proposed for savings in computational cost when detecting changepoints. All of these methods are simulation free. Analysis of real data demonstrates the usefulness of the approach in general. The new models, which allow for data dependence, are compared with conventional models where data within segments are assumed independent.

preprint2011arXiv

Fast approximate inference with INLA: the past, the present and the future

Latent Gaussian models are an extremely popular, flexible class of models. Bayesian inference for these models is, however, tricky and time consuming. Recently, Rue, Martino and Chopin introduced the Integrated Nested Laplace Approximation (INLA) method for deterministic fast approximate inference. In this paper, we outline the INLA approximation and its related R package. We will discuss the newer components of the R-INLA program as well as some possible extensions.

preprint2011arXiv

Think continuous: Markovian Gaussian models in spatial statistics

Gaussian Markov random fields (GMRFs) are frequently used as computationally efficient models in spatial statistics. Unfortunately, it has traditionally been difficult to link GMRFs with the more conventional Gaussian random field models as the Markov property is difficult to deploy in continuous space. Following the pioneering work of Lindgren et al. (2011), we expound on the link between Markovian Gaussian random fields and GMRFs. In particular, we discuss the theoretical and practical aspects of fast computation with continuously specified Markovian Gaussian random fields, as well as the advantages they offer in terms of clear, parsimonious and interpretable models of anisotropy and non-stationarity.

34 published items

An Extended Simplified Laplace strategy for Approximate Bayesian inference of Latent Gaussian Models using R-INLA

Joint Modeling and Prediction of Massive Spatio-Temporal Wildfire Count and Burnt Area Data with the INLA-SPDE Approach

Joint Quantile Disease Mapping with Application to Malaria and G6PD Deficiency

Parallelized integrated nested Laplace approximations for fast Bayesian inference

Practical strategies for GEV-based regression models for extremes

Variance partitioning in spatio-temporal disease mapping models

Fast Bayesian inference of Block Nearest Neighbor Gaussian process for large data

Importance Sampling with the Integrated Nested Laplace Approximation

Efficient Quantile Tracking Using an Oracle

Estimating Tukey Depth Using Incremental Quantile Estimators

An intuitive Bayesian spatial model for disease mapping that accounts for scaling

Bayesian Computing with INLA: A Review

Fractional Gaussian noise: Prior specification and model comparison

Penalised complexity priors for stationary autoregressive processes

Bayesian bivariate meta-analysis of diagnostic test studies with interpretable priors

Beyond the Valley of the Covariance Function

Does non-stationary spatial data always require non-stationary random fields?

Going off grid: Computationally efficient inference for log-Gaussian Cox processes

Improving the INLA approach for approximate Bayesian inference for latent Gaussian models

Penalising model component complexity: A principled, practical approach to constructing priors

Spatial Modelling of Temperature and Humidity using Systems of Stochastic Partial Differential Equations

Exploring a New Class of Non-stationary Spatial Gaussian Random Fields with Varying Local Anisotropy

A toolbox for fitting complex spatial point process models using integrated nested Laplace approximation (INLA)

Bayesian computing with INLA: new features

Extending INLA to a class of near-Gaussian latent models

Multivariate Gaussian Random Fields Using Systems of Stochastic Partial Differential Equations

Multivariate Gaussian Random Fields with Oscillating Covariance Functions using Systems of Stochastic Partial Differential Equations

Non-stationary Spatial Modelling with Applications to Spatial Prediction of Precipitation

Specifying Gaussian Markov Random Fields with Incomplete Orthogonal Factorization using Givens Rotations

Bayesian Adaptive Smoothing Spline using Stochastic Differential Equations

Estimation and extrapolation of time trends in registry data---Borrowing strength from related populations

Approximate simulation-free Bayesian inference for multiple changepoint models with dependence within segments

Fast approximate inference with INLA: the past, the present and the future

Think continuous: Markovian Gaussian models in spatial statistics