Source author record

Andrew Parnell

Andrew Parnell appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Applications, Methodology, Computation, eess.SP, Information Retrieval, Machine Learning, physics.ao-ph

Catalog footprint

What is connected

8 works

7 topics

4 close collaborators

preprint, 2026, arXiv

Spatio-temporal analysis of extreme winter temperatures in Ireland

We analyse extreme daily minimum temperatures in winter months over the island of Ireland from 1950-2022. We model the marginal distributions of extreme winter minima using a generalised Pareto distribution (GPD), capturing temporal and spatial non-stationarities in the parameters of the GPD. We investigate two independent temporal non-stationarities in extreme winter minima. We model the long-term trend in magnitude of extreme winter minima as well as short-term, large fluctuations in magnitude caused by anomalous behaviour of the jet stream. We measure magnitudes of spatial events with a carefully chosen risk function and fit an r-Pareto process to extreme events exceeding a high-risk threshold. Our analysis is based on synoptic data observations courtesy of Met Éireann and the Met Office. We show that the frequency of extreme cold winter events is decreasing over the study period. The magnitude of extreme winter events is also decreasing, indicating that winters are warming, and apparently warming at a faster rate than extreme summer temperatures. We also show that extremely cold winter temperatures are warming at a faster rate than non-extreme winter temperatures. We find that a climate model output previously shown to be informative as a covariate for modelling extremely warm summer temperatures is less effective as a covariate for extremely cold winter temperatures. However, we show that the climate model is useful for informing a non-extreme temperature model.
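
As a rough illustration of the marginal model described in this abstract, the Python sketch below evaluates the survival and quantile functions of a generalised Pareto distribution for threshold excesses. It is not the authors' code. The function names, the log-linear form of the scale parameter and the convention of negating temperatures so that cold extremes fall in the upper tail are all assumptions made for this example.

```python
import math

def gpd_survival(y, sigma, xi):
    """P(Y > y) for a GPD excess Y with scale sigma > 0 and shape xi."""
    if y <= 0:
        return 1.0
    if abs(xi) < 1e-12:              # xi -> 0 limit is the exponential tail
        return math.exp(-y / sigma)
    z = 1.0 + xi * y / sigma
    if z <= 0:                       # beyond the finite upper endpoint (xi < 0)
        return 0.0
    return z ** (-1.0 / xi)

def gpd_quantile(p, sigma, xi):
    """Excess y such that P(Y <= y) = p, for 0 <= p < 1."""
    if abs(xi) < 1e-12:
        return -sigma * math.log(1.0 - p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)

def scale_at(t, b0, b1):
    """Hypothetical non-stationary scale: log(sigma) linear in a covariate t."""
    return math.exp(b0 + b1 * t)

# Illustrative values only (not estimates from the paper).
sigma_t = scale_at(t=1.5, b0=0.5, b1=-0.1)
excess_90 = gpd_quantile(0.9, sigma_t, xi=-0.2)
tail_prob = gpd_survival(excess_90, sigma_t, xi=-0.2)   # close to 0.1 by construction
```

Letting the scale depend on a covariate such as time is one simple way to express a temporal non-stationarity in the GPD parameters. The abstract does not say which parameters vary or in what functional form.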

preprint, 2022, arXiv

A Bayesian Hierarchical Time Series Model for Reconstructing Hydroclimate from Multiple Proxies

We propose a Bayesian hierarchical model which produces probabilistic reconstructions of hydroclimatic variability in Queensland, Australia. The model provides a standardised approach to hydroclimate reconstruction using multiple palaeoclimate proxy records derived from natural archives such as speleothems, ice cores and tree rings. The method combines time-series modelling with inverse prediction to quantify the relationships between a given hydroclimate index and relevant proxies over an instrumental period and subsequently reconstruct the hydroclimate back through time. We present case studies for Brisbane and Fitzroy catchments focusing on two hydroclimate indices, the Rainfall Index (RFI) and the Standardised Precipitation-Evapotranspiration Index (SPEI). The probabilistic nature of the reconstructions allows us to estimate the probability that a hydroclimate index in any reconstruction year was lower (higher) than the minimum (maximum) value observed over the instrumental period. In Brisbane, the RFI is unlikely (probabilities < 20%) to have exhibited extremes beyond the minimum/maximum values observed between 1889 and 2017. However, in Fitzroy there are several years during the reconstruction period where the RFI is likely (> 50% probability) to have exhibited behaviour beyond the minimum/maximum of what has been observed. For SPEI, the probability of observing such extremes before the start of the instrumental period in 1889 does not exceed 50% in any reconstruction year in Brisbane or Fitzroy.
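
The exceedance probabilities mentioned in this abstract can be approximated from posterior draws by simple counting. The sketch below assumes that a fitted model has already produced a list of posterior samples of the index for each reconstruction year. The function name, the data layout and the numbers are hypothetical and purely illustrative.

```python
def exceedance_probs(draws_by_year, obs_min, obs_max):
    """For each year, estimate P(index < obs_min) and P(index > obs_max)
    as the fraction of posterior draws beyond the instrumental extremes."""
    out = {}
    for year, draws in draws_by_year.items():
        n = len(draws)
        below = sum(d < obs_min for d in draws) / n
        above = sum(d > obs_max for d in draws) / n
        out[year] = (below, above)
    return out

# Made-up posterior draws for two reconstruction years (not real data).
draws = {
    1700: [-2.5, -1.0, 0.0, 0.5],
    1701: [0.1, 0.2, 3.0, 3.5],
}
probs = exceedance_probs(draws, obs_min=-2.0, obs_max=2.5)
# probs[1700] == (0.25, 0.0) and probs[1701] == (0.0, 0.5)
```

In practice one would use thousands of draws per year, so that the Monte Carlo error in each estimated probability is small relative to thresholds such as 20% or 50%.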

preprint, 2022, arXiv

Bayesian Multi-Species N-Mixture Models for Unmarked Animal Communities

We propose an extension of the N-mixture model which allows for the estimation of both abundances of multiple species simultaneously and their inter-species correlations. We also propose further extensions to this multi-species N-mixture model, one of which permits us to examine data which has an excess of zero counts, and another which allows us to relax the assumption of closure inherent in N-mixture models through the incorporation of an AR term in the abundance. The inclusion of a multivariate normal distribution as prior on the random effect in the abundance facilitates the estimation of a matrix of inter-species correlations. Each model is also fitted to avian point data collected as part of the North American Breeding Bird Survey (NABBS) from 2010 to 2019. Results of simulation studies reveal that these models produce accurate estimates of abundance, inter-species correlations and detection probabilities at both small and large sample sizes, in scenarios with small, large and no zero inflation. Results of model-fitting to the North American Breeding Bird Survey data reveal an increase in Bald Eagle population size in southeastern Alaska in the decade examined. Our novel multi-species N-mixture model accounts for full communities, allowing us to examine abundances of every species present in a study area and, as these species do not exist in a vacuum, allowing us to estimate correlations between species' abundances. While previous multi-species abundance models have allowed for the estimation of abundance and detection probability, ours is the first to address the estimation of both positive and negative inter-species correlations, which allows us to begin to make inferences as to the effect that these species' abundances have on one another. Our modelling approach provides a method of quantifying the strength of association between species' population sizes, and is of practical use to population and conservation ecologists.
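
For readers unfamiliar with the starting point of this work, the sketch below computes the marginal likelihood of the basic single-species N-mixture model at one site: latent abundance N is Poisson with mean lam, and each repeated count is Binomial(N, p). This is only the standard building block, not the multi-species, zero-inflated or AR extensions proposed in the paper. The function names and the truncation bound n_max are assumptions of this example.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of n, computed on the log scale for stability."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def binom_pmf(y, n, p):
    """Binomial probability of y detections out of n individuals."""
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)

def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of repeated counts at one closed site,
    summing the latent abundance N from max(counts) up to n_max."""
    total = 0.0
    for n in range(max(counts), n_max + 1):
        lik = poisson_pmf(n, lam)
        for y in counts:
            lik *= binom_pmf(y, n, p)
        total += lik
    return total

# Three repeat visits to one site, with illustrative parameter values.
lik = site_likelihood([3, 2, 4], lam=5.0, p=0.4)
```

A quick sanity check: with a single visit, the marginal distribution of the count is Poisson with mean lam * p, so site_likelihood([3], 5.0, 0.4) should agree with poisson_pmf(3, 2.0) up to truncation error.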

preprint, 2022, arXiv

Vector Time Series Modelling of Turbidity in Dublin Bay

Turbidity is commonly monitored as an important water quality index. Human activities, such as dredging and dumping operations, can disrupt turbidity levels and should be monitored and analyzed for possible effects. In this paper, we model the variations of turbidity in Dublin Bay over space and time to investigate the effects of dumping and dredging while controlling for the effect of wind speed as a common atmospheric effect. We develop a novel Vector Auto-Regressive Conditional Heteroskedasticity (VARCH) approach to modelling the dynamical behaviour of turbidity over different locations and at different water depths. We use daily values of turbidity during the years 2017-2018 to fit the model. We show that the results of our fitted model are in line with the observed data and that the uncertainties, measured through Bayesian credible intervals, are well calibrated. Furthermore, we show that the daily effects of dredging and dumping on turbidity are negligible in comparison to that of wind speed.

preprint, 2022, arXiv

Visualizations for Bayesian Additive Regression Trees

Tree-based regression and classification has become a standard tool in modern data science. Bayesian Additive Regression Trees (BART) has in particular gained wide popularity due to its flexibility in dealing with interactions and non-linear effects. BART is a Bayesian tree-based machine learning method that can be applied to both regression and classification problems and yields competitive or superior results when compared to other predictive models. As a Bayesian model, BART allows the practitioner to explore the uncertainty around predictions through the posterior distribution. In this paper, we present new visualization techniques for exploring BART models. We construct conventional plots to analyze a model's performance and stability as well as create new tree-based plots to analyze variable importance, interaction, and tree structure. We employ Value Suppressing Uncertainty Palettes (VSUP) to construct heatmaps that display variable importance and interactions jointly, using a color scale to represent posterior uncertainty. Our new visualizations are designed to work with the most popular BART R packages available, namely BART, dbarts, and bartMachine. Our approach is implemented in the R package bartMan (BART Model ANalysis).

preprint, 2020, arXiv

Generalizing Gain Penalization for Feature Selection in Tree-based Models

We develop a new approach for feature selection via gain penalization in tree-based models. First, we show that previous methods do not perform sufficient regularization and often exhibit sub-optimal out-of-sample performance, especially when correlated features are present. Instead, we develop a new gain penalization idea that exhibits a general local-global regularization for tree-based models. The new method allows for more flexibility in the choice of feature-specific importance weights. We validate our method on both simulated and real data and implement it as an extension of the popular R package ranger.

preprint, 2020, arXiv

Real-Time Anomaly Detection for Advanced Manufacturing: Improving on Twitter's State of the Art

The detection of anomalies in real time is paramount to maintain performance and efficiency across a wide range of applications including web services and smart manufacturing. This paper presents a novel algorithm to detect anomalies in streaming time series data via statistical learning. We adapt the generalised extreme studentised deviate test [1] to streaming data by using a sliding window approach. This is made computationally feasible by recursive updates of the Grubbs test statistic [2]. Moreover, a priority queue [3] is employed to reduce memory requirements, where subsets of the required data streaming window are maintained in the algorithm rather than the full list. Our method is statistically principled. It is suitable for streaming data and it outperforms the AnomalyDetection software package, recently released by Twitter Inc. (Twitter) [4] and used by multiple teams at Twitter as their state of the art on a daily basis [5]. The methodology is demonstrated using an example of unlabelled data from the Twitter AnomalyDetection GitHub repository and using a real manufacturing example with labelled anomalies.
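
The sliding-window idea in this abstract can be sketched as follows: keep running sums over a fixed-length window so that the mean and variance are updated recursively, then form the Grubbs statistic G = max|x - mean| / sd for the current window. This is a simplified illustration, not the paper's algorithm. The class name is hypothetical, the window minimum and maximum are found by a plain scan rather than the priority queue the authors describe, and the comparison with critical values that a full generalised ESD test requires is left out.

```python
from collections import deque
import math

class SlidingGrubbs:
    """Grubbs statistic over the most recent `window` observations."""

    def __init__(self, window):
        if window < 3:
            raise ValueError("window must hold at least 3 points")
        self.window = window
        self.values = deque()
        self.total = 0.0       # running sum of the window
        self.total_sq = 0.0    # running sum of squares of the window

    def update(self, x):
        """Add x, drop the oldest point if needed, return G (or None)."""
        self.values.append(x)
        self.total += x
        self.total_sq += x * x
        if len(self.values) > self.window:
            old = self.values.popleft()
            self.total -= old
            self.total_sq -= old * old
        n = len(self.values)
        if n < 3:
            return None        # too few points for a meaningful statistic
        mean = self.total / n
        var = max((self.total_sq - n * mean * mean) / (n - 1), 0.0)
        if var == 0.0:
            return 0.0         # constant window: no point deviates
        dev = max(max(self.values) - mean, mean - min(self.values))
        return dev / math.sqrt(var)

detector = SlidingGrubbs(window=5)
stats = [detector.update(x) for x in [1.0, 2.0, 3.0, 4.0, 100.0]]
```

The running sum of squares is the simplest recursive update but can lose precision on long streams with large values. A Welford-style update is a common, more stable alternative.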

preprint, 2016, arXiv

Prediction of tool-wear in turning of medical grade cobalt chromium molybdenum alloy (ASTM F75) using non-parametric Bayesian models

We present a novel approach to estimating the effect of control parameters on tool wear rates and related changes in the three force components in turning of medical grade Co-Cr-Mo (ASTM F75) alloy. Co-Cr-Mo is known to be a difficult-to-cut material which, due to a combination of mechanical and physical properties, is used for the critical structural components of implantable medical prosthetics. We run a designed experiment which enables us to estimate tool wear from feed rate and cutting speed, and constrain them using a Bayesian hierarchical Gaussian Process model which enables prediction of tool wear rates for untried experimental settings. The predicted tool wear rates are non-linear and, using our models, we can identify experimental settings which optimise the life of the tool. This approach has potential in the future for real-time application of data analytics to machining processes.
