Researcher profile

Scott H. Holan

Scott H. Holan contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
8 works
0 followers
3 topics
4 close collaborators

Published work

8 published items

Preprint | 2026 | arXiv

Echo State Networks for Spatio-Temporal Area-Level Data

Spatio-temporal area-level datasets play a critical role in official statistics, providing valuable insights for policy-making and regional planning. Accurate modeling and forecasting of these datasets can be extremely useful for policymakers to develop informed strategies for future planning. Echo State Networks (ESNs) are efficient methods for capturing nonlinear temporal dynamics and generating forecasts. However, ESNs lack a direct mechanism to account for the neighborhood structure inherent in area-level data. Ignoring these spatial relationships can significantly compromise the accuracy and utility of forecasts. In this paper, we incorporate approximate graph spectral filters at the input stage of the ESN, thereby improving forecast accuracy while preserving the model's computational efficiency during training. We demonstrate the effectiveness of our approach using Eurostat's tourism occupancy dataset and show how it can support more informed decision-making in policy and planning contexts.
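As a rough illustration of the idea in this abstract, the sketch below smooths each time step's area-level input over the neighborhood graph before it enters the reservoir. Everything here is an assumption made for illustration, not the paper's method: the polynomial filter in a row-normalized adjacency matrix, the coefficients `theta`, the uniform random weights, and all function names are hypothetical. The trained readout (typically a ridge regression on the collected states) is omitted.

```python
import math
import random


def row_normalize(adj):
    """Scale each adjacency row to sum to 1 (rows of isolated areas stay zero)."""
    out = []
    for row in adj:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out


def matvec(m, v):
    """Multiply a list-of-lists matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def graph_filter(adj_norm, x, theta):
    """Polynomial graph filter: theta[0]*x + theta[1]*A x + theta[2]*A^2 x + ..."""
    out = [0.0] * len(x)
    power = list(x)  # starts at A^0 x
    for coef in theta:
        out = [o + coef * p for o, p in zip(out, power)]
        power = matvec(adj_norm, power)
    return out


def make_reservoir(n_in, n_res, seed=0):
    """Fixed random weights; these are never trained."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_res)]
    # Each row of |W| sums to less than 0.9, so the spectral radius is below 1.
    bound = 0.9 / n_res
    w_res = [[rng.uniform(-bound, bound) for _ in range(n_res)] for _ in range(n_res)]
    return w_in, w_res


def run_reservoir(series, adj, theta, n_res=20, seed=0):
    """Return the reservoir state after each time step of `series`."""
    adj_norm = row_normalize(adj)
    w_in, w_res = make_reservoir(len(adj), n_res, seed=seed)
    h = [0.0] * n_res
    states = []
    for x in series:
        x_f = graph_filter(adj_norm, x, theta)  # spatial smoothing at the input stage
        pre = [a + b for a, b in zip(matvec(w_in, x_f), matvec(w_res, h))]
        h = [math.tanh(p) for p in pre]
        states.append(h)
    return states
```

Because the filter is applied once per time step and the reservoir weights stay fixed, only the linear readout would need fitting, which is where the training efficiency of an ESN comes from.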

Preprint | 2022 | arXiv

A Look into the Problem of Preferential Sampling from the Lens of Survey Statistics

An evolving problem in the field of spatial and ecological statistics is that of preferential sampling, where biases may be present due to a relationship between sample data locations and a response of interest. This field of research bears a striking resemblance to the longstanding problem of informative sampling within survey methodology, although with some important distinctions. With the goal of promoting collaborative effort within and between these two problem domains, we make comparisons and contrasts between the two problem statements. Specifically, we review many of the solutions available to address each of these problems, noting the important differences in modeling techniques. Additionally, we construct a series of simulation studies to examine some of the methods available for preferential sampling, as well as a comparison analyzing heavy metal biomonitoring data.

Preprint | 2022 | arXiv

Bayesian Circular Lattice Filters for Computationally Efficient Estimation of Multivariate Time-Varying Autoregressive Models

Nonstationary time series data exist in various scientific disciplines, including environmental science, biology, signal processing, and econometrics, among others. Many Bayesian models have been developed to handle nonstationary time series. The time-varying vector autoregressive (TV-VAR) model is a well-established model for multivariate nonstationary time series. Nevertheless, in most cases, the large number of parameters in the model results in a high computational burden, ultimately limiting its usage. This paper proposes a computationally efficient multivariate Bayesian Circular Lattice Filter to extend the usage of the TV-VAR model to a broader class of high-dimensional problems. Our fully Bayesian framework allows both the autoregressive (AR) coefficients and innovation covariance to vary over time. Our estimation method is based on the Bayesian lattice filter (BLF), which is extremely computationally efficient and stable in univariate cases. To illustrate the effectiveness of our approach, we conduct a comprehensive comparison with other competing methods through simulation studies and find that, in most cases, our approach performs better in terms of average squared error between the estimated and true time-varying spectral density. Finally, we demonstrate our methodology through applications to quarterly Gross Domestic Product (GDP) data and Northern California wind data.

Preprint | 2020 | arXiv

Computationally Efficient Bayesian Unit-Level Models for Non-Gaussian Data Under Informative Sampling

Statistical estimates from survey samples have traditionally been obtained via design-based estimators. In many cases, these estimators tend to work well for quantities such as population totals or means, but can fall short as sample sizes become small. In today's "information age," there is a strong demand for more granular estimates. To meet this demand, using a Bayesian pseudo-likelihood, we propose a computationally efficient unit-level modeling approach for non-Gaussian data collected under informative sampling designs. Specifically, we focus on binary and multinomial data. Our approach is both multivariate and multiscale, incorporating spatial dependence at the area level. We illustrate our approach through an empirical simulation study and through a motivating application to health insurance estimates using the American Community Survey.

Preprint | 2020 | arXiv

Computationally Efficient Deep Bayesian Unit-Level Modeling of Survey Data under Informative Sampling for Small Area Estimation

The topic of deep learning has seen a surge of interest in recent years both within and outside of the field of Statistics. Deep models leverage both nonlinearity and interaction effects to provide superior predictions in many cases when compared to linear or generalized linear models. However, one of the main challenges with deep modeling approaches is quantification of uncertainty. The use of random weight models, such as the popularized "Extreme Learning Machine," offers a potential solution in this regard. In addition to uncertainty quantification, these models are extremely computationally efficient as they do not require optimization through stochastic gradient descent, which is what is typically done for deep learning. We show how the use of random weights in a deep model can fit into a likelihood-based framework to allow for uncertainty quantification of the model parameters and any desired estimates. Furthermore, we show how this approach can be used to account for informative sampling of survey data through the use of a pseudo-likelihood. We illustrate the effectiveness of this methodology through simulation and with a real survey data application involving American National Election Studies data.

Preprint | 2020 | arXiv

Nonlinear Time Series Classification Using Bispectrum-based Deep Convolutional Neural Networks

Time series classification using novel techniques has experienced a recent resurgence and growing interest from statisticians, subject-domain scientists, and decision makers in business and industry. This is primarily due to the ever-increasing amount of big and complex data produced as a result of technological advances. A motivating example is that of Google Trends data, which exhibit highly nonlinear behavior. Although a rich literature exists for addressing this problem, existing approaches mostly rely on first- and second-order properties of the time series, since they typically assume linearity of the underlying process. Often, these are inadequate for effective classification of nonlinear time series data such as Google Trends data. Given these methodological deficiencies and the abundance of nonlinear time series that persist among real-world phenomena, we introduce an approach that merges higher order spectral analysis (HOSA) with deep convolutional neural networks (CNNs) for classifying time series. The effectiveness of our approach is illustrated using simulated data and two motivating industry examples that involve Google Trends data and electronic device energy consumption data.
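To make "higher order spectral analysis" concrete, the sketch below computes a direct (segment-averaged) bispectrum estimate, the third-order analogue of the power spectrum. It is a generic textbook-style estimator, not the authors' pipeline: the segment length, the lack of windowing, and the function names are assumptions chosen for brevity. The returned 2-D magnitude array is the sort of image-like input a CNN could be trained on.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform, adequate for short segments."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def bispectrum(series, seg_len):
    """Average X(k1) * X(k2) * conj(X(k1 + k2)) over non-overlapping segments.

    Returns a (seg_len // 2) x (seg_len // 2) grid of magnitudes.
    """
    n_seg = len(series) // seg_len
    half = seg_len // 2
    acc = [[0j] * half for _ in range(half)]
    for s in range(n_seg):
        seg = series[s * seg_len:(s + 1) * seg_len]
        mean = sum(seg) / seg_len
        spec = dft([v - mean for v in seg])  # remove the segment mean first
        for k1 in range(half):
            for k2 in range(half):
                acc[k1][k2] += spec[k1] * spec[k2] * spec[k1 + k2].conjugate()
    return [[abs(v) / n_seg for v in row] for row in acc]
```

A signal whose components at frequencies 2, 3, and 2 + 3 = 5 are phase-coupled produces a peak at grid position (2, 3). Second-order summaries such as the power spectrum cannot reveal that coupling.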

Preprint | 2020 | arXiv

Unit Level Modeling of Survey Data for Small Area Estimation Under Informative Sampling: A Comprehensive Overview with Extensions

Model-based small area estimation is frequently used in conjunction with survey data in order to establish estimates for under-sampled or unsampled geographies. These models can be specified at either the area level or the unit level, but unit-level models often offer potential advantages such as more precise estimates and easy spatial aggregation. Nevertheless, relative to area-level models, literature on unit-level models is less prevalent. In modeling small areas at the unit level, challenges often arise as a consequence of the informative sampling mechanism used to collect the survey data. This paper provides a comprehensive methodological review for unit-level models under informative sampling, with an emphasis on Bayesian approaches. To provide insight into the differences between methods, we conduct a simulation study that compares several of the described approaches. In addition, the methods used for simulation are further illustrated through an application to the American Community Survey. Finally, we present several extensions and areas for future research.

Preprint | 2019 | arXiv

Conjugate Bayesian Unit-level Modeling of Count Data Under Informative Sampling Designs

Unit-level models for survey data offer many advantages over their area-level counterparts, such as potential for more precise estimates and a natural benchmarking property. However, two main challenges occur in this context: accounting for an informative survey design and handling non-Gaussian data types. The pseudo-likelihood approach is one solution to the former, and conjugate multivariate distribution theory offers a solution to the latter. By combining these approaches, we attain a unit-level model for count data that accounts for informative sampling designs and includes fully Bayesian model uncertainty propagation. Importantly, conjugate full conditional distributions hold under the pseudo-likelihood, yielding an extremely computationally efficient approach. Our method is illustrated via an empirical simulation study using count data from the American Community Survey public-use microdata sample.
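The claim that conjugacy survives the pseudo-likelihood can be seen in a deliberately simplified toy case. The sketch below assumes a single Poisson rate shared by all units with a Gamma prior, which is far simpler than the model the abstract describes. The function names, the prior defaults, and the rescaling of weights to sum to the sample size are illustrative assumptions. Raising each unit's Poisson likelihood to the power of its survey weight still leaves a Gamma kernel, so the update stays in closed form.

```python
def scale_weights(weights):
    """Rescale survey weights to sum to the sample size (a common convention)."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]


def pseudo_posterior_gamma(counts, weights, prior_shape=1.0, prior_rate=1.0):
    """Gamma pseudo-posterior (shape, rate) for a common Poisson rate.

    Pseudo-likelihood: prod_i Poisson(y_i | lam) ** w_i, which is proportional to
    lam ** sum(w_i * y_i) * exp(-lam * sum(w_i)). A Gamma(a, b) prior therefore
    updates to Gamma(a + sum(w_i * y_i), b + sum(w_i)).
    """
    w = scale_weights(weights)
    shape = prior_shape + sum(wi * yi for wi, yi in zip(w, counts))
    rate = prior_rate + sum(w)
    return shape, rate


# Usage: the third unit represents twice as many population units as the others.
shape, rate = pseudo_posterior_gamma([0, 2, 4], [1.0, 1.0, 2.0])
posterior_mean = shape / rate
```

With equal weights the update reduces to the ordinary Gamma-Poisson posterior. Unequal weights shift the estimate toward units that stand in for more of the population.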