Source author record

Paul A. Smith

Paul A. Smith appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Topics: Applications, Methodology, Computation

Catalog footprint

What is connected

4 works

3 topics

4 close collaborators

Preprint, 2020, arXiv

Multiple system estimation using covariates having missing values and measurement error: estimating the size of the Māori population in New Zealand

We investigate use of two or more linked registers, or lists, for both population size estimation and to investigate the relationship between variables appearing on all or only some registers. This relationship is usually not fully known because some individuals appear in only some registers, and some are not in any register. These two problems have been solved simultaneously using the EM algorithm. We extend this approach to estimate the size of the indigenous Māori population in New Zealand, leading to several innovations: (1) the approach is extended to four registers (including the population census), where the reporting of Māori status differs between registers; (2) some individuals in one or more registers have missing ethnicity, and we adapt the approach to handle this additional missingness; (3) some registers cover subsets of the population by design. We discuss under which assumptions such structural undercoverage can be ignored and provide a general result; (4) we treat the Māori indicator in each register as a variable measured with error, and embed a latent class model in the multiple system estimation to estimate the population size of a latent variable, interpreted as the true Māori status. Finally, we discuss estimating the Māori population size from administrative data only. Supplementary materials for our article are available online.
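The idea behind multiple system estimation can be illustrated with its simplest special case, two linked lists. The sketch below is not the method of the preprint, which uses four registers, the EM algorithm and a latent class model. It assumes exactly two perfectly linked lists, independent inclusion in each list, and no missing values or measurement error. The function names and the example counts are hypothetical.

```python
def lincoln_petersen(n1, n2, m):
    """Two-list population size estimate: N_hat = n1 * n2 / m.

    n1, n2: number of individuals on list 1 and on list 2
    m:      number of individuals linked across both lists
    """
    if m <= 0:
        raise ValueError("need at least one record linked across both lists")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's small-sample variant of the two-list estimator."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts: 200 people on list 1, 150 on list 2, 60 on both.
n_hat = lincoln_petersen(200, 150, 60)
n_hat_chapman = chapman(200, 150, 60)
# Individuals estimated to be missing from both lists.
n_unseen = n_hat - (200 + 150 - 60)
```

With more than two lists the same logic is usually expressed through a log-linear model of the list-overlap counts, which is the setting the abstract describes.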

Preprint, 2020, arXiv

On estimating the size of overcoverage with the latent class model. A critique of the paper "Population Size Estimation Using Multiple Incomplete Lists with Overcoverage" by di Cecco, di Zio, Filipponi and Rocchetti (2018, JOS 34 557-572)

We read with interest the article by di Cecco et al. (2018), but have reservations about the usefulness of the latent class model specifically for estimating overcoverage. In particular, we question the interpretation of the parameters of the fitted latent class model.

Preprint, 2020, arXiv

Robust estimation for small domains in business surveys

Small area (or small domain) estimation is still rarely applied in business statistics, because of challenges arising from the skewness and variability of variables such as turnover. We examine a range of small area estimation methods as the basis for estimating the activity of industries within the retail sector in the Netherlands. We use tax register data and a sampling procedure which replicates the sampling for the retail sector of Statistics Netherlands' Structural Business Survey as a basis for investigating the properties of small area estimators. In particular, we consider the use of the EBLUP under a random effects model and variations of the EBLUP derived under (a) a random effects model that includes a complex specification for the level 1 variance and (b) a random effects model that is fitted by using the survey weights. Although accounting for the survey weights in estimation is important, the impact of influential data points remains the main challenge in this case. The paper further explores the use of outlier robust estimators in business surveys, in particular a robust version of the EBLUP, M-regression based synthetic estimators, and M-quantile small area estimators. The latter family of small area estimators includes robust projective (without and with survey weights) and robust predictive versions. M-quantile methods have the lowest empirical mean squared error and are substantially better than direct estimators, though there is an open question about how to choose the tuning constant for bias adjustment in practice. The paper makes a further contribution by exploring a doubly robust approach comprising the use of survey weights in conjunction with outlier robust methods in small area estimation.
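To show why outlier-robust estimation matters for skewed variables such as turnover, here is a generic Huber-type M-estimate of location in Python. It is a minimal sketch of the M-estimation principle only, not the robust EBLUP or the M-quantile small area estimators studied in the preprint. The function name, the default constant `c=1.345`, the MAD-based scale and the example values are assumptions chosen for illustration.

```python
import statistics


def huber_location(values, c=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted means.

    Observations within c robust standard deviations of the current
    estimate get weight 1; more distant ones are down-weighted.
    """
    values = list(values)
    mu = statistics.median(values)
    # Median absolute deviation, scaled to be consistent for normal data.
    mad = statistics.median([abs(v - mu) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        return mu  # no spread around the median; nothing to reweight
    for _ in range(max_iter):
        weights = []
        for v in values:
            r = abs(v - mu) / scale
            weights.append(1.0 if r <= c else c / r)
        new_mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Hypothetical turnover figures with one very large business.
turnover = [10, 11, 9, 10, 12, 500]
plain_mean = statistics.mean(turnover)
robust_mean = huber_location(turnover)
```

The plain mean is pulled far toward the single large value, while the Huber estimate stays close to the bulk of the data. The price is some bias when the large value is genuine, which is why the choice of tuning constant matters.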

Preprint, 2016, arXiv

Supplementary Material for "Should we sample a time series more frequently? Decision support via multirate spectrum estimation (with discussion)"

This technical report collects technical details and extended discussions related to the paper "Should we sample a time series more frequently? Decision support via multirate spectrum estimation (with discussion)". That paper introduces a model for estimating the log-spectral density of a stationary discrete time process given systematically missing data, and models the cost implications of changing the sampling rate.
