Researcher profile

Siu-Ming Tam

Siu-Ming Tam contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified
Verification L1
Unclaimed author
2 works
0 followers
1 topic
2 close collaborators

Published work

2 published items

Preprint, 2024, arXiv

A Calibrated Data-Driven Approach for Small Area Estimation using Big Data

Where the response variable in a big data set is consistent with the variable of interest for small area estimation, the big data by itself can provide the estimates for small areas. These estimates are often subject to the coverage and measurement error bias inherited from the big data. However, if a probability survey of the same variable of interest is available, the survey data can be used as a training data set to develop an algorithm to impute for the data missed by the big data and adjust for measurement errors. In this paper, we outline a methodology for such imputations based on a kNN algorithm calibrated to an asymptotically design-unbiased estimate of the national total and illustrate the use of a training data set to estimate the imputation bias and the fixed - asymptotic bootstrap to estimate the variance of the small area hybrid estimator. We illustrate the methodology of this paper using a public use data set and use it to compare the accuracy and precision of our hybrid estimator with the Fay-Herriot (FH) estimator. Finally, we also examine numerically the accuracy and precision of the FH estimator when the auxiliary variables used in the linking models are subject to under-coverage errors.
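The imputation-plus-calibration idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes covariates are available for the units missed by the big data, uses a plain unweighted kNN mean fitted on the survey (training) data, and applies a simple ratio adjustment so that the big data total plus the imputed total matches a given national benchmark. The function names `knn_impute` and `calibrated_area_totals` and all of their parameters are hypothetical.

```python
import numpy as np

def knn_impute(x_train, y_train, x_missing, k=3):
    """Impute y for each missing unit as the mean of its k nearest training units."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_missing = np.asarray(x_missing, dtype=float)
    # Squared Euclidean distance between every missing unit and every training unit.
    dist = ((x_missing[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    return y_train[nearest].mean(axis=1)

def calibrated_area_totals(y_big, area_big, y_imputed, area_missing, national_total):
    """Ratio-adjust imputed values to a national benchmark, then sum by small area.

    Assumption: a single ratio factor is used for calibration; the paper's
    calibration scheme may differ.
    """
    y_big = np.asarray(y_big, dtype=float)
    y_imputed = np.asarray(y_imputed, dtype=float)
    area_big = np.asarray(area_big)
    area_missing = np.asarray(area_missing)
    # Scale imputed values so that big data total + imputed total = benchmark.
    factor = (national_total - y_big.sum()) / y_imputed.sum()
    y_cal = factor * y_imputed
    areas = np.unique(np.concatenate([area_big, area_missing])).tolist()
    return {
        a: float(y_big[area_big == a].sum() + y_cal[area_missing == a].sum())
        for a in areas
    }

# Toy illustration with made-up numbers.
y_imp = knn_impute([[0.0], [1.0], [10.0]], [1.0, 3.0, 20.0], [[0.4], [9.0]], k=2)
totals = calibrated_area_totals([4.0, 6.0], ["a", "b"], y_imp, ["a", "b"], 37.0)
```

By construction the area totals in this sketch add up to the national benchmark, which is the property the calibration step is meant to guarantee.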

Preprint, 2020, arXiv

Data Integration by combining big data and survey sample data for finite population inference

The statistical challenges in using big data for making valid statistical inference in the finite population have been well documented in the literature. These challenges are due primarily to statistical bias arising from under-coverage in the big data source to represent the population of interest and measurement errors in the variables available in the data set. By stratifying the population into a big data stratum and a missing data stratum, we can estimate the missing data stratum by using a fully responding probability sample, and hence the population as a whole by using a data integration estimator. By expressing the data integration estimator as a regression estimator, we can handle measurement errors in the variables in big data and also in the probability sample. We also propose a fully nonparametric classification method for identifying the overlapping units and develop a bias-corrected data integration estimator under misclassification errors. Finally, we develop a two-step regression data integration estimator to deal with measurement errors in the probability sample. An advantage of the approach advocated in this paper is that we do not have to make unrealistic missing-at-random assumptions for the methods to work. The proposed method is applied to a real data example using 2015-16 Australian Agricultural Census data.
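The stratification idea in this abstract can be sketched in a few lines. This is a simplified illustration, not the paper's regression or bias-corrected estimators: it assumes no measurement error, a fully responding probability sample with known design weights, and a correct indicator of which sampled units also appear in the big data. The function name `data_integration_total` and its parameters are hypothetical.

```python
import numpy as np

def data_integration_total(y_big, y_sample, weights, in_big):
    """Estimate a population total from a big data source plus a probability sample.

    The big data stratum total is observed directly; the missing data stratum
    total is estimated by the design-weighted sum over sampled units that are
    not covered by the big data.
    """
    y_big = np.asarray(y_big, dtype=float)
    y_sample = np.asarray(y_sample, dtype=float)
    weights = np.asarray(weights, dtype=float)
    in_big = np.asarray(in_big, dtype=bool)
    total_big = y_big.sum()                      # observed, no sampling error
    missing = ~in_big                            # sampled units outside the big data
    total_missing = np.sum(weights[missing] * y_sample[missing])
    return float(total_big + total_missing)

# Toy illustration with made-up numbers.
estimate = data_integration_total(
    y_big=[1.0, 2.0, 3.0],
    y_sample=[2.0, 5.0, 7.0],
    weights=[2.0, 3.0, 3.0],
    in_big=[True, False, False],
)
```

Only the missing data stratum contributes sampling variance in this sketch, which is why wider big data coverage makes the combined estimate more precise.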