Researcher profile

Shu Yang

Shu Yang contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, unclaimed author
26 works
0 followers
8 topics
4 close collaborators

Published work

26 published items

preprint, 2026, arXiv

A Semantic-Sampling Framework for Evaluating Calibration in Open-Ended Question Answering

Calibration measures whether a model's predicted confidence aligns with its empirical accuracy, and is central to the reliable deployment of large language models (LLMs) in high-stakes domains such as medicine and law. While much recent work focuses on improving LLM calibration, the equally important question of how to evaluate it in realistic settings remains underdeveloped. Open-ended question answering (QA), the most common deployment setting for modern LLMs, is where existing evaluation methods fall short: logit-based metrics need restricted output formats and internal probabilities; verbalized confidence is self-reported and often overconfident; and sampling-based methods rely on task-specific extraction rules without a clear finite-sample target. We introduce Sem-ECE (Semantic-Sampling Expected Calibration Error), a calibration evaluation framework for open-ended QA that samples answers from the model, groups them into semantic classes, and uses the resulting frequencies as confidence. We study two estimators within this framework: Sem$_1$-ECE, the same-sample self-consistency score, and Sem$_2$-ECE, a held-out variant that separates answer selection from confidence evaluation. We prove both are asymptotically unbiased, and further show that they agree on easy questions but diverge on hard ones with Sem$_2$ achieving strictly smaller calibration error, so their gap also serves as a diagnostic for question difficulty. Experiments on three open-ended QA benchmarks across five leading commercial LLMs match our theoretical predictions and show that Sem-ECE outperforms verbalized confidence and existing sampling-based methods, while complementing logit-based evaluation when internal probabilities are unavailable.
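The sampling idea in this abstract can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: `normalize` is a hypothetical stand-in for semantic grouping (it merely ignores case and whitespace), the half/half split in `sem2_confidence` is an assumed way to separate answer selection from confidence evaluation, and the binned ECE is the standard textbook form.

```python
from collections import Counter


def normalize(answer: str) -> str:
    # Hypothetical stand-in for semantic grouping: case/whitespace-insensitive match.
    return " ".join(answer.lower().split())


def sem1_confidence(samples):
    """Same-sample score: pick the most frequent class, use its frequency as confidence."""
    counts = Counter(normalize(s) for s in samples)
    answer, count = counts.most_common(1)[0]
    return answer, count / len(samples)


def sem2_confidence(samples):
    """Held-out score (assumed split): select on the first half, score on the second."""
    half = len(samples) // 2
    select, holdout = samples[:half], samples[half:]
    answer, _ = Counter(normalize(s) for s in select).most_common(1)[0]
    hits = sum(normalize(s) == answer for s in holdout)
    return answer, hits / len(holdout)


def expected_calibration_error(confidences, corrects, n_bins=10):
    """Binned ECE: size-weighted mean of |accuracy - mean confidence| over bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, corrects):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / total * abs(accuracy - avg_conf)
    return ece
```

In practice the grouping step would need a real semantic-equivalence check; exact string matching is used here only to keep the sketch self-contained.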

preprint, 2026, arXiv

Can Large Language Models Identify Implicit Suicidal Ideation? An Empirical Evaluation

We present a comprehensive evaluation framework for assessing Large Language Models' (LLMs) capabilities in suicide prevention, focusing on two critical aspects: the Identification of Implicit Suicidal ideation (IIS) and the Provision of Appropriate Supportive responses (PAS). We introduce a novel dataset of 1,308 test cases built upon psychological frameworks including D/S-IAT and Negative Automatic Thinking, alongside real-world scenarios. Through extensive experiments with 8 widely used LLMs under different contextual settings, we find that current models struggle significantly with detecting implicit suicidal ideation and providing appropriate support, highlighting crucial limitations in applying LLMs to mental health contexts. Our findings underscore the need for more sophisticated approaches in developing and evaluating LLMs for sensitive psychological applications.

preprint, 2026, arXiv

Estimating optimal interpretable individualized treatment regimes from a classification perspective using adaptive LASSO

Real-world data (RWD) are attracting growing interest as a representative sample of the population for selecting optimal treatment options. However, existing complex black-box methods for estimating individualized treatment rules (ITRs) from RWD have problems with interpretability and convergence. An interpretable and sparse ITR can overcome these limitations. We developed an algorithm using adaptive LASSO to predict the optimal interpretable linear ITR in RWD. To encourage sparsity, we obtain an ITR by minimizing the risk function with various types of penalties and different methods of contrast estimation. Simulation studies were conducted to select the best configuration and to compare the novel algorithm with existing state-of-the-art methods. The proposed algorithm was applied to RWD to predict the optimal interpretable ITR. Simulations show that adaptive LASSO had the highest rates of correctly selected variables and that augmented inverse probability weighting with Super Learner performed best for estimating the treatment contrast. Our method performed better than causal forest and R-learning in terms of the value function and variable selection. The proposed algorithm can strike a balance between the interpretability of the estimated ITR (by selecting a small set of important variables) and its value.

preprint, 2026, arXiv

Investigating CoT Monitorability in Large Reasoning Models

Large Reasoning Models (LRMs) have demonstrated remarkable performance on complex tasks by engaging in extended reasoning before producing final answers. Beyond improving abilities, these detailed reasoning traces also create a new opportunity for AI safety, CoT Monitorability: monitoring potential model misbehavior, such as the use of shortcuts or sycophancy, through their chain-of-thought (CoT) during decision-making. However, two fundamental challenges arise when attempting to build more effective monitors through CoT analysis. First, as prior research on CoT faithfulness has pointed out, models do not always truthfully represent their internal decision-making in the generated reasoning. Second, monitors themselves may be either overly sensitive or insufficiently sensitive, and can potentially be deceived by models' long, elaborate reasoning traces. In this paper, we present the first systematic investigation of the challenges and potential of CoT monitorability. Motivated by these two challenges, we structure our study around two central perspectives: (i) verbalization: to what extent do LRMs faithfully verbalize the true factors guiding their decisions in the CoT, and (ii) monitor reliability: to what extent can misbehavior be reliably detected by a CoT-based monitor? Specifically, we provide empirical evidence and correlation analyses between verbalization quality, monitor reliability, and LLM performance across mathematical, scientific, and ethical domains. We then investigate how different CoT intervention methods, designed to improve reasoning efficiency or performance, affect monitoring effectiveness. Finally, we propose MoME, a new paradigm in which LLMs monitor other models' misbehavior through their CoT and provide structured judgments along with supporting evidence.

preprint, 2023, arXiv

A Unified Inference Framework for Multiple Imputation Using Martingales

Multiple imputation is widely used to handle missing data. Although Rubin's combining rule is simple, it is not clear whether or not the standard multiple imputation inference is consistent when coupled with the commonly-used full sample estimators. This article establishes a unified martingale representation of multiple imputation for a wide class of asymptotically linear full sample estimators. This representation invokes the wild bootstrap inference to provide consistent variance estimation under the correct specification of the imputation models. As a motivating application, we illustrate the proposed method to estimate the average causal effect (ACE) with partially observed confounders in causal inference. Our framework applies to asymptotically linear ACE estimators, including the regression imputation, weighting, and matching estimators. We extend to the scenarios when both outcome and confounders are subject to missingness and when the data are missing not at random.
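For reference, the classical Rubin's combining rule that this abstract takes as its starting point can be written out as below. This is a sketch of the standard textbook rule only, not of the martingale representation or wild bootstrap procedure the paper proposes; the function name `rubin_combine` is made up for illustration.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Pool m completed-data estimates and their variance estimates by Rubin's rule."""
    m = len(estimates)
    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Example: three imputed datasets, each giving a point estimate and a variance.
pooled, total = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The question raised in the abstract is whether the variance produced this way is consistent for commonly used full-sample estimators, which is what motivates the alternative inference procedure.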

preprint, 2023, arXiv

Causal inference methods for combining randomized trials and observational studies: a review

With increasing data availability, causal effects can be evaluated across different data sets, both randomized controlled trials (RCTs) and observational studies. RCTs isolate the effect of the treatment from that of unwanted (confounding) co-occurring effects but they may suffer from unrepresentativeness, and thus lack external validity. On the other hand, large observational samples are often more representative of the target population but can conflate confounding effects with the treatment of interest. In this paper, we review the growing literature on methods for causal inference on combined RCTs and observational studies, striving for the best of both worlds. We first discuss identification and estimation methods that improve generalizability of RCTs using the representativeness of observational data. Classical estimators include weighting, difference between conditional outcome models, and doubly robust estimators. We then discuss methods that combine RCTs and observational data to either ensure unconfoundedness of the observational analysis or to improve (conditional) average treatment effect estimation. We also connect and contrast works developed in both the potential outcomes literature and the structural causal model literature. Finally, we compare the main methods using a simulation study and real world data to analyze the effect of tranexamic acid on the mortality rate in major trauma patients. A review of available codes and new implementations is also provided.

preprint, 2023, arXiv

Efficient and robust transfer learning of optimal individualized treatment regimes with right-censored survival data

An individualized treatment regime (ITR) is a decision rule that assigns treatments based on patients' characteristics. The value function of an ITR is the expected outcome in a counterfactual world had this ITR been implemented. Recently, there has been increasing interest in combining heterogeneous data sources, such as leveraging the complementary features of randomized controlled trial (RCT) data and a large observational study (OS). Usually, a covariate shift exists between the source and target population, rendering the source-optimal ITR not necessarily optimal for the target population. We present an efficient and robust transfer learning framework for estimating the optimal ITR with right-censored survival data that generalizes well to the target population. The value function accommodates a broad class of functionals of survival distributions, including survival probabilities and restricted mean survival times (RMSTs). We propose a doubly robust estimator of the value function, and the optimal ITR is learned by maximizing the value function within a pre-specified class of ITRs. We establish the $N^{-1/3}$ rate of convergence for the estimated parameter indexing the optimal ITR, and show that the proposed optimal value estimator is consistent and asymptotically normal even with flexible machine learning methods for nuisance parameter estimation. We evaluate the empirical performance of the proposed method by simulation studies and a real data application of sodium bicarbonate therapy for patients with severe metabolic acidaemia in the intensive care unit (ICU), combining an RCT and an observational study with heterogeneity.

preprint, 2022, arXiv

Asymptotic causal inference with observational studies trimmed by the estimated propensity scores

Causal inference with observational studies often relies on the assumptions of unconfoundedness and overlap of covariate distributions in different treatment groups. The overlap assumption is violated when some units have propensity scores close to 0 or 1, and therefore both practical and theoretical researchers suggest dropping units with extreme estimated propensity scores. However, existing trimming methods ignore the uncertainty in this design stage and restrict inference only to the trimmed sample, due to the non-smoothness of the trimming. We propose a smooth weighting, which approximates the existing sample trimming but has better asymptotic properties. An advantage of the new smoothly weighted estimator is its asymptotic linearity, which ensures that the bootstrap can be used to make inference for the target population, incorporating uncertainty arising from both the design and analysis stages. We also extend the theory to the average treatment effect on the treated, suggesting trimming samples with estimated propensity scores close to 1.
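The contrast between hard trimming and a smooth approximation can be illustrated as follows. The abstract does not give the weight function, so the product of two Gaussian CDFs used here, along with the names `alpha` (trimming threshold) and `eps` (smoothing bandwidth), is an assumption chosen only to show the idea of replacing a non-smooth indicator with a differentiable weight.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hard_trim_weight(e: float, alpha: float = 0.1) -> float:
    """Indicator weight: keep a unit only if alpha <= e <= 1 - alpha."""
    return 1.0 if alpha <= e <= 1.0 - alpha else 0.0


def smooth_trim_weight(e: float, alpha: float = 0.1, eps: float = 0.01) -> float:
    """Assumed smooth surrogate: approaches the indicator as eps shrinks to 0."""
    return normal_cdf((e - alpha) / eps) * normal_cdf((1.0 - alpha - e) / eps)


# Weights for a few estimated propensity scores under both schemes.
scores = [0.02, 0.10, 0.50, 0.95]
hard = [hard_trim_weight(e) for e in scores]
smooth = [smooth_trim_weight(e) for e in scores]
```

Because the smooth weight is differentiable in the propensity score, estimators built on it can be asymptotically linear, which is the property the abstract credits for making the bootstrap valid.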

preprint, 2022, arXiv

Multiply robust estimation of causal effects under principal ignorability

Causal inference concerns not only the average effect of the treatment on the outcome but also the underlying mechanism through an intermediate variable of interest. Principal stratification characterizes such a mechanism by targeting subgroup causal effects within principal strata, which are defined by the joint potential values of an intermediate variable. Due to the fundamental problem of causal inference, principal strata are inherently latent, rendering it challenging to identify and estimate subgroup effects within them. A line of research leverages the principal ignorability assumption that the latent principal strata are mean independent of the potential outcomes conditioning on the observed covariates. Under principal ignorability, we derive various nonparametric identification formulas for causal effects within principal strata in observational studies, which motivate estimators relying on the correct specifications of different parts of the observed-data distribution. Appropriately combining these estimators yields triply robust estimators for the causal effects within principal strata. These triply robust estimators are consistent if two of the treatment, intermediate variable, and outcome models are correctly specified, and moreover, they are locally efficient if all three models are correctly specified. We show that these estimators arise naturally from either the efficient influence functions in the semiparametric theory or the model-assisted estimators in the survey sampling theory. We evaluate different estimators based on their finite-sample performance through simulation and apply them to two observational studies.

preprint, 2022, arXiv

Robust analyses for longitudinal clinical trials with missing and non-normal continuous outcomes

Missing data is unavoidable in longitudinal clinical trials, and outcomes are not always normally distributed. In the presence of outliers or heavy-tailed distributions, the conventional multiple imputation with the mixed model with repeated measures analysis of the average treatment effect (ATE) based on the multivariate normal assumption may produce bias and power loss. Control-based imputation (CBI) is an approach for evaluating the treatment effect under the assumption that participants in both the test and control groups with missing outcome data have a similar outcome profile as those with an identical history in the control group. We develop a general robust framework to handle non-normal outcomes under CBI without imposing any parametric modeling assumptions. Under the proposed framework, sequential weighted robust regressions are applied to protect the constructed imputation model against non-normality in both the covariates and the response variables. Accompanied by the subsequent mean imputation and robust model analysis, the resulting ATE estimator has good theoretical properties in terms of consistency and asymptotic normality. Moreover, our proposed method guarantees the analysis model robustness of the ATE estimation, in the sense that its asymptotic results remain intact even when the analysis model is misspecified. The superiority of the proposed robust method is demonstrated by comprehensive simulation studies and an AIDS clinical trial data application.

preprint, 2022, arXiv

Sensitivity analysis in longitudinal clinical trials via distributional imputation

Missing data is inevitable in longitudinal clinical trials. Conventionally, the missing at random assumption is adopted to handle missingness; however, it is unverifiable empirically. Thus, sensitivity analysis is critically important to assess the robustness of the study conclusions against untestable assumptions. Toward this end, regulatory agencies often request using imputation models such as return-to-baseline, control-based, and washout imputation. Multiple imputation is popular in sensitivity analysis; however, it may be inefficient and result in an unsatisfying interval estimation by Rubin's combining rule. We propose distributional imputation (DI) in sensitivity analysis, which imputes each missing value by samples from its target imputation model given the observed data. Drawing on the idea of Monte Carlo integration, the DI estimator solves the mean estimating equations of the imputed dataset. It is fully efficient with theoretical guarantees. Moreover, we propose weighted bootstrap to obtain a consistent variance estimator, taking into account the variabilities due to model parameter estimation and target parameter estimation. The finite-sample performance of DI inference is assessed in the simulation study. We apply the proposed framework to an antidepressant longitudinal clinical trial involving missing data to investigate the robustness of the treatment effect. Our proposed DI approach detects a statistically significant treatment effect in both the primary analysis and sensitivity analysis under certain prespecified sensitivity models in terms of the average treatment effect, the risk difference, and the quantile treatment effect in lower quantiles of the responses, uncovering the benefit of the test drug for curing depression.

preprint, 2022, arXiv

UMSNet: An Universal Multi-sensor Network for Human Activity Recognition

Human activity recognition (HAR) based on multimodal sensors has become a rapidly growing branch of biometric recognition and artificial intelligence. However, how to fully mine multimodal time series data and effectively learn accurate behavioral features has always been a hot topic in this field. Practical applications also require a well-generalized framework that can quickly process a variety of raw sensor data and learn better feature representations. This paper proposes a universal multi-sensor network (UMSNet) for human activity recognition. In particular, we propose a new lightweight sensor residual block (called LSR block), which improves performance by reducing the number of activation functions and normalization layers, and by adding an inverted bottleneck structure and grouped convolution. Then, the Transformer is used to extract the relationships among series features to classify and recognize human activities. Our framework has a clear structure and can be directly applied to various types of multi-modal Time Series Classification (TSC) tasks after simple specialization. Extensive experiments show that the proposed UMSNet outperforms other state-of-the-art methods on two popular multi-sensor human activity recognition datasets (i.e., the HHAR dataset and the MHEALTH dataset).

preprint, 2021, arXiv

Estimating intervention effects on infectious disease control: the effect of community mobility reduction on Coronavirus spread

Understanding the effects of interventions, such as restrictions on community and large group gatherings, is critical to controlling the spread of COVID-19. Susceptible-Infectious-Recovered (SIR) models are traditionally used to forecast the infection rates but do not provide insights into the causal effects of interventions. We propose a spatiotemporal model that estimates the causal effect of changes in community mobility (intervention) on infection rates. Using an approximation to the SIR model and incorporating spatiotemporal dependence, the proposed model estimates a direct and indirect (spillover) effect of intervention. Under an interference and treatment ignorability assumption, this model is able to estimate causal intervention effects, and additionally allows for spatial interference between locations. Reductions in community mobility were measured by cell phone movement data. The results suggest that the reductions in mobility decrease Coronavirus cases 4 to 7 weeks after the intervention.

preprint, 2021, arXiv

Estimation of Partially Conditional Average Treatment Effect by Hybrid Kernel-covariate Balancing

We study nonparametric estimation for the partially conditional average treatment effect, defined as the treatment effect function over a subset of confounders of interest. We propose a hybrid kernel weighting estimator where the weights aim to control the balancing error of any function of the confounders from a reproducing kernel Hilbert space after kernel smoothing over the subset of variables of interest. In addition, we present an augmented version of our estimator which can incorporate estimations of outcome mean functions. Based on the representer theorem, gradient-based algorithms can be applied for solving the corresponding infinite-dimensional optimization problem. Asymptotic properties are studied without any smoothness assumptions for the propensity score function or the need for data splitting, relaxing certain existing stringent assumptions. The numerical performance of the proposed estimator is demonstrated by a simulation study and an application to the effect of a mother's smoking on a baby's birth weight conditioned on the mother's age.

preprint, 2021, arXiv

Improving trial generalizability using observational studies

Complementary features of randomized controlled trials (RCTs) and observational studies (OSs) can be used jointly to estimate the average treatment effect of a target population. We propose a calibration weighting estimator that enforces the covariate balance between the RCT and OS, therefore improving the trial-based estimator's generalizability. Exploiting semiparametric efficiency theory, we propose a doubly robust augmented calibration weighting estimator that achieves the efficiency bound derived under the identification assumptions. A nonparametric sieve method is provided as an alternative to the parametric approach, which enables the robust approximation of the nuisance functions and data-adaptive selection of outcome predictors for calibration. We establish asymptotic results and confirm the finite sample performances of the proposed estimators by simulation experiments and an application on the estimation of the treatment effect of adjuvant chemotherapy for early-stage non-small cell lung patients after surgery.

preprint, 2021, arXiv

Instrumental variables, spatial confounding and interference

Unobserved spatial confounding variables are prevalent in environmental and ecological applications where the system under study is complex and the data are often observational. Instrumental variables (IVs) are a common way to address unobserved confounding; however, the efficacy of using IVs on spatial confounding is largely unknown. This paper explores the effectiveness of IVs in this situation -- with particular attention paid to the spatial scale of the instrument. We show that, in case of spatially-dependent treatments, IVs are most effective when they vary at a finer spatial resolution than the treatment. We investigate IV performance in extensive simulations and apply the model in the example of long term trends in the air pollution and cardiovascular mortality in the United States over 1990-2010. Finally, the IV approach is also extended to the spatial interference setting, in which treatments can affect nearby responses.

preprint, 2020, arXiv

A review of spatial causal inference methods for environmental and epidemiological applications

The scientific rigor and computational methods of causal inference have had great impacts on many disciplines, but have only recently begun to take hold in spatial applications. Spatial causal inference poses analytic challenges due to complex correlation structures and interference between the treatment at one location and the outcomes at others. In this paper, we review the current literature on spatial causal inference and identify areas of future work. We first discuss methods that exploit spatial structure to account for unmeasured confounding variables. We then discuss causal analysis in the presence of spatial interference including several common assumptions used to reduce the complexity of the interference patterns under consideration. These methods are extended to the spatiotemporal case where we compare and contrast the potential outcomes framework with Granger causality, and to geostatistical analyses involving spatial random fields of treatments and responses. The methods are introduced in the context of observational environmental and epidemiological studies, and are compared using both a simulation study and analysis of the effect of ambient air pollution on COVID-19 mortality rate. Code to implement many of the methods using the popular Bayesian software OpenBUGS is provided.

preprint, 2020, arXiv

A spatial causal analysis of wildland fire-contributed PM2.5 using numerical model output

Wildland fire smoke contains hazardous levels of fine particulate matter PM2.5, a pollutant shown to adversely affect health. Estimating fire-attributable PM2.5 concentrations is key to quantifying the impact on air quality and subsequent health burden. This is a challenging problem since only total PM2.5 is measured at monitoring stations and both fire-attributable PM2.5 and PM2.5 from all other sources are correlated in space and time. We propose a framework for estimating fire-contributed PM2.5 and PM2.5 from all other sources using a novel causal inference framework and bias-adjusted chemical model representations of PM2.5 under counterfactual scenarios. The chemical model representation of PM2.5 for this analysis is simulated using the Community Multi-Scale Air Quality Modeling System (CMAQ), run with and without fire emissions across the contiguous U.S. for the 2008-2012 wildfire seasons. The CMAQ output is calibrated with observations from monitoring sites for the same spatial domain and time period. We use a Bayesian model that accounts for spatial variation to estimate the effect of wildland fires on PM2.5 and state assumptions under which the estimate has a valid causal interpretation. Our results include estimates of absolute, relative and cumulative contributions of wildfire smoke to PM2.5 for the contiguous U.S. Additionally, we compute the health burden associated with the PM2.5 attributable to wildfire smoke.

preprint, 2020, arXiv

Effects of Inflow Turbulence on Structural Deformation of Wind Turbine Blades

The present investigation provides the first field characterization of the influence of turbulent inflow on the blade structural response of a utility-scale wind turbine (2.5MW), using the unique facility available at the Eolos Wind Energy Research Station of the University of Minnesota. A representative one-hour dataset under a stable atmosphere is selected for the characterization, including the inflow turbulent data measured from the meteorological tower, high-resolution blade strain measurement at different circumferential and radial positions along the blade, and the wind turbine operational conditions. The results indicate that the turbulent inflow modulates the turbine blade structural response in three representative frequency ranges: a lower frequency range (corresponding to modulations due to large eddies in the atmosphere), a higher frequency range (corresponding to flow structures in scales smaller than the rotor diameter), and an intermediate range in between. The blade structure responds strongly to the turbulent inflow in the lower and intermediate ranges, while it is primarily dominated by the rotation effect and other high-frequency characteristics of wind turbines in the higher frequency range. Moreover, the blade structural behaviors at different azimuth angles, circumferential and radial locations along the blade are also compared, suggesting the comparatively high possibility of structural failure at certain positions. Further, the present study also uncovers the linkage between the turbulent inflow and blade structural response using temporal correlation. The derived findings provide insights into the development of advanced control strategies or blade design to mitigate the structural impact and increase blade longevity for the safer and more efficient operation of large-scale wind turbines.

preprint, 2020, arXiv

Estimating Average Treatment Effects Utilizing Fractional Imputation when Confounders are Subject to Missingness

The problem of missingness in observational data is ubiquitous. When the confounders are missing at random, multiple imputation is commonly used; however, the method requires congeniality conditions for valid inferences, which may not be satisfied when estimating average causal treatment effects. Alternatively, fractional imputation, proposed by Kim (2011), has been used to handle missing values in the regression context. In this article, we develop fractional imputation methods for estimating the average treatment effects with confounders missing at random. We show that the fractional imputation estimator of the average treatment effect is asymptotically normal, which permits a consistent variance estimate. Via a simulation study, we compare fractional imputation's accuracy and precision with that of multiple imputation.

preprint, 2020, arXiv

Generalized propensity score approach to causal inference with spatial interference

Many spatial phenomena exhibit treatment interference where treatments at one location may affect the response at other locations. Because interference violates the stable unit treatment value assumption, standard methods for causal inference do not apply. We propose a new causal framework to recover direct and spill-over effects in the presence of spatial interference, taking into account that treatments at nearby locations are more influential than treatments at locations further apart. Under the no unmeasured confounding assumption, we show that a generalized propensity score is sufficient to remove all measured confounding. To reduce dimensionality issues, we propose a Bayesian spline-based regression model accounting for a sufficient set of variables for the generalized propensity score. A simulation study demonstrates the accuracy and coverage properties. We apply the method to estimate the causal effect of wildland fires on air pollution in the Western United States over 2005--2018.

preprint, 2020, arXiv

Multiply robust matching estimators of average and quantile treatment effects

Propensity score matching has a long-standing tradition for handling confounding in causal inference, but it requires stringent model assumptions. In this article, we propose double score matching (DSM) for general causal estimands utilizing two balancing scores: the propensity score and the prognostic score. To protect against possible model misspecification, we posit multiple candidate models for each score. We show that the de-biasing DSM estimator achieves the multiple robustness property in that it is consistent for the true causal estimand if any model of the propensity score or prognostic score is correct.

preprint, 2020, arXiv

Robust inference of conditional average treatment effects using dimension reduction

It is important to make robust inference of the conditional average treatment effect from observational data, but this becomes challenging when the confounder is multivariate or high-dimensional. In this article, we propose a double dimension reduction method, which reduces the curse of dimensionality as much as possible while keeping the nonparametric merit. We identify the central mean subspace of the conditional average treatment effect using dimension reduction. A nonparametric regression with prior dimension reduction is also used to impute counterfactual outcomes. This step helps improve the stability of the imputation and leads to a better estimator than existing methods. We then propose an effective bootstrapping procedure without bootstrapping the estimated central mean subspace to make valid inference.

preprint2020arXiv

Semiparametric efficient estimation of structural nested mean models with irregularly spaced observations

Structural Nested Mean Models (SNMMs) are useful for causal inference of treatment effects in longitudinal observational studies. Most existing works assume that the data are collected at pre-fixed time points for all subjects, which, however, is restrictive in practice. To deal with irregularly spaced observations, we assume a class of continuous-time SNMMs and a martingale condition of no unmeasured confounding (NUC) to identify the causal parameters. We develop the first semiparametric efficiency theory and locally efficient estimators for continuous-time SNMMs. This task is non-trivial due to the restrictions from the NUC assumption imposed on the SNMM parameter. In the presence of dependent censoring, we propose an inverse probability of censoring weighting estimator, which achieves a multiple robustness feature in that it is unbiased if either the model for the treatment process or the potential outcome mean function is correctly specified, regardless of whether the censoring model is correctly specified. The new framework allows us to conduct causal analysis respecting the underlying continuous-time nature of the data processes. We estimate the effect of time to initiate highly active antiretroviral therapy on the CD4 count at year 2 from the observational Acute Infection and Early Disease Research Program database.

preprint2020arXiv

SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization

With the increasing popularity of graph-based learning, Graph Neural Networks (GNNs) have attracted considerable attention from research and industry because of their high accuracy. However, existing GNNs suffer from high memory footprints (e.g., node embedding features). This high memory footprint hinders potential applications on memory-constrained devices, such as the widely deployed IoT devices. To this end, we propose a specialized GNN quantization scheme, SGQuant, to systematically reduce GNN memory consumption. Specifically, we first propose a GNN-tailored quantization algorithm design and a GNN quantization fine-tuning scheme to reduce memory consumption while maintaining accuracy. Then, we investigate a multi-granularity quantization strategy that operates at different levels (components, graph topology, and layers) of GNN computation. Moreover, we offer an automatic bit-selecting (ABS) method to pinpoint the most appropriate quantization bits for the above multi-granularity quantizations. Extensive experiments show that SGQuant can effectively reduce the memory footprint by 4.25x to 31.9x compared with the original full-precision GNNs while limiting the accuracy drop to 0.4% on average.
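
To illustrate the memory arithmetic behind feature quantization, the following is a minimal sketch of generic uniform (min-max) quantization. It is not SGQuant's actual algorithm: the function names (quantize, dequantize, compression_ratio) and the min-max scaling scheme are assumptions introduced here for illustration, since the abstract does not specify these details.

```python
def quantize(values, bits):
    """Map floats to integer codes in [0, 2**bits - 1] (hypothetical helper).

    Assumes simple min-max scaling; this is a generic scheme, not SGQuant's.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Fall back to a unit scale when all values are identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Reconstruct approximate floats from integer codes."""
    return [lo + c * scale for c in codes]


def compression_ratio(bits, full_precision_bits=32):
    """Size ratio of full-precision storage to `bits`-bit storage.

    Ignores the small overhead of storing the offset and scale.
    """
    return full_precision_bits / bits


features = [0.0, 0.25, 1.0]  # toy node-feature values (made up)
codes, lo, scale = quantize(features, bits=2)
restored = dequantize(codes, lo, scale)
```

Under these assumptions, storing 8-bit codes instead of 32-bit floats gives a 4x reduction for the quantized tensor alone, and each restored value differs from the original by at most half a quantization step.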

preprint2020arXiv

Statistical Data Integration in Survey Sampling: A Review

Finite population inference is a central goal in survey sampling. Probability sampling is the main statistical approach to finite population inference. Challenges arise due to high cost and increasing non-response rates. Data integration provides a timely solution by leveraging multiple data sources to provide more robust and efficient inference than using any single data source alone. The technique for data integration varies depending on types of samples and available information to be combined. This article provides a systematic review of data integration techniques for combining probability samples, probability and non-probability samples, and probability and big data samples. We discuss a wide range of integration methods such as generalized least squares, calibration weighting, inverse probability weighting, mass imputation and doubly robust methods. Finally, we highlight important questions for future research.
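
Of the methods listed, inverse probability weighting is the simplest to sketch. The example below is a generic weighted (Hajek-type) estimator of a population mean, assuming each sampled unit's inclusion probability is known. The function name ipw_mean and the toy data are hypothetical and are not taken from the review.

```python
def ipw_mean(y, inclusion_probs):
    """Estimate a population mean by weighting each unit by 1 / probability.

    Hypothetical helper: assumes the inclusion probabilities are known
    and lie in (0, 1].
    """
    if len(y) != len(inclusion_probs):
        raise ValueError("y and inclusion_probs must have the same length")
    if any(not (0.0 < p <= 1.0) for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    weights = [1.0 / p for p in inclusion_probs]
    # Normalizing by the sum of weights gives the Hajek form of the estimator.
    return sum(w * v for w, v in zip(weights, y)) / sum(weights)


# Toy sample: the first unit was unlikely to be sampled, so it counts more.
estimate = ipw_mean([10.0, 20.0], [0.1, 0.5])
```

Units with a low inclusion probability stand in for more of the population, so they receive larger weights; with equal probabilities the estimator reduces to the ordinary sample mean.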