Researcher profile

Chao Cheng

Chao Cheng contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
3 topics
4 close collaborators

Published work

5 published items

preprint | 2026 | arXiv

Semiparametric causal mediation analysis of cluster-randomized trials for indirect and spillover effects

In cluster-randomized trials (CRTs), there is emerging interest in exploring the causal mechanism in which a cluster-level treatment affects the outcome through an intermediate outcome. The majority of existing causal mediation methods are applicable to independent data and only a few exceptions have considered assessing causal mediation in CRTs, all of which heavily depend on parametric assumptions. In this article, we develop a formal semiparametric efficiency theory to motivate new doubly-robust methods for addressing different mediation effect estimands -- the natural indirect effect, individual mediation effect, and spillover mediation effect (the extent to which one's outcome is influenced by others' mediators). We derive the efficient influence function for each estimand, and carefully parameterize each efficient influence function to motivate practical estimators. We consider both parametric working models and data-adaptive machine learners to estimate the nuisance functions, and obtain the semiparametric efficient estimators in the latter case. We conduct simulation studies to demonstrate the finite-sample performance of our new estimators and illustrate our proposed methods by reanalyzing a real-world CRT.

preprint | 2022 | arXiv

Addressing Extreme Propensity Scores in Estimating Counterfactual Survival Functions via the Overlap Weights

The inverse probability weighting approach is popular for evaluating treatment effects in observational studies, but extreme propensity scores could bias the estimator and induce excessive variance. Recently, the overlap weighting approach, which smoothly down-weights subjects with extreme propensity scores, has been proposed to alleviate this problem. Although the advantages of overlap weighting have been extensively demonstrated in the literature with continuous and binary outcomes, research on its performance with time-to-event or survival outcomes is limited. In this article, we propose two weighting estimators that combine propensity score weighting and inverse probability of censoring weighting to estimate the counterfactual survival functions. These estimators are applicable to the general class of balancing weights, which includes inverse probability weighting, trimming, and overlap weighting as special cases. We conduct simulations to examine the empirical performance of these estimators with different weighting schemes in terms of bias, variance, and 95% confidence interval coverage, under various degrees of covariate overlap between treatment groups and various censoring rates. We demonstrate that overlap weighting consistently outperforms inverse probability weighting and associated trimming methods in bias, variance, and coverage for time-to-event outcomes, and the advantages increase as the degree of covariate overlap between the treatment groups decreases.
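
The family of balancing weights mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the standard construction in which a tilting function h(e) of the propensity score e is divided by e for treated subjects and by 1 - e for controls, so that h = 1 gives inverse probability weighting, an indicator gives trimming, and h = e(1 - e) gives overlap weighting. The function name `balancing_weights`, its arguments, and the default trimming threshold are hypothetical and not taken from the paper; the sketch covers only the propensity score part, not the censoring weights or the survival estimators.

```python
def balancing_weights(propensity, treated, scheme="overlap", trim=0.1):
    """Return one balancing weight per subject (illustrative sketch only).

    propensity: estimated P(treatment = 1 | covariates), strictly in (0, 1)
    treated:    1 for treated subjects, 0 for controls
    scheme:     "ipw", "trimming", or "overlap"
    trim:       assumed trimming threshold; subjects with a propensity score
                outside [trim, 1 - trim] get weight 0 under "trimming"
    """
    weights = []
    for e, z in zip(propensity, treated):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        # Tilting function h(e) selects the target population.
        if scheme == "ipw":
            h = 1.0
        elif scheme == "trimming":
            h = 1.0 if trim <= e <= 1.0 - trim else 0.0
        elif scheme == "overlap":
            h = e * (1.0 - e)
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        # Treated subjects are weighted by h / e, controls by h / (1 - e).
        weights.append(h / e if z == 1 else h / (1.0 - e))
    return weights


# A control with propensity 0.99 and a treated subject with propensity 0.5.
scores, arms = [0.99, 0.5], [0, 1]
ipw = balancing_weights(scores, arms, scheme="ipw")
overlap = balancing_weights(scores, arms, scheme="overlap")
trimmed = balancing_weights(scores, arms, scheme="trimming")
```

In this toy input the control with an extreme score receives an inverse probability weight near 100, an overlap weight near 0.99, and a trimming weight of 0, which is the smooth down-weighting the abstract describes.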

preprint | 2022 | arXiv

On intermittency in sheared granular systems

We consider a system of granular particles, modeled by two-dimensional frictional elastic disks, that is exposed to externally applied time-dependent shear stress in a planar Couette geometry. We concentrate on the external forcing that produces intermittent dynamics of stick-slip type. In this regime, the top wall remains almost at rest until the applied stress becomes sufficiently large, and then it slips. We focus on the evolution of the system as it approaches a slip event. Our main finding is that there are two distinct groups of measures describing system behavior before a slip event. The first group consists of global measures defined as system-wide averages at a fixed time. Typical examples of measures in this group are averages of the normal or tangent forces acting between the particles, system size and number of contacts between the particles. These measures do not seem to be sensitive to an approaching slip event. On average, they tend to increase linearly with the force pulling the spring. The second group consists of the time-dependent measures that quantify the evolution of the system on a micro (particle) or mesoscale. Measures in this group first quantify the temporal differences between two states and only then aggregate them to a single number. For example, the Wasserstein distance quantitatively measures the changes of the force network as it evolves in time, while the number of broken contacts quantifies the evolution of the contact network. The behavior of the measures in the second group changes dramatically before a slip event starts. They increase rapidly as a slip event approaches, indicating a significant increase in fluctuations of the system before a slip event is triggered.
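
The distinction between the two groups of measures can be made concrete with a small Python sketch. It assumes a contact network is stored as a list of unordered particle-index pairs, which is an illustrative representation and not necessarily the one used in the paper; the function names `contact_count` and `broken_contacts` are hypothetical. The first function is a global measure evaluated at a fixed time, while the second compares two states first and only then reduces the difference to one number.

```python
def contact_count(contacts):
    """Global measure: number of distinct contacts in a single state."""
    return len({frozenset(pair) for pair in contacts})


def broken_contacts(contacts_before, contacts_after):
    """Time-dependent measure: contacts present earlier but absent later.

    Each contact is an unordered pair of particle indices, so (1, 2) and
    (2, 1) are treated as the same contact.
    """
    before = {frozenset(pair) for pair in contacts_before}
    after = {frozenset(pair) for pair in contacts_after}
    return len(before - after)


# Two hypothetical snapshots of a small contact network.
state_t0 = [(1, 2), (2, 3), (3, 4)]
state_t1 = [(2, 1), (3, 4), (4, 5)]

count_t0 = contact_count(state_t0)
count_t1 = contact_count(state_t1)
broken = broken_contacts(state_t0, state_t1)
```

In this toy example the number of contacts is 3 in both snapshots, yet one contact has broken and a new one has formed, so the fixed-time count misses a rearrangement that the difference-based measure records.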

preprint | 2020 | arXiv

Failure of confined granular media due to pullout of an intruder: From force networks to a system wide response

We investigate computationally the pullout of a spherical intruder initially buried at the bottom of a granular column. The intruder starts to move out of the granular bed once the pulling force reaches a critical value, leading to material failure. The failure point is found to depend on the diameter of the granular column, pointing to the importance of particle-wall interaction in determining the material response. Discrete element simulations show that prior to failure, the contact network is essentially static, with only minor rearrangements of the particles. However, the force network, which includes not only the contact information but also the information about the interaction strength, undergoes a nontrivial evolution. An initial insight is reached by considering the relative magnitudes of normal and tangential forces between the particles, and in particular the proportion of contacts that reach the Coulomb threshold. A more detailed understanding of the processes leading to failure is reached by the analysis of both spatial and temporal properties of the force network using the tools of persistent homology. We find that the forces between the particles undergo intermittent temporal variations ahead of the failure. In addition to this temporal intermittency, the response of the force network is found to be spatially dependent and influenced by proximity to the intruder. Furthermore, the response is modified significantly by the interaction strength, with the relevant measures describing the response showing differing behavior for the contacts characterized by large interaction forces.
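
The proportion of contacts at the Coulomb threshold can be sketched in a few lines of Python. The sketch assumes the usual Coulomb friction criterion, under which a contact is at threshold when the tangential force magnitude reaches the friction coefficient times the normal force. The function name, the input layout, and the numerical tolerance are hypothetical choices for illustration; the abstract does not specify the friction model parameters used in the simulations.

```python
def fraction_at_coulomb_threshold(contacts, friction_coefficient, tolerance=1e-9):
    """Fraction of contacts whose tangential force reaches mu * normal force.

    contacts: iterable of (normal_force, tangential_force) pairs, with the
              normal force non-negative
    friction_coefficient: assumed Coulomb friction coefficient mu
    tolerance: assumed slack for floating-point comparison
    """
    at_threshold = 0
    total = 0
    for normal, tangential in contacts:
        total += 1
        if abs(tangential) >= friction_coefficient * normal - tolerance:
            at_threshold += 1
    if total == 0:
        raise ValueError("at least one contact is required")
    return at_threshold / total


# Four hypothetical contacts given as (normal, tangential) force magnitudes.
sample_contacts = [(10.0, 5.0), (10.0, 2.0), (4.0, 2.0), (8.0, 1.0)]
fraction = fraction_at_coulomb_threshold(sample_contacts, friction_coefficient=0.5)
```

With a friction coefficient of 0.5, the first and third contacts sit exactly at the threshold and the other two do not, so the fraction is 0.5.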

preprint | 2020 | arXiv

Parallel subgroup analysis of high-dimensional data via M-regression

Identifying subgroup structures is an interesting problem in data analysis because populations are likely to be heterogeneous in practice. In this paper, we consider M-estimators together with both concave and pairwise fusion penalties, which can deal with high-dimensional data containing some outliers. The penalties are applied to both covariates and treatment effects, so that the estimation is expected to achieve variable selection and data clustering simultaneously. An algorithm based on parallel computing is proposed to process relatively large datasets. We establish the convergence analysis of the proposed algorithm, the oracle property of the penalized M-estimators, and the selection consistency of the proposed criterion. Our numerical study demonstrates that the proposed method shows promise for efficiently identifying subgroups hidden in high-dimensional data.