Source author record

Hyunseung Kang

Hyunseung Kang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Methodology · Applications · Machine Learning

Catalog footprint

What is connected

11 works

3 topics

4 close collaborators

preprint · 2022 · arXiv

A More Efficient, Doubly Robust, Nonparametric Estimator of Treatment Effects in Multilevel Studies

When studying treatment effects in multilevel studies, investigators commonly use (semi-)parametric estimators, which make strong parametric assumptions about the outcome, the treatment, and/or the correlation structure between study units in a cluster. We propose a novel estimator of treatment effects that does not make such assumptions. Specifically, the new estimator is shown to be doubly robust, asymptotically Normal, and often more efficient than existing estimators, all without having to make any parametric modeling assumptions about the outcome, the treatment, and the correlation structure. We achieve this by estimating two non-standard nuisance functions in causal inference, the conditional propensity score and the outcome covariance model, using existing machine learning methods designed for independent and identically distributed (i.i.d.) data. The new estimator is also demonstrated in simulated and real data where the new estimator is drastically more efficient than existing estimators, especially when studying cluster-specific treatment effects.

preprint · 2022 · arXiv

Propensity Score Modeling: Key Challenges When Moving Beyond the No-Interference Assumption

The paper presents some models for the propensity score. Considerable attention is given to a recently popular, but relatively under-explored setting in causal inference where the no-interference assumption does not hold. We lay out some key challenges in propensity score modeling under interference and present a few promising models based on existing works on mixed effects models.

preprint · 2022 · arXiv

Semiparametric Efficient Dimension Reduction in multivariate regression with an Inner Envelope

Recently, Su and Cook proposed a dimension reduction technique called the inner envelope which can be substantially more efficient than the original envelope or existing dimension reduction techniques for multivariate regression. However, their technique relied on a linear model with normally distributed error, which may be violated in practice. In this work, we propose a semiparametric variant of the inner envelope that relies on neither the linear model nor the normality assumption. We show that our proposal leads to globally and locally efficient estimators of the inner envelope spaces. We also present a computationally tractable algorithm to estimate the inner envelope. Our simulations and real data analysis show that our method is both robust and efficient compared to existing dimension reduction methods in a diverse array of settings.

preprint · 2021 · arXiv

Detecting Heterogeneous Treatment Effect with Instrumental Variables

There is increasing interest in estimating heterogeneity in causal effects in randomized and observational studies. However, little research has been conducted to understand heterogeneity in an instrumental variables study. In this work, we present a method to estimate heterogeneous causal effects using an instrumental variable approach. The method has two parts. The first part uses subject-matter knowledge and interpretable machine learning techniques, such as classification and regression trees, to discover potential effect modifiers. The second part uses closed testing to test for the statistical significance of the effect modifiers while strongly controlling the familywise error rate. We applied this method to the Oregon Health Insurance Experiment, estimating the effect of Medicaid on the number of days an individual's health does not impede their usual activities, and found evidence of heterogeneity in older men who prefer English and don't self-identify as Asian and younger individuals who have at most a high school diploma or GED and prefer English.

preprint · 2020 · arXiv

Inferring Treatment Effects After Testing Instrument Strength in Linear Models

A common practice in IV studies is to check for instrument strength, i.e., its association with the treatment, with an F-test from regression. If the F-statistic is above some threshold, usually 10, the instrument is deemed to satisfy one of the three core IV assumptions and used to test for the treatment effect. However, in many cases, the inference on the treatment effect does not take into account the strength test done a priori. In this paper, we show that not accounting for this pretest can severely distort the distribution of the test statistic and propose a method to correct this distortion, producing valid inference. A key insight in our method is to frame the F-test as a randomized convex optimization problem and to leverage recent methods in selective inference. We prove that our method provides conditional and marginal Type I error control. We also extend our method to weak instrument settings. We conclude with a reanalysis of studies concerning the effect of education on earnings where we show that not accounting for pre-testing can dramatically alter the original conclusion about education's effects.
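As a rough illustration of the pretest this abstract describes, the sketch below computes the first-stage F-statistic from regressing the treatment on the instruments and compares it with the conventional threshold of 10. It covers only the strength check, not the paper's selective-inference correction. The function name `first_stage_f` and the simulated data are assumptions made for illustration.

```python
import numpy as np


def first_stage_f(d, Z):
    """F-statistic for the joint null that all instrument slopes are zero
    in the first-stage regression of treatment d on instruments Z
    (an intercept is always included)."""
    d = np.asarray(d, dtype=float)
    Z = np.asarray(Z, dtype=float).reshape(len(d), -1)
    n, q = Z.shape
    X = np.column_stack([np.ones(n), Z])
    coef, *_ = np.linalg.lstsq(X, d, rcond=None)
    rss_full = np.sum((d - X @ coef) ** 2)    # intercept + instruments
    rss_null = np.sum((d - d.mean()) ** 2)    # intercept only
    return ((rss_null - rss_full) / q) / (rss_full / (n - q - 1))


# Hypothetical simulated data: one instrument with first-stage slope 0.5.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
d = 0.5 * z + rng.normal(size=500)

f_stat = first_stage_f(d, z)
print(f"first-stage F = {f_stat:.1f}; passes the F > 10 rule: {f_stat > 10}")
```

Any inference carried out only when this check passes is conditional on the pretest, which is the distortion the abstract refers to.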

preprint · 2020 · arXiv

ivmodel: An R Package for Inference and Sensitivity Analysis of Instrumental Variables Models with One Endogenous Variable

We present a comprehensive R software package, ivmodel, for analyzing instrumental variables with one endogenous variable. The package implements a general class of estimators called k-class estimators and two confidence intervals that are fully robust to weak instruments. The package also provides power formulas for various test statistics in instrumental variables. Finally, the package contains methods for sensitivity analysis to examine the sensitivity of the inference to instrumental variables assumptions. We demonstrate the software on the data set from Card (1995), looking at the causal effect of levels of education on log earnings where the instrument is proximity to a four-year college.
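To make the k-class family concrete, here is a minimal Python sketch of the standard textbook formula. It is not the ivmodel API, which is an R package. The sketch assumes no intercept or covariates and mean-zero variables, and the function name `k_class_estimate` and the simulated data are hypothetical. Setting k = 0 gives ordinary least squares and k = 1 gives two-stage least squares.

```python
import numpy as np


def k_class_estimate(y, d, Z, k):
    """k-class estimate of the effect of d on y with instruments Z:
        beta_k = d'(I - k M_Z) y / d'(I - k M_Z) d,
    where M_Z is the residual-maker matrix for Z."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    Z = np.asarray(Z, dtype=float).reshape(len(y), -1)

    def resid(v):
        # M_Z v: the part of v left after projecting onto the columns of Z
        coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
        return v - Z @ coef

    num = d @ y - k * (d @ resid(y))
    den = d @ d - k * (d @ resid(d))
    return num / den


# Hypothetical simulated data with an unmeasured confounder u; true effect is 2.
rng = np.random.default_rng(0)
n = 20000
z = rng.normal(size=n)
u = rng.normal(size=n)
d = z + u + rng.normal(size=n)
y = 2.0 * d + u + rng.normal(size=n)

print("k = 0 (OLS): ", k_class_estimate(y, d, z, 0.0))
print("k = 1 (TSLS):", k_class_estimate(y, d, z, 1.0))
```

Other members of the family, such as LIML and Fuller's estimator, correspond to data-dependent choices of k that this sketch does not compute.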

preprint · 2020 · arXiv

Two Robust Tools for Inference about Causal Effects with Invalid Instruments

Instrumental variables have been widely used to estimate the causal effect of a treatment on an outcome. Existing confidence intervals for causal effects based on instrumental variables assume that all of the putative instrumental variables are valid; a valid instrumental variable is a variable that affects the outcome only by affecting the treatment and is not related to unmeasured confounders. However, in practice, some of the putative instrumental variables are likely to be invalid. This paper presents two tools to conduct valid inference and tests in the presence of invalid instruments. First, we propose a simple and general approach to construct confidence intervals based on taking unions of well-known confidence intervals. Second, we propose a novel test for the null causal effect based on a collider bias. Our two proposals, especially when fused together, outperform traditional instrumental variable confidence intervals when invalid instruments are present, and can also be used as a sensitivity analysis when there is concern that instrumental variables assumptions are violated. The new approach is applied to a Mendelian randomization study on the causal effect of low-density lipoprotein on the incidence of cardiovascular diseases.

preprint · 2016 · arXiv

A simple and robust confidence interval for causal effects with possibly invalid instruments

Instrumental variables have been widely used to estimate the causal effect of a treatment on an outcome. Existing confidence intervals for causal effects based on instrumental variables assume that all of the putative instrumental variables are valid; a valid instrumental variable is a variable that affects the outcome only by affecting the treatment and is not related to unmeasured confounders. However, in practice, some of the putative instrumental variables are likely to be invalid. This paper presents a simple and general approach to construct a confidence interval that is robust to possibly invalid instruments. The robust confidence interval has theoretical guarantees on having the correct coverage and can also be used to assess the sensitivity of inference when instrumental variables assumptions are violated. The paper also shows that the robust confidence interval outperforms traditional confidence intervals popular in instrumental variables literature when invalid instruments are present. The new approach is applied to a developmental economics study of the causal effect of income on food expenditures.

preprint · 2016 · arXiv

Peer Encouragement Designs in Causal Inference with Partial Interference and Identification of Local Average Network Effects

In non-network settings, encouragement designs have been widely used to analyze causal effects of a treatment, policy, or intervention on an outcome of interest when randomizing the treatment was considered impractical or when compliance with treatment cannot be perfectly enforced. Unfortunately, such questions related to treatment compliance have received less attention in network settings and the most well-studied experimental design in networks, the two-stage randomization design, requires perfect compliance with treatment. The paper proposes a new experimental design called the peer encouragement design to study network treatment effects when enforcing treatment randomization is not feasible. The key idea in the peer encouragement design is personalized encouragement, which allows point-identification of familiar estimands in the encouragement design literature. The paper also defines new causal estimands, local average network effects, that can be identified under the new design and analyzes the effect of non-compliance behavior in randomized experiments on networks.

preprint · 2015 · arXiv

Full Matching Approach to Instrumental Variables Estimation with Application to the Effect of Malaria on Stunting

Most previous studies of the causal relationship between malaria and stunting have been studies where potential confounders are controlled via regression-based methods, but these studies may have been biased by unobserved confounders. Instrumental variables (IV) regression offers a way to control for unmeasured confounders where, in our case, the sickle cell trait can be used as an instrument. However, for the instrument to be valid, it may still be important to account for measured confounders. The most commonly used instrumental variable regression method, two-stage least squares, relies on parametric assumptions on the effects of measured confounders to account for them. Additionally, two-stage least squares lacks transparency with respect to covariate balance and weighting of subjects and does not blind the researcher to the outcome data. To address these drawbacks, we propose an alternative method for IV estimation based on full matching. We evaluate our new procedure on simulated data and real data concerning the causal effect of malaria on stunting among children. We estimate that the risk of stunting among children with the sickle cell trait decreases by 0.22 times the average number of malaria episodes prevented by the sickle cell trait, a substantial effect of malaria on stunting (p-value: 0.011, 95% CI: 0.044, 1).

preprint · 2014 · arXiv

Instrumental Variables Estimation with Some Invalid Instruments and its Application to Mendelian Randomization

Instrumental variables have been widely used for estimating the causal effect between exposure and outcome. Conventional estimation methods require complete knowledge about all the instruments' validity; a valid instrument must not have a direct effect on the outcome and not be related to unmeasured confounders. Often, this is impractical as highlighted by Mendelian randomization studies where genetic markers are used as instruments and complete knowledge about instruments' validity is equivalent to complete knowledge about the involved genes' functions. In this paper, we propose a method for estimation of causal effects when this complete knowledge is absent. It is shown that causal effects are identified and can be estimated as long as fewer than 50% of instruments are invalid, without knowing which of the instruments are invalid. We also introduce conditions for identification when the 50% threshold is violated. A fast penalized $\ell_1$ estimation method, called sisVIVE, is introduced for estimating the causal effect without knowing which instruments are valid, with theoretical guarantees on its performance. The proposed method is demonstrated on simulated data and a real Mendelian randomization study concerning the effect of body mass index on health-related quality of life index. An R package, sisVIVE, is available online.
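The intuition behind the 50% condition can be sketched with a much simpler estimator than sisVIVE: the median of per-instrument ratio estimates. This is a swapped-in illustration, not the paper's penalized method, and it assumes mutually independent, mean-zero instruments. The function name `median_ratio_estimate` and the simulated data are hypothetical. Each valid instrument's ratio targets the true effect, so the median stays on target when fewer than half the instruments are invalid.

```python
import numpy as np


def median_ratio_estimate(y, d, Z):
    """Median of per-instrument ratio estimates (z_j'y) / (z_j'd).
    The common factor z_j'z_j in the two marginal slopes cancels."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    Z = np.asarray(Z, dtype=float)
    ratios = (Z.T @ y) / (Z.T @ d)
    return np.median(ratios), ratios


# Hypothetical simulated data: 7 instruments, 3 of them invalid
# (nonzero direct effect alpha on the outcome); true effect is 1.
rng = np.random.default_rng(0)
n, beta = 50000, 1.0
alpha = np.array([0.0, 0.0, 0.0, 0.0, 1.0, -1.5, 2.0])
Z = rng.normal(size=(n, 7))
u = rng.normal(size=n)                       # unmeasured confounder
d = Z.sum(axis=1) + u + rng.normal(size=n)   # every first-stage slope is 1
y = beta * d + Z @ alpha + u + rng.normal(size=n)

estimate, ratios = median_ratio_estimate(y, d, Z)
print("per-instrument ratios:", np.round(ratios, 2))
print("median estimate:", round(float(estimate), 2))
```

With four or more of the seven instruments invalid, the median could land on a biased ratio, which is why additional identification conditions are needed beyond the 50% threshold.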
