Source author record

Max H. Farrell

Max H. Farrell appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

econ.EM Methodology math.ST Statistics Theory Computation econ.GN Machine Learning q-fin.EC

Catalog footprint

What is connected

5works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

Optimal Bandwidth Choice for Robust Bias Corrected Inference in Regression Discontinuity Designs

Modern empirical work in Regression Discontinuity (RD) designs often employs local polynomial estimation and inference with a mean square error (MSE) optimal bandwidth choice. This bandwidth yields an MSE-optimal RD treatment effect estimator, but is by construction invalid for inference. Robust bias corrected (RBC) inference methods are valid when using the MSE-optimal bandwidth, but we show they yield suboptimal confidence intervals in terms of coverage error. We establish valid coverage error expansions for RBC confidence interval estimators and use these results to propose new inference-optimal bandwidth choices for forming these intervals. We find that the standard MSE-optimal bandwidth for the RD point estimator is too large when the goal is to construct RBC confidence intervals with the smallest coverage error. We further optimize the constant terms behind the coverage error to derive new optimal choices for the auxiliary bandwidth required for RBC inference. Our expansions also establish that RBC inference yields higher-order refinements (relative to traditional undersmoothing) in the context of RD designs. Our main results cover sharp and sharp kink RD designs under conditional heteroskedasticity, and we discuss extensions to fuzzy and other RD designs, clustered sampling, and pre-intervention covariates adjustments. The theoretical findings are illustrated with a Monte Carlo experiment and an empirical application, and the main methodological results are available in \texttt{R} and \texttt{Stata} packages.

preprint2019arXiv

Characteristic-Sorted Portfolios: Estimation and Inference

Portfolio sorting is ubiquitous in the empirical finance literature, where it has been widely used to identify pricing anomalies. Despite its popularity, little attention has been paid to the statistical properties of the procedure. We develop a general framework for portfolio sorting by casting it as a nonparametric estimator. We present valid asymptotic inference methods and a valid mean square error expansion of the estimator leading to an optimal choice for the number of portfolios. In practical settings, the optimal choice may be much larger than the standard choices of 5 or 10. To illustrate the relevance of our results, we revisit the size and momentum anomalies.

preprint2019arXiv

Deep Neural Networks for Estimation and Inference

We study deep neural networks and their use in semiparametric inference. We establish novel rates of convergence for deep feedforward neural nets. Our new rates are sufficiently fast (in some cases minimax optimal) to allow us to establish valid second-step inference after first-step estimation with deep learning, a result also new to the literature. Our estimation rates and semiparametric inference results handle the current standard architecture: fully connected feedforward neural networks (multi-layer perceptrons), with the now-common rectified linear unit activation function and a depth explicitly diverging with the sample size. We discuss other architectures as well, including fixed-width, very deep networks. We establish nonasymptotic bounds for these deep nets for a general class of nonparametric regression-type loss functions, which includes as special cases least squares, logistic regression, and other generalized linear models. We then apply our theory to develop semiparametric inference, focusing on causal parameters for concreteness, such as treatment effects, expected welfare, and decomposition effects. Inference in many other semiparametric contexts can be readily obtained. We demonstrate the effectiveness of deep learning with a Monte Carlo analysis and an empirical application to direct mail marketing.

preprint2019arXiv

Large Sample Properties of Partitioning-Based Series Estimators

We present large sample results for partitioning-based least squares nonparametric regression, a popular method for approximating conditional expectation functions in statistics, econometrics, and machine learning. First, we obtain a general characterization of their leading asymptotic bias. Second, we establish integrated mean squared error approximations for the point estimator and propose feasible tuning parameter selection. Third, we develop pointwise inference methods based on undersmoothing and robust bias correction. Fourth, employing different coupling approaches, we develop uniform distributional approximations for the undersmoothed and robust bias-corrected t-statistic processes and construct valid confidence bands. In the univariate case, our uniform distributional approximations require seemingly minimal rate restrictions and improve on approximation rates known in the literature. Finally, we apply our general results to three partitioning-based estimators: splines, wavelets, and piecewise polynomials. The supplemental appendix includes several other general and example-specific technical and methodological results. A companion R package is provided.

preprint2019arXiv

nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference

Nonparametric kernel density and local polynomial regression estimators are very popular in Statistics, Economics, and many other disciplines. They are routinely employed in applied work, either as part of the main empirical analysis or as a preliminary ingredient entering some other estimation or inference procedure. This article describes the main methodological and numerical features of the software package nprobust, which offers an array of estimation and inference procedures for nonparametric kernel-based density and local polynomial regression methods, implemented in both the R and Stata statistical platforms. The package includes not only classical bandwidth selection, estimation, and inference methods (Wand and Jones, 1995; Fan and Gijbels, 1996), but also other recent developments in the statistics and econometrics literatures such as robust bias-corrected inference and coverage error optimal bandwidth selection (Calonico, Cattaneo and Farrell, 2018, 2019). Furthermore, this article also proposes a simple way of estimating optimal bandwidths in practice that always delivers the optimal mean square error convergence rate regardless of the specific evaluation point, that is, no matter whether it is implemented at a boundary or interior point. Numerical performance is illustrated using an empirical application and simulated data, where a detailed numerical comparison with other R packages is given.

Max H. Farrell

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

Optimal Bandwidth Choice for Robust Bias Corrected Inference in Regression Discontinuity Designs

Characteristic-Sorted Portfolios: Estimation and Inference

Deep Neural Networks for Estimation and Inference

Large Sample Properties of Partitioning-Based Series Estimators

nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference