Source author record

Richard Berk

Richard Berk appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Applications math.ST Methodology Statistics Theory

Catalog footprint

What is connected

4works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2014arXiv

Using Regression Kernels to Forecast A Failure to Appear in Court

Forecasts of prospective criminal behavior have long been an important feature of many criminal justice decisions. There is now substantial evidence that machine learning procedures will classify and forecast at least as well, and typically better, than logistic regression, which has to date dominated conventional practice. However, machine learning procedures are adaptive. They "learn" inductively from training data. As a result, they typically perform best with very large datasets. There is a need, therefore, for forecasting procedures with the promise of machine learning that will perform well with small to moderately-sized datasets. Kernel methods provide precisely that promise. In this paper, we offer an overview of kernel methods in regression settings and compare such a method, regularized with principle components, to stepwise logistic regression. We apply both to a timely and important criminal justice concern: a failure to appear (FTA) at court proceedings following an arraignment. A forecast of an FTA can be an important factor is a judge's decision to release a defendant while awaiting trial and can influence the conditions imposed on that release. Forecasting accuracy matters, and our kernel approach forecasts far more accurately than stepwise logistic regression. The methods developed here are implemented in the R package kernReg currently available on CRAN.

preprint2013arXiv

Improved Precision in Estimating Average Treatment Effects

The Average Treatment Effect (ATE) is a global measure of the effectiveness of an experimental treatment intervention. Classical methods of its estimation either ignore relevant covariates or do not fully exploit them. Moreover, past work has considered covariates as fixed. We present a method for improving the precision of the ATE estimate: the treatment and control responses are estimated via a regression, and information is pooled between the groups to produce an asymptotically unbiased estimate; we subsequently justify the random X paradigm underlying the result. Standard errors are derived, and the estimator's performance is compared to the traditional estimator. Conditions under which the regression-based estimator is preferable are detailed, and a demonstration on real data is presented.

preprint2013arXiv

Valid post-selection inference

It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid ``post-selection inference'' by reducing the problem to one of simultaneous inference and hence suitably widening conventional confidence and retention intervals. Simultaneity is required for all linear functions that arise as coefficient estimates in all submodels. By purchasing ``simultaneity insurance'' for all possible submodels, the resulting post-selection inference is rendered universally valid under all possible model selection procedures. This inference is therefore generally conservative for particular selection procedures, but it is always less conservative than full Scheffe protection. Importantly it does not depend on the truth of the selected submodel, and hence it produces valid inference even in wrong models. We describe the structure of the simultaneous inference problem and give some asymptotic results.

preprint2010arXiv

Small area estimation of the homeless in Los Angeles: An application of cost-sensitive stochastic gradient boosting

In many metropolitan areas efforts are made to count the homeless to ensure proper provision of social services. Some areas are very large, which makes spatial sampling a viable alternative to an enumeration of the entire terrain. Counts are observed in sampled regions but must be imputed in unvisited areas. Along with the imputation process, the costs of underestimating and overestimating may be different. For example, if precise estimation in areas with large homeless c ounts is critical, then underestimation should be penalized more than overestimation in the loss function. We analyze data from the 2004--2005 Los Angeles County homeless study using an augmentation of $L_1$ stochastic gradient boosting that can weight overestimates and underestimates asymmetrically. We discuss our choice to utilize stochastic gradient boosting over other function estimation procedures. In-sample fitted and out-of-sample imputed values, as well as relationships between the response and predictors, are analyzed for various cost functions. Practical usage and policy implications of these results are discussed briefly.