Researcher profile

Kory D. Johnson

Kory D. Johnson contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
4 topics
4 close collaborators

Published work

5 published items

Preprint, 2022, arXiv

Impartial Predictive Modeling and the Use of Proxy Variables

Fairness aware data mining (FADM) aims to prevent algorithms from discriminating against protected groups. The literature has come to an impasse as to what constitutes explainable variability as opposed to discrimination. This distinction hinges on a rigorous understanding of the role of proxy variables, i.e., those variables which are associated with both the protected feature and the outcome of interest. We demonstrate that fairness is achieved by ensuring impartiality with respect to sensitive characteristics and provide a framework for impartiality by accounting for different perspectives on the data generating process. In particular, fairness can only be precisely defined in a full-data scenario in which all covariates are observed. We then analyze how these models may be conservatively estimated via regression in partial-data settings. Decomposing the regression estimates provides insights into previously unexplored distinctions between explainable variability and discrimination that illuminate the use of proxy variables in fairness aware data mining.

Preprint, 2022, arXiv

Robust models of SARS-CoV-2 heterogeneity and control

In light of the continuing emergence of new SARS-CoV-2 variants and vaccines, we create a simulation framework for exploring possible infection trajectories under various scenarios. The situations of primary interest involve the interaction between three components: vaccination campaigns, non-pharmaceutical interventions (NPIs), and the emergence of new SARS-CoV-2 variants. Additionally, immunity waning and vaccine boosters are modeled to account for their growing importance. New infections are generated according to a hierarchical model in which people have a random, individual infectiousness. The model thus includes super-spreading observed in the COVID-19 pandemic. Our simulation functions as a dynamic compartment model in which an individual's history of infection, vaccination, and possible reinfection all play a role in their resistance to further infections. We present a risk measure for each SARS-CoV-2 variant, $\rho^V$, that accounts for the amount of resistance within a population and show how this risk changes as the vaccination rate increases. Furthermore, by considering different population compositions in terms of previous infection and type of vaccination, we can learn about variants which pose differential risk to different countries. Different control strategies are implemented which aim both to suppress COVID-19 outbreaks when they occur and to relax restrictions when possible. We demonstrate that a controller that responds to the effective reproduction number in addition to case numbers is more efficient and effective in controlling new waves than monitoring case numbers alone. This is of interest as the majority of the public discussion and well-known statistics deal primarily with case numbers.

Preprint, 2021, arXiv
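The idea of random, individual infectiousness producing super-spreading can be illustrated with a small sketch. It assumes a gamma-distributed infectiousness and Poisson-distributed secondary cases, which is a common textbook choice and not necessarily the hierarchy used in the preprint. The function names and the parameters `mean_r` and `dispersion` are hypothetical.

```python
import math
import random


def poisson(rate, rng):
    """Draw a Poisson variate by multiplying uniforms (adequate for moderate rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def secondary_cases(mean_r, dispersion, rng):
    """Sample how many people one infected individual goes on to infect.

    Assumption: infectiousness ~ Gamma(shape=dispersion, scale=mean_r / dispersion),
    so its mean is mean_r; a small dispersion gives a heavy right tail.
    """
    infectiousness = rng.gammavariate(dispersion, mean_r / dispersion)
    return poisson(infectiousness, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [secondary_cases(2.0, 0.2, rng) for _ in range(20000)]
    print("mean secondary cases:", sum(draws) / len(draws))
    print("share infecting nobody:", sum(d == 0 for d in draws) / len(draws))
```

With a small dispersion value, most simulated individuals infect nobody while a few infect many, which is the qualitative pattern that super-spreading refers to.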

Evidence suggests that SARS-CoV-2 rapid antigen tests provide benefits for epidemic control -- observations from Austrian schools

Rapid antigen tests detect proteins at the surface of virus particles, identifying the disease during its infectious phase. In contrast, PCR tests detect viral genomes; they can thus diagnose COVID-19 before the infectious phase but also react to remnants of the virus genome, even weeks after live virus ceases to be detectable in the respiratory tract. Furthermore, the logistics for administering the tests are different, with rapid antigen tests being much easier to administer at scale. In this article, we discuss the relative advantages of the different testing procedures and summarise evidence that shows that using antigen tests 2-3 times per week could become a powerful tool to suppress the COVID-19 pandemic. We also discuss the results of recent large-scale rapid antigen testing in Austrian schools. While our report on testing predates Delta, we have updated the review with recent data on viral loads in breakthrough infections and more information about testing efficacy, especially in children.

Preprint, 2020, arXiv

Adaptive, Distribution-Free Prediction Intervals for Deep Networks

The machine learning literature contains several constructions for prediction intervals that are intuitively reasonable but ultimately ad-hoc in that they do not come with provable performance guarantees. We present methods from the statistics literature that can be used efficiently with neural networks under minimal assumptions with guaranteed performance. We propose a neural network that outputs three values instead of a single point estimate and optimizes a loss function motivated by the standard quantile regression loss. We provide two prediction interval methods with finite sample coverage guarantees solely under the assumption that the observations are independent and identically distributed. The first method leverages the conformal inference framework and provides average coverage. The second method provides a new, stronger guarantee by conditioning on the observed data. Lastly, our loss function does not compromise the predictive accuracy of the network like other prediction interval methods. We demonstrate the ease of use of our procedures as well as their improvements over other methods on both simulated and real data. As most deep networks can easily be modified by our method to output predictions with valid prediction intervals, its use should become standard practice, much like reporting standard errors along with mean estimates.

Preprint, 2020, arXiv
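For readers unfamiliar with the conformal inference framework mentioned above, the following is a minimal sketch of generic split conformal prediction with absolute-residual scores. It is not the preprint's three-output network or its loss function; `predict` stands in for any already-fitted model, and all names here are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest score, or inf if n is too small."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Build a prediction interval for x_new from held-out calibration data.

    Assumption: calibration points and the new point are i.i.d. (or exchangeable)
    and were not used to fit `predict`.
    """
    scores = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    half_width = conformal_quantile(scores, alpha)
    center = predict(x_new)
    return center - half_width, center + half_width


if __name__ == "__main__":
    model = lambda x: 2 * x  # hypothetical fitted predictor
    print(split_conformal_interval(model, [0, 1, 2, 3], [0.5, 1.0, 4.0, 8.0], 10, alpha=0.5))
```

The interval has constant width around the point prediction, which is the simplest variant; the adaptive intervals described in the abstract would vary the width with the input.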

Fitting High-Dimensional Interaction Models with Error Control

There is a renewed interest in polynomial regression in the form of identifying influential interactions between features. In many settings, this takes place in a high-dimensional model, making the number of interactions unwieldy or computationally infeasible. Furthermore, it is difficult to analyze such spaces directly as they are often highly correlated. Standard feature selection issues remain, such as how to determine a final model that generalizes well. This paper solves these problems with a sequential algorithm called Revisiting Alpha-Investing (RAI). RAI is motivated by the principle of marginality and searches the feature-space of higher-order interactions by greedily building upon lower-order terms. RAI controls a notion of false rejections and comes with a performance guarantee relative to the best-subset model. This ensures that signal is identified while providing a valid stopping criterion to prevent over-selection. We apply RAI in a novel setting over a family of regressions in order to select gene-specific interaction models for differential expression profiling.
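As background for the name, here is a minimal sketch of the basic alpha-investing rule for sequential testing, on which RAI builds. It is not RAI itself: it makes a single pass over a fixed list of p-values and does no interaction search. The spending rule (`spend_fraction`) and the default values are assumptions chosen for illustration.

```python
def alpha_investing(p_values, initial_wealth=0.05, payout=0.05, spend_fraction=0.5):
    """Test hypotheses in order, spending and earning alpha-wealth.

    Each test risks `cost = wealth * spend_fraction` and uses level
    alpha_j = cost / (1 + cost), so a failed test costs alpha_j / (1 - alpha_j) = cost.
    A rejection earns `payout`. Returns the rejected indices and the final wealth.
    """
    wealth = initial_wealth
    rejected = []
    for j, p in enumerate(p_values):
        if wealth <= 0:
            break  # no alpha-wealth left to test with
        cost = wealth * spend_fraction
        alpha_j = cost / (1 + cost)
        if p <= alpha_j:
            rejected.append(j)
            wealth += payout
        else:
            wealth -= cost
    return rejected, wealth


if __name__ == "__main__":
    print(alpha_investing([0.001, 0.5, 0.9]))
```

Because wealth grows after each rejection, early discoveries fund later tests, and the shrinking wealth after repeated failures acts as a stopping criterion.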