Source author record

Katherine Heller

Katherine Heller appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, math.FA, Methodology, Applications, Artificial Intelligence, physics.soc-ph, Computation and Language, Human-Computer Interaction, math.CV, math.DS, Populations and Evolution, Quantitative Methods, Social and Information Networks

Catalog footprint

What is connected

18 works

13 topics

4 close collaborators

preprint · 2022 · arXiv

Analysis of SIR epidemic models with sociological phenomenon

We propose two SIR models which incorporate sociological behavior of groups of individuals. It is these differences in behaviors which impose different infection rates on the individual susceptible populations, rather than biological differences. We compute the basic reproduction number for each model, as well as analyze the sensitivity of $R_0$ to changes in sociological parameter values.

preprint · 2022 · arXiv

Compact differences of composition operators on weighted Dirichlet spaces

Here we consider when the difference of two composition operators is compact on the weighted Dirichlet spaces $\mathcal{D}_α$. Specifically we study differences of composition operators on the Dirichlet space $\mathcal{D}$ and $S^2$, the space of analytic functions whose first derivative is in $H^2$, and then use Calderón's complex interpolation to extend the results to the general weighted Dirichlet spaces. As a corollary we consider composition operators induced by linear fractional self-maps of the disk.

preprint · 2022 · arXiv

Composition-differentiation operators on $S^2(\mathbb{D})$

We investigate composition-differentiation operators acting on the space $S^2$, the space of analytic functions on the open unit disk whose first derivative is in $H^2$. Specifically, we determine characterizations for bounded and compact composition-differentiation operators acting on $S^p$. In addition, for particular classes of inducing maps, we compute the norm, and identify the spectrum. Finally, for particular linear fractional inducing maps, we determine the adjoint of the composition-differentiation operator acting on weighted Bergman spaces which include $S^2, H^2$, and the Dirichlet space.

preprint · 2022 · arXiv

Composition-differentiation operators on the Dirichlet space

We investigate composition-differentiation operators acting on the Dirichlet space of the unit disk. Specifically, we determine characterizations for bounded, compact, and Hilbert-Schmidt composition-differentiation operators. In addition, for particular classes of inducing maps, we derive an adjoint formula, compute the norm, and identify the spectrum.

preprint · 2022 · arXiv

Deep Cox Mixtures for Survival Regression

Survival analysis is a challenging variation of regression modeling because of the presence of censoring, where the outcome measurement is only partially known, due to, for example, loss to follow-up. Such problems come up frequently in medical applications, making survival analysis a key endeavor in biostatistics and machine learning for healthcare, with Cox regression models being amongst the most commonly employed models. We describe a new approach for survival analysis regression models, based on learning mixtures of Cox regressions to model individual survival distributions. We propose an approximation to the Expectation Maximization algorithm for this model that does hard assignments to mixture groups to make optimization efficient. In each group assignment, we fit the hazard ratios within each group using deep neural networks, and the baseline hazard for each mixture component non-parametrically. We perform experiments on multiple real-world datasets, and look at the mortality rates of patients across ethnicity and gender. We emphasize the importance of calibration in healthcare settings and demonstrate that our approach outperforms classical and modern survival analysis baselines, both in terms of discriminative performance and calibration, with large gains in performance on the minority demographics.

preprint · 2022 · arXiv

Disability prediction in multiple sclerosis using performance outcome measures and demographic data

Literature on machine learning for multiple sclerosis has primarily focused on the use of neuroimaging data such as magnetic resonance imaging and clinical laboratory tests for disease identification. However, studies have shown that these modalities are not consistent with disease activity such as symptoms or disease progression. Furthermore, the cost of collecting data from these modalities is high, leading to scarce evaluations. In this work, we used multi-dimensional, affordable, physical and smartphone-based performance outcome measures (POMs) in conjunction with demographic data to predict multiple sclerosis disease progression. We performed a rigorous benchmarking exercise on two datasets and present results across 13 clinically actionable prediction endpoints and 6 machine learning models. To the best of our knowledge, our results are the first to show that it is possible to predict disease progression using POMs and demographic data in the context of both clinical trials and smartphone-based studies by using two datasets. Moreover, we investigate our models to understand the impact of different POMs and demographics on model performance through feature ablation studies. We also show that model performance is similar across different demographic subgroups (based on age and sex). To enable this work, we developed an end-to-end reusable pre-processing and machine learning framework which allows quicker experimentation over disparate MS datasets.

preprint · 2022 · arXiv

Evaluation Gaps in Machine Learning Practice

Forming a reliable judgement of a machine learning (ML) model's appropriateness for an application ecosystem is critical for its responsible use, and requires considering a broad range of factors including harms, benefits, and responsibilities. In practice, however, evaluations of ML models frequently focus on only a narrow range of decontextualized predictive behaviours. We examine the evaluation gaps between the idealized breadth of evaluation concerns and the observed narrow focus of actual evaluations. Through an empirical study of papers from recent high-profile conferences in the Computer Vision and Natural Language Processing communities, we demonstrate a general focus on a handful of evaluation methods. By considering the metrics and test data distributions used in these methods, we draw attention to which properties of models are centered in the field, revealing the properties that are frequently neglected or sidelined during evaluation. By studying these properties, we demonstrate the machine learning discipline's implicit assumption of a range of commitments which have normative impacts; these include commitments to consequentialism, abstractability from context, the quantifiability of impacts, the limited role of model inputs in evaluation, and the equivalence of different failure modes. Shedding light on these assumptions enables us to question their appropriateness for ML system contexts, pointing the way towards more contextualized evaluation methodologies for robustly examining the trustworthiness of ML models.

preprint · 2022 · arXiv

Healthsheet: Development of a Transparency Artifact for Health Datasets

Machine learning (ML) approaches have demonstrated promising results in a wide range of healthcare applications. Data plays a crucial role in developing ML-based healthcare systems that directly affect people's lives. Many of the ethical issues surrounding the use of ML in healthcare stem from structural inequalities underlying the way we collect, use, and handle data. Developing guidelines to improve documentation practices regarding the creation, use, and maintenance of ML healthcare datasets is therefore of critical importance. In this work, we introduce Healthsheet, a contextualized adaptation of the original datasheet questionnaire (Gebru et al., 2018) for health-specific applications. Through a series of semi-structured interviews, we adapt the datasheets for healthcare data documentation. As part of the Healthsheet development process and to understand the obstacles researchers face in creating datasheets, we worked with three publicly-available healthcare datasets as our case studies, each with different types of structured data: Electronic Health Records (EHR), clinical trial study data, and smartphone-based performance outcome measures. Our findings from the interviewee study and case studies show 1) that datasheets should be contextualized for healthcare, 2) that despite incentives to adopt accountability practices such as datasheets, there is a lack of consistency in the broader use of these practices, 3) how the ML for health community views datasheets and particularly Healthsheets as a diagnostic tool to surface the limitations and strengths of datasets and 4) the relative importance of different fields in the datasheet to healthcare concerns.

preprint · 2022 · arXiv

Isometric composition operators on the analytic Besov spaces

We investigate the isometric composition operators on the analytic Besov spaces. For $1<p<2$ we show that an isometric composition operator is induced only by a rotation of the disk. For $p>2$, we extend previous work on the subject. Finally, we analyze this same problem for the Besov spaces with an equivalent norm.

preprint · 2022 · arXiv

Multiplication operators on $S^2(\mathbb D)$

In this paper, we study the multiplication operators on $S^2$, the space of analytic functions on the open unit disk $\mathbb D$ whose first derivative is in $H^2$. Specifically, we characterize the bounded and the compact multiplication operators, establish estimates on the operator norm, and determine the spectrum. Finally, we prove that the isometric multiplication operators are precisely those induced by a constant function of modulus one.

preprint · 2020 · arXiv

Analyzing the Role of Model Uncertainty for Electronic Health Records

In medicine, both ethical and monetary costs of incorrect predictions can be significant, and the complexity of the problems often necessitates increasingly complex models. Recent work has shown that changing just the random seed is enough for otherwise well-tuned deep neural networks to vary in their individual predicted probabilities. In light of this, we investigate the role of model uncertainty methods in the medical domain. Using RNN ensembles and various Bayesian RNNs, we show that population-level metrics, such as AUC-PR, AUC-ROC, log-likelihood, and calibration error, do not capture model uncertainty. Meanwhile, the presence of significant variability in patient-specific predictions and optimal decisions motivates the need for capturing model uncertainty. Understanding the uncertainty for individual patients is an area with clear clinical impact, such as determining when a model decision is likely to be brittle. We further show that RNNs with only Bayesian embeddings can be a more efficient way to capture model uncertainty compared to ensembles, and we analyze how model uncertainty is impacted across individual input features and patient subgroups.

preprint · 2020 · arXiv

Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors

Bayesian neural networks (BNNs) demonstrate promising success in improving the robustness and uncertainty quantification of modern deep learning. However, they generally struggle with underfitting at scale and parameter efficiency. On the other hand, deep ensembles have emerged as alternatives for uncertainty quantification that, while outperforming BNNs on certain problems, also suffer from efficiency issues. It remains unclear how to combine the strengths of these two approaches and remediate their common issues. To tackle this challenge, we propose a rank-1 parameterization of BNNs, where each weight matrix involves only a distribution on a rank-1 subspace. We also revisit the use of mixture approximate posteriors to capture multiple modes, where unlike typical mixtures, this approach admits a significantly smaller memory increase (e.g., only a 0.4% increase for a ResNet-50 mixture of size 10). We perform a systematic empirical study on the choices of prior, variational posterior, and methods to improve training. For ResNet-50 on ImageNet, Wide ResNet 28-10 on CIFAR-10/100, and an RNN on MIMIC-III, rank-1 BNNs achieve state-of-the-art performance across log-likelihood, accuracy, and calibration on the test sets and out-of-distribution variants.

preprint · 2016 · arXiv

Scalable Modeling of Multivariate Longitudinal Data for Prediction of Chronic Kidney Disease Progression

Prediction of the future trajectory of a disease is an important challenge for personalized medicine and population health management. However, many complex chronic diseases exhibit large degrees of heterogeneity, and furthermore there is not always a single readily available biomarker to quantify disease severity. Even when such a clinical variable exists, there are often additional related biomarkers routinely measured for patients that may better inform the predictions of their future disease state. To this end, we propose a novel probabilistic generative model for multivariate longitudinal data that captures dependencies between multivariate trajectories. We use a Gaussian process based regression model for each individual trajectory, and build off ideas from latent class models to induce dependence between their mean functions. We fit our method using a scalable variational inference algorithm to a large dataset of longitudinal electronic patient health records, and find that it improves dynamic predictions compared to a recent state of the art method. Our local accountable care organization then uses the model predictions during chart reviews of high risk patients with chronic kidney disease.

preprint · 2015 · arXiv

$k$-means: Fighting against Degeneracy in Sequential Monte Carlo with an Application to Tracking

For regular particle filter algorithms or Sequential Monte Carlo (SMC) methods, the initial weights are traditionally dependent on the proposed distribution, the posterior distribution at the current timestamp in the sampled sequence, and the target is the posterior distribution of the previous timestamp. This is technically correct, but leads to algorithms which usually have practical issues with degeneracy, where all particles eventually collapse onto a single particle. In this paper, we propose and evaluate using $k$-means clustering to attack and even take advantage of this degeneracy. Specifically, we propose a Stochastic SMC algorithm which initializes the set of $k$ means, providing the initial centers chosen from the collapsed particles. To fight against degeneracy, we adjust the regular SMC weights, mediated by cluster proportions, and then correct them to retain the same expectation as before. We experimentally demonstrate that our approach has better performance than vanilla algorithms.

preprint · 2015 · arXiv

The Bayesian Echo Chamber: Modeling Social Influence via Linguistic Accommodation

We present the Bayesian Echo Chamber, a new Bayesian generative model for social interaction data. By modeling the evolution of people's language usage over time, this model discovers latent influence relationships between them. Unlike previous work on inferring influence, which has primarily focused on simple temporal dynamics evidenced via turn-taking behavior, our model captures more nuanced influence relationships, evidenced via linguistic accommodation patterns in interaction content. The model, which is based on a discrete analog of the multivariate Hawkes process, permits a fully Bayesian inference algorithm. We validate our model's ability to discover latent influence patterns using transcripts of arguments heard by the US Supreme Court and the movie "12 Angry Men." We showcase our model's capabilities by using it to infer latent influence patterns from Federal Open Market Committee meeting transcripts, demonstrating state-of-the-art performance at uncovering social dynamics in group discussions.

preprint · 2013 · arXiv

Ranking relations using analogies in biological and information networks

Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects $\mathbf{S}=\{A^{(1)}:B^{(1)},A^{(2)}:B^{(2)},\ldots,A^{(N)}:B^{(N)}\}$, measures how well other pairs $A:B$ fit in with the set $\mathbf{S}$. Our work addresses the following question: is the relation between objects $A$ and $B$ analogous to those relations found in $\mathbf{S}$? Such questions are particularly relevant in information retrieval, where an investigator might want to search for analogous pairs of objects that match the query set of interest. There are many ways in which objects can be related, making the task of measuring analogies very challenging. Our approach combines a similarity measure on function spaces with Bayesian analysis to produce a ranking. It requires data containing features of the objects of interest and a link matrix specifying which relationships exist; no further attributes of such relationships are necessary. We illustrate the potential of our method on text analysis and information networks. An application on discovering functional interactions between pairs of proteins is discussed in detail, where we show that our approach can work in practice even if a small set of protein pairs is provided.

preprint · 2012 · arXiv

Bayesian and L1 Approaches to Sparse Unsupervised Learning

The use of L1 regularisation for sparse learning has generated immense research interest, with successful application in such diverse areas as signal acquisition, image coding, genomics and collaborative filtering. While existing work highlights the many advantages of L1 methods, in this paper we find that L1 regularisation often dramatically underperforms in terms of predictive performance when compared with other methods for inferring sparsity. We focus on unsupervised latent variable models, and develop L1 minimising factor models, Bayesian variants of "L1", and Bayesian models with a stronger L0-like sparsity induced through spike-and-slab distributions. These spike-and-slab Bayesian factor models encourage sparsity while accounting for uncertainty in a principled manner and avoiding unnecessary shrinkage of non-zero values. We demonstrate on a number of data sets that in practice spike-and-slab Bayesian methods outperform L1 minimisation, even on a computational budget. We thus highlight the need to re-assess the wide use of L1 methods in sparsity-reliant applications, particularly when we care about generalising to previously unseen data, and provide an alternative that, over many varying conditions, provides improved generalisation performance.

preprint · 2010 · arXiv

Compact differences of composition operators

When $φ$ and $ψ$ are linear-fractional self-maps of the unit ball $B_N$ in ${\mathbb C}^N$, $N\geq 1$, we show that the difference $C_φ-C_ψ$ cannot be non-trivially compact on either the Hardy space $H^2(B_N)$ or any weighted Bergman space $A^2_α(B_N)$. Our arguments emphasize geometrical properties of the inducing maps $φ$ and $ψ$.
