Source author record

Austin Talbot

Austin Talbot appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, Genomics, Applications, Methodology, Neurons and Cognition

Catalog footprint

What is connected

4 works

5 topics

4 close collaborators


Preprint, 2026, arXiv

Detecting Batch Heterogeneity via Likelihood Clustering

Batch effects represent a major confounder in genomic diagnostics. In copy number variant (CNV) detection from NGS, many algorithms compare read depth between test samples and a reference sample, assuming the two are process-matched. When this assumption is violated, with causes ranging from reagent lot changes to multi-site processing, the reference becomes inappropriate, introducing false CNV calls or masking true pathogenic variants. Detecting such heterogeneity before downstream analysis is critical for reliable clinical interpretation. Existing batch effect detection methods either cluster samples based on raw features, risking conflation of biological signal with technical variation, or require known batch labels that are frequently unavailable. We introduce a method that addresses both limitations by clustering samples according to their Bayesian model evidence. The central insight is that evidence quantifies compatibility between data and model assumptions: technical artifacts violate those assumptions and reduce evidence, whereas biological variation, including CNV status, is anticipated by the model and yields high evidence. This asymmetry provides a discriminative signal that separates batch effects from biology. We formalize heterogeneity detection as a likelihood ratio test for mixture structure in evidence space, using parametric bootstrap calibration to ensure conservative false positive rates. We validate our approach on synthetic data demonstrating proper Type I error control, on three clinical targeted sequencing panels (liquid biopsy, BRCA, and thalassemia) exhibiting distinct batch effect mechanisms, and on mouse electrophysiology recordings demonstrating cross-modality generalization. Our method achieves superior clustering accuracy compared to standard correlation-based and dimensionality-reduction approaches while maintaining the conservativeness required for clinical use.
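The likelihood ratio test with parametric bootstrap calibration described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each sample has already been reduced to one scalar log-evidence value, takes a single Gaussian as the null model and a two-component Gaussian mixture as the alternative, and uses hypothetical function names (`loglik_one`, `loglik_two`, `bootstrap_lrt`) chosen only for illustration.

```python
import numpy as np


def loglik_one(x):
    """Maximized log-likelihood of a single Gaussian (null: one batch)."""
    mu, sd = x.mean(), x.std() + 1e-3
    return float(np.sum(-0.5 * np.log(2 * np.pi * sd**2)
                        - 0.5 * ((x - mu) / sd) ** 2))


def _mixture_logpdf(x, w, mu, sd):
    """Per-component weighted log-densities and their per-sample log-sum."""
    comp = (np.log(w) - 0.5 * np.log(2 * np.pi * sd**2)
            - 0.5 * ((x[:, None] - mu) / sd) ** 2)
    return comp, np.logaddexp(comp[:, 0], comp[:, 1])


def loglik_two(x, n_iter=100):
    """Log-likelihood of a two-component 1-D Gaussian mixture fitted by EM.

    Assumes x contains at least two distinct values. The small floor added
    to each standard deviation is an illustrative guard against degenerate,
    zero-variance components.
    """
    med = np.median(x)
    lo, hi = x[x <= med], x[x > med]  # deterministic median-split start
    mu = np.array([lo.mean(), hi.mean()])
    sd = np.array([lo.std(), hi.std()]) + 1e-3
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        comp, total = _mixture_logpdf(x, w, mu, sd)
        resp = np.exp(comp - total[:, None])        # E-step: responsibilities
        nk = resp.sum(axis=0) + 1e-12               # M-step: update parameters
        w = nk / nk.sum()
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-3
    return float(np.sum(_mixture_logpdf(x, w, mu, sd)[1]))


def bootstrap_lrt(x, n_boot=199, seed=0):
    """Likelihood ratio statistic and parametric-bootstrap p-value."""
    rng = np.random.default_rng(seed)
    stat = 2.0 * (loglik_two(x) - loglik_one(x))
    mu, sd = x.mean(), x.std()
    # Simulate from the fitted null model and recompute the statistic.
    null_stats = np.array([
        2.0 * (loglik_two(sim) - loglik_one(sim))
        for sim in (rng.normal(mu, sd, size=x.size) for _ in range(n_boot))
    ])
    p_value = (1 + np.sum(null_stats >= stat)) / (n_boot + 1)
    return stat, float(p_value)
```

Calling `bootstrap_lrt(log_evidence)` on an array of per-sample values returns the statistic and a p-value; a small p-value suggests more than one group in evidence space. Adding one to both numerator and denominator keeps the p-value strictly positive, which errs on the conservative side.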

Preprint, 2022, arXiv

AugmentedPCA: A Python Package of Supervised and Adversarial Linear Factor Models

Deep autoencoders are often extended with a supervised or adversarial loss to learn latent representations with desirable properties, such as greater predictivity of labels and outcomes or fairness with respect to a sensitive variable. Despite the ubiquity of supervised and adversarial deep latent factor models, these methods should demonstrate improvement over simpler linear approaches to be preferred in practice. This necessitates a reproducible linear analog that still adheres to an augmenting supervised or adversarial objective. We address this methodological gap by presenting methods that augment the principal component analysis (PCA) objective with either a supervised or an adversarial objective and provide analytic and reproducible solutions. We implement these methods in an open-source Python package, AugmentedPCA, that can produce excellent real-world baselines. We demonstrate the utility of these factor models on an open-source RNA-seq cancer gene expression dataset, showing that augmenting with a supervised objective results in improved downstream classification performance, produces principal components with greater class fidelity, and facilitates identification of genes aligned with the principal axes of data variance, with implications for the development of specific types of cancer.
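One simple way to see what augmenting PCA with a supervised objective can mean is to reconstruct the features and the labels jointly. The sketch below is an illustration only and does not use or reproduce the AugmentedPCA package's API. It assumes a joint squared-error objective, solved by an SVD of the concatenated matrix `[X, sqrt(mu) * Y]`; the function name `supervised_linear_factors` and the weight `mu` are hypothetical.

```python
import numpy as np


def supervised_linear_factors(X, Y, n_components, mu=1.0):
    """Linear factors that reconstruct X and Y jointly (requires mu > 0).

    Minimizes ||Xc - Z W^T||^2 + mu * ||Yc - Z D^T||^2 over rank-k factors,
    where Xc and Yc are the column-centered inputs.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Scaling Y by sqrt(mu) turns the weighted objective into ordinary PCA.
    joint = np.hstack([Xc, np.sqrt(mu) * Yc])
    U, s, Vt = np.linalg.svd(joint, full_matrices=False)
    Z = U[:, :n_components] * s[:n_components]   # factor scores, (n, k)
    V = Vt[:n_components].T                      # joint loadings, (p + q, k)
    W = V[: X.shape[1]]                          # loadings on X, (p, k)
    D = V[X.shape[1]:] / np.sqrt(mu)             # loadings on Y, (q, k)
    return Z, W, D
```

Larger values of `mu` pull the leading factors toward directions that also explain `Y`. Note that the scores `Z` in this sketch are computed from both `X` and `Y`, so a separate encoder from `X` alone would be needed to apply the factors to unlabeled data.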

Preprint, 2022, arXiv

Personalized rTMS for Depression: A Review

Personalized treatments are gaining momentum across all fields of medicine. Precision medicine can be applied to neuromodulatory techniques, where focused brain stimulation treatments such as repetitive transcranial magnetic stimulation (rTMS) are used to modulate brain circuits and alleviate clinical symptoms. rTMS is well-tolerated and clinically effective for treatment-resistant depression (TRD) and other neuropsychiatric disorders. However, despite its wide stimulation parameter space (location, angle, pattern, frequency, and intensity can be adjusted), rTMS is currently applied in a one-size-fits-all manner, potentially contributing to its suboptimal clinical response (~50%). In this review, we examine components of rTMS that can be optimized to account for inter-individual variability in neural function and anatomy. We discuss current treatment options for TRD, the neural mechanisms thought to underlie treatment, differences in FDA-cleared devices, targeting strategies, stimulation parameter selection, and adaptive closed-loop rTMS to improve treatment outcomes. We suggest that better understanding of the wide and modifiable parameter space of rTMS will greatly improve clinical outcome.

Preprint, 2022, arXiv

Supervising the Decoder of Variational Autoencoders to Improve Scientific Utility

Probabilistic generative models are attractive for scientific modeling because their inferred parameters can be used to generate hypotheses and design experiments. This requires that the learned model provide an accurate representation of the input data and yield a latent space that effectively predicts outcomes relevant to the scientific question. Supervised Variational Autoencoders (SVAEs) have previously been used for this purpose, where a carefully designed decoder can be used as an interpretable generative model while the supervised objective ensures a predictive latent representation. Unfortunately, the supervised objective forces the encoder to learn a biased approximation to the generative posterior distribution, which renders the generative parameters unreliable when used in scientific models. This issue has remained undetected because the reconstruction losses commonly used to evaluate model performance do not detect bias in the encoder. We address this previously unreported issue by developing a second-order supervision framework (SOS-VAE) that influences the decoder to induce a predictive latent representation. This ensures that the associated encoder maintains a reliable generative interpretation. We extend this technique to allow the user to trade off some bias in the generative parameters for improved predictive performance, acting as an intermediate option between SVAEs and our new SOS-VAE. We also use this methodology to address missing data issues that often arise when combining recordings from multiple scientific experiments. We demonstrate the effectiveness of these developments using synthetic data and electrophysiological recordings, with an emphasis on how our learned representations can be used to design scientific experiments.
