Source author record

David Carlson

David Carlson appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, physics.app-ph, physics.optics, Applications, Genomics, Neurons and Cognition, physics.atom-ph

Catalog footprint

What is connected

9 works

7 topics

4 close collaborators


preprint, 2022, arXiv

AugmentedPCA: A Python Package of Supervised and Adversarial Linear Factor Models

Deep autoencoders are often extended with a supervised or adversarial loss to learn latent representations with desirable properties, such as greater predictivity of labels and outcomes or fairness with respect to a sensitive variable. Despite the ubiquity of supervised and adversarial deep latent factor models, these methods should demonstrate improvement over simpler linear approaches to be preferred in practice. This necessitates a reproducible linear analog that still adheres to an augmenting supervised or adversarial objective. We address this methodological gap by presenting methods that augment the principal component analysis (PCA) objective with either a supervised or an adversarial objective and provide analytic and reproducible solutions. We implement these methods in an open-source Python package, AugmentedPCA, that can produce excellent real-world baselines. We demonstrate the utility of these factor models on an open-source, RNA-seq cancer gene expression dataset, showing that augmenting with a supervised objective results in improved downstream classification performance, produces principal components with greater class fidelity, and facilitates identification of genes aligned with the principal axes of data variance with implications for the development of specific types of cancer.

preprint, 2022, arXiv

Multiple Domain Causal Networks

Observational studies are regarded as economic alternatives to randomized trials, often used in their stead to investigate and determine treatment efficacy. Due to lack of sample size, observational studies commonly combine data from multiple sources or different sites/centers. Despite the benefits of an increased sample size, a naive combination of multicenter data may result in incongruities stemming from center-specific protocols for generating cohorts or reactions towards treatments distinct to a given center, among other things. These issues arise in a variety of other contexts, including capturing a treatment effect related to an individual's unique biological characteristics. Existing methods for estimating heterogeneous treatment effects have not adequately addressed the multicenter context, but rather treat it simply as a means to obtain sufficient sample size. Additionally, previous approaches to estimating treatment effects do not straightforwardly generalize to the multicenter design, especially when required to provide treatment insights for patients from a new, unobserved center. To address these shortcomings, we propose Multiple Domain Causal Networks (MDCN), an approach that simultaneously strengthens the information sharing between similar centers while addressing the selection bias in treatment assignment through learning of a new feature embedding. In empirical evaluations, MDCN is consistently more accurate when estimating the heterogeneous treatment effect in new centers compared to benchmarks that adjust solely based on treatment imbalance or general center differences. Finally, we justify our approach by providing theoretical analyses that demonstrate that MDCN improves on the generalization bound of the new, unobserved target center.

preprint, 2022, arXiv

Supervising the Decoder of Variational Autoencoders to Improve Scientific Utility

Probabilistic generative models are attractive for scientific modeling because their inferred parameters can be used to generate hypotheses and design experiments. This requires that the learned model provide an accurate representation of the input data and yield a latent space that effectively predicts outcomes relevant to the scientific question. Supervised Variational Autoencoders (SVAEs) have previously been used for this purpose, where a carefully designed decoder can be used as an interpretable generative model while the supervised objective ensures a predictive latent representation. Unfortunately, the supervised objective forces the encoder to learn a biased approximation to the generative posterior distribution, which renders the generative parameters unreliable when used in scientific models. This issue has remained undetected as reconstruction losses commonly used to evaluate model performance do not detect bias in the encoder. We address this previously unreported issue by developing a second order supervision framework (SOS-VAE) that influences the decoder to induce a predictive latent representation. This ensures that the associated encoder maintains a reliable generative interpretation. We extend this technique to allow the user to trade off some bias in the generative parameters for improved predictive performance, acting as an intermediate option between SVAEs and our new SOS-VAE. We also use this methodology to address missing data issues that often arise when combining recordings from multiple scientific experiments. We demonstrate the effectiveness of these developments using synthetic data and electrophysiological recordings with an emphasis on how our learned representations can be used to design scientific experiments.

preprint, 2022, arXiv

Universal visible emitters in nanoscale integrated photonics

Visible wavelengths of light control the quantum matter of atoms and molecules and are foundational for quantum technologies, including computers, sensors, and clocks. The development of visible integrated photonics opens the possibility for scalable circuits with complex functionalities, advancing both the scientific and technological frontiers. We experimentally demonstrate an inverse design approach based on superposition of guided-mode sources, allowing the generation and full control of free-space radiation directly from within a single 150 nm layer Ta2O5, showing low loss across visible and near-infrared spectra. We generate diverging circularly-polarized beams at the challenging 461 nm wavelength that can be directly used for magneto-optical traps of strontium atoms, constituting a fundamental building block for a range of atomic-physics-based quantum technologies. Our generated topological vortex beams and spatially-varying polarization emitters could open unexplored light-matter interaction pathways, enabling a broad new photonic-atomic paradigm. Our platform highlights the generalizability of nanoscale devices for visible-laser emission and will be critical for scaling quantum technologies.

preprint, 2020, arXiv

Mid-infrared frequency combs at 10 GHz

We demonstrate mid-infrared (MIR) frequency combs at 10 GHz repetition rate via intra-pulse difference-frequency generation (DFG) in quasi-phase-matched nonlinear media. Few-cycle pump pulses ($\lesssim$15 fs, 100 pJ) from a near-infrared (NIR) electro-optic frequency comb are provided via nonlinear soliton-like compression in photonic-chip silicon-nitride waveguides. Subsequent intra-pulse DFG in periodically-poled lithium niobate waveguides yields MIR frequency combs in the 3.1--4.1 $\mu$m region, while orientation-patterned gallium phosphide provides coverage across 7--11 $\mu$m. Cascaded second-order nonlinearities simultaneously provide access to the carrier-envelope-offset frequency of the pump source via in-line f-2f nonlinear interferometry. The high-repetition rate MIR frequency combs introduced here can be used for condensed phase spectroscopy and applications such as laser heterodyne radiometry.

preprint, 2016, arXiv

Bridging the Gap between Stochastic Gradient MCMC and Stochastic Optimization

Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayesian analogs to popular stochastic optimization methods; however, this connection is not well studied. We explore this relationship by applying simulated annealing to an SG-MCMC algorithm. Furthermore, we extend recent SG-MCMC methods with two key components: i) adaptive preconditioners (as in Adagrad or RMSprop), and ii) adaptive element-wise momentum weights. The zero-temperature limit gives a novel stochastic optimization method with adaptive element-wise momentum weights, while conventional optimization methods only have a shared, static momentum weight. Under certain assumptions, our theoretical analysis suggests the proposed simulated annealing approach converges close to the global optima. Experiments on several deep neural network models show state-of-the-art results compared to related stochastic optimization algorithms.
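
The link between sampling and optimization described in this abstract can be illustrated with a minimal sketch, assuming a much simpler scheme than the paper's: stochastic gradient Langevin dynamics with an RMSprop-style diagonal preconditioner and a geometrically decaying temperature. The function name `annealed_psgld`, the toy quadratic objective, and all hyperparameter values are hypothetical. The sketch omits the adaptive element-wise momentum from the abstract and the curvature correction term that an exact preconditioned sampler would need, so it is not the authors' algorithm.

```python
import numpy as np

def annealed_psgld(grad_fn, theta0, n_steps=2000, step=1e-2, alpha=0.99,
                   eps=1e-3, temp0=1.0, decay=0.995, seed=0):
    """Toy annealed, preconditioned Langevin update (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    v = np.zeros_like(theta)   # running average of squared gradients (RMSprop-style)
    temp = temp0               # temperature; the update approaches plain RMSprop as temp -> 0
    for _ in range(n_steps):
        g = grad_fn(theta, rng)
        v = alpha * v + (1.0 - alpha) * g ** 2
        precond = 1.0 / (eps + np.sqrt(v))
        noise = rng.standard_normal(theta.shape)
        # Preconditioned gradient step plus temperature-scaled Gaussian noise.
        theta = theta - step * precond * g + np.sqrt(2.0 * step * temp * precond) * noise
        temp *= decay          # simulated annealing schedule (assumed geometric)
    return theta

# Usage on a hypothetical noisy quadratic with its minimum at `target`.
target = np.array([3.0, -2.0])

def noisy_grad(theta, rng):
    return (theta - target) + 0.5 * rng.standard_normal(theta.shape)

estimate = annealed_psgld(noisy_grad, np.zeros(2))
```

At high temperature the iterates explore like a sampler. As the temperature decays, the injected noise vanishes and only the preconditioned stochastic gradient step remains, which is the sense in which the zero-temperature limit yields an optimizer.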

preprint, 2016, arXiv

Neuroprosthetic decoder training as imitation learning

Neuroprosthetic brain-computer interfaces function via an algorithm which decodes neural activity of the user into movements of an end effector, such as a cursor or robotic arm. In practice, the decoder is often learned by updating its parameters while the user performs a task. When the user's intention is not directly observable, recent methods have demonstrated value in training the decoder against a surrogate for the user's intended movement. We describe how training a decoder in this way is a novel variant of an imitation learning problem, where an oracle or expert is employed for supervised training in lieu of direct observations, which are not available. Specifically, we describe how a generic imitation learning meta-algorithm, dataset aggregation (DAgger, [1]), can be adapted to train a generic brain-computer interface. By deriving existing learning algorithms for brain-computer interfaces in this framework, we provide a novel analysis of regret (an important metric of learning efficacy) for brain-computer interfaces. This analysis allows us to characterize the space of algorithmic variants and bounds on their regret rates. Existing approaches for decoder learning have been performed in the cursor control setting, but the available design principles for these decoders are such that it has been impossible to scale them to naturalistic settings. Leveraging our findings, we then offer an algorithm that combines imitation learning with optimal control, which should allow for training of arbitrary effectors for which optimal control can generate goal-oriented control. We demonstrate this novel and general BCI algorithm with simulated neuroprosthetic control of a 26 degree-of-freedom model of an arm, a sophisticated and realistic end effector.
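
The dataset aggregation loop mentioned in this abstract can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the paper's decoder: the one-dimensional "cursor", the linear policy, the oracle `expert_action` (standing in for a surrogate of the user's intent), and all function names are hypothetical, and the usual mixing between expert and learner rollouts is omitted.

```python
import numpy as np

def expert_action(state, goal):
    # Hypothetical oracle: move halfway toward the goal each step.
    return 0.5 * (goal - state)

def rollout(weights, rng, horizon=10):
    """Run the current linear policy and record the features it visits."""
    state, goal = rng.uniform(-1.0, 1.0, size=2)
    visited = []
    for _ in range(horizon):
        features = np.array([state, goal])
        visited.append(features)
        state = state + features @ weights   # learner's action, not the expert's
    return visited

def dagger(n_iters=5, episodes_per_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    weights = np.zeros(2)
    feats, labels = [], []                   # aggregated dataset across iterations
    for _ in range(n_iters):
        for _ in range(episodes_per_iter):
            for f in rollout(weights, rng):
                feats.append(f)
                labels.append(expert_action(f[0], f[1]))  # oracle labels visited states
        # Refit the policy on everything collected so far (least squares).
        weights, *_ = np.linalg.lstsq(np.array(feats), np.array(labels), rcond=None)
    return weights

learned = dagger()
```

The key design point is that states are gathered by running the learner's own policy while labels come from the oracle, so the training distribution matches the states the learner actually encounters.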

preprint, 2016, arXiv

Partition Functions from Rao-Blackwellized Tempered Sampling

Partition functions of probability distributions are important quantities for model evaluation and comparisons. We present a new method to compute partition functions of complex and multimodal distributions. Such distributions are often sampled using simulated tempering, which augments the target space with an auxiliary inverse temperature variable. Our method exploits the multinomial probability law of the inverse temperatures, and provides estimates of the partition function in terms of a simple quotient of Rao-Blackwellized marginal inverse temperature probability estimates, which are updated while sampling. We show that the method has interesting connections with several alternative popular methods, and offers some significant advantages. In particular, we empirically find that the new method provides more accurate estimates than Annealed Importance Sampling when calculating partition functions of large Restricted Boltzmann Machines (RBM); moreover, the method is sufficiently accurate to track training and validation log-likelihoods during learning of RBMs, at minimal computational cost.
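
The quotient estimator described in this abstract can be sketched on a toy problem where the answer is known. The sketch assumes a joint distribution q(x, k) proportional to r_k f_k(x) over states and temperature indices, so that the marginal q(k) is proportional to r_k Z_k and a ratio of partition functions follows from a ratio of marginal estimates. The Gaussian tempering path, the uniform weights r_k, the Gibbs sampler with exact conditionals, and the function name are all illustrative assumptions rather than the paper's setup.

```python
import numpy as np

def rb_tempered_log_z_ratio(betas, s0=1.0, s1=3.0, n_samples=20000, seed=0):
    """Estimate log(Z_last / Z_first) along a toy Gaussian tempering path."""
    rng = np.random.default_rng(seed)
    betas = np.asarray(betas, dtype=float)
    # f_k(x) = exp(-0.5 * prec[k] * x^2): geometric path between two Gaussians.
    prec = (1.0 - betas) / s0 ** 2 + betas / s1 ** 2
    log_r = np.full(len(betas), -np.log(len(betas)))   # uniform temperature weights (assumed)
    k = 0
    c_hat = np.zeros(len(betas))
    for _ in range(n_samples):
        # Gibbs step 1: sample x given the current temperature (exact here).
        x = rng.normal(0.0, 1.0 / np.sqrt(prec[k]))
        # Conditional q(k | x), proportional to r_k * f_k(x).
        logits = -0.5 * prec * x ** 2 + log_r
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        # Rao-Blackwellization: average q(k | x) instead of counting visits to k.
        c_hat += probs
        # Gibbs step 2: sample a new temperature index.
        k = rng.choice(len(betas), p=probs)
    c_hat /= n_samples
    # q(k) is proportional to r_k * Z_k, so Z_k is proportional to c_hat[k] / r_k.
    return (np.log(c_hat[-1]) - log_r[-1]) - (np.log(c_hat[0]) - log_r[0])

log_ratio = rb_tempered_log_z_ratio(np.linspace(0.0, 1.0, 5))
true_log_ratio = np.log(3.0)   # Z is proportional to the Gaussian scale: s1 / s0 = 3
```

Averaging the conditional probabilities uses every sample to update every temperature's estimate, which typically lowers variance compared with counting how often each temperature was visited.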

preprint, 2015, arXiv

Deep Temporal Sigmoid Belief Networks for Sequence Modeling

Deep dynamic generative models are developed to learn sequential dependencies in time-series data. The multi-layered model is designed by constructing a hierarchy of temporal sigmoid belief networks (TSBNs), defined as a sequential stack of sigmoid belief networks (SBNs). Each SBN has a contextual hidden state, inherited from the previous SBNs in the sequence, and is used to regulate its hidden bias. Scalable learning and inference algorithms are derived by introducing a recognition model that yields fast sampling from the variational posterior. This recognition model is trained jointly with the generative model, by maximizing its variational lower bound on the log-likelihood. Experimental results on bouncing balls, polyphonic music, motion capture, and text streams show that the proposed approach achieves state-of-the-art predictive performance, and has the capacity to synthesize various sequences.
