Source author record

Andrew Jones

Andrew Jones appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

astro-ph.SR, Machine Learning, astro-ph.IM, Methodology, astro-ph.CO, Genomics, physics.chem-ph, physics.geo-ph, physics.ins-det, physics.space-ph, Quantitative Methods

Catalog footprint

What is connected

10 works

11 topics

4 close collaborators

Preprint, 2024, arXiv

Contrastive linear regression

Contrastive dimension reduction methods have been developed for case-control study data to identify variation that is enriched in the foreground (case) data X relative to the background (control) data Y. Here, we develop contrastive regression for the setting when there is a response variable r associated with each foreground observation. This situation occurs frequently when, for example, the unaffected controls do not have a disease grade or intervention dosage but the affected cases have a disease grade or intervention dosage, as in autism severity, solid tumor stages, polyp sizes, or warfarin dosages. Our contrastive regression model captures shared low-dimensional variation between the predictors in the case and control groups, and then explains the case-specific response variables through the variance that remains in the predictors after shared variation is removed. We show that, in one single-nucleus RNA sequencing dataset on autism severity in postmortem brain samples from donors with and without autism and in another single-cell RNA sequencing dataset on cellular differentiation in chronic rhinosinusitis with and without nasal polyps, our contrastive linear regression performs feature ranking and identifies biologically informative predictors associated with the response that cannot be identified using other approaches.
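
The two-stage idea in this abstract (remove variation shared with the controls, then regress the response on what remains) can be illustrated with a deliberately crude stand-in. The sketch below is not the preprint's model: it swaps in principal components of the background data as the "shared" directions and ordinary least squares for the regression step. The function names, the choice of k, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def remove_shared_variation(X, Y, k):
    """Project centred foreground X off the top-k principal directions of background Y.

    PCA on the controls is an assumed stand-in for the shared low-dimensional
    variation; the preprint fits a joint model instead.
    """
    Yc = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    W = Vt[:k].T                          # p x k, orthonormal columns
    Xc = X - X.mean(axis=0)
    return Xc - Xc @ W @ W.T, W

def fit_response(X_resid, r):
    """Least-squares coefficients of the centred response on the residual predictors.

    The residual matrix is rank-deficient by construction, so tiny singular
    values are truncated via rcond.
    """
    coef, *_ = np.linalg.lstsq(X_resid, r - r.mean(), rcond=1e-10)
    return coef

# Synthetic data (hypothetical): one strong shared direction, one case-specific signal.
rng = np.random.default_rng(0)
p = 5
shared = np.zeros(p)
shared[0] = 1.0
Y = (rng.normal(size=(200, 1)) * 5.0) @ shared[None, :] + rng.normal(size=(200, p)) * 0.1
signal = rng.normal(size=100)
X = (rng.normal(size=(100, 1)) * 5.0) @ shared[None, :] + rng.normal(size=(100, p)) * 0.1
X[:, 3] += signal                         # case-specific variation in feature 3
r = 2.0 * signal + rng.normal(size=100) * 0.05

X_resid, W = remove_shared_variation(X, Y, k=1)
coef = fit_response(X_resid, r)
ranking = np.argsort(-np.abs(coef))       # features ordered by |coefficient|
```

Ranking features by absolute coefficient is only one possible reading of "feature ranking"; the preprint may rank predictors differently.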

Preprint, 2023, arXiv

Spectral embedding of weighted graphs

When analyzing weighted networks using spectral embedding, a judicious transformation of the edge weights may produce better results. To formalize this idea, we consider the asymptotic behavior of spectral embedding for different edge-weight representations, under a generic low rank model. We measure the quality of different embeddings -- which can be on entirely different scales -- by how easy it is to distinguish communities, in an information-theoretic sense. For common types of weighted graphs, such as count networks or p-value networks, we find that transformations such as tempering or thresholding can be highly beneficial, both in theory and in practice.
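
As a rough illustration of what "transforming the edge weights before embedding" can look like, the sketch below embeds a small count network three ways: raw, tempered, and thresholded. It assumes that tempering means an entrywise power and that thresholding means binarising at a cutoff; the helper names, the exponent 0.5, the cutoff 2.0, and the two-community Poisson graph are all hypothetical choices, not values from the preprint.

```python
import numpy as np

def spectral_embed(A, d):
    """Embed a symmetric weight matrix into d dimensions.

    Uses the d eigenvalues of largest magnitude and scales each
    eigenvector by the square root of the eigenvalue's absolute value.
    """
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

def temper(A, alpha):
    """Entrywise power transform for non-negative weights (hypothetical helper)."""
    return np.power(A, alpha)

def threshold(A, tau):
    """Binarise: 1.0 where the weight exceeds tau, else 0.0 (hypothetical helper)."""
    return (A > tau).astype(float)

# Synthetic two-community count network (assumed for illustration).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
rates = np.array([[5.0, 1.0], [1.0, 5.0]])
upper = np.triu(rng.poisson(rates[labels][:, labels]), k=1)
A = (upper + upper.T).astype(float)

X_raw = spectral_embed(A, 2)
X_tempered = spectral_embed(temper(A, 0.5), 2)
X_thresholded = spectral_embed(threshold(A, 2.0), 2)
```

The three embeddings live on different scales, which is why the abstract compares them by how separable the communities are rather than by raw distances.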

Preprint, 2022, arXiv

Spectral embedding for dynamic networks with stability guarantees

We consider the problem of embedding a dynamic network, to obtain time-evolving vector representations of each node, which can then be used to describe changes in behaviour of individual nodes, communities, or the entire graph. Given this open-ended remit, we argue that two types of stability in the spatio-temporal positioning of nodes are desirable: to assign the same position, up to noise, to nodes behaving similarly at a given time (cross-sectional stability) and a constant position, up to noise, to a single node behaving similarly across different times (longitudinal stability). Similarity in behaviour is defined formally using notions of exchangeability under a dynamic latent position network model. By showing how this model can be recast as a multilayer random dot product graph, we demonstrate that unfolded adjacency spectral embedding satisfies both stability conditions. We also show how two alternative methods, omnibus and independent spectral embedding, alternately lack one or the other form of stability.
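
A minimal sketch of unfolded adjacency spectral embedding, assuming the usual construction: the per-time adjacency matrices are concatenated column-wise, a truncated SVD is taken, and the scaled right singular vectors are split into one block per time point. The function name, the return layout, and the toy graphs are assumptions for illustration, not code from the preprint.

```python
import numpy as np

def uase(adjacencies, d):
    """Unfolded adjacency spectral embedding of T graphs on the same n nodes.

    Returns an (n, d) anchor embedding and a (T, n, d) array holding one
    embedding of every node per time point.
    """
    n = adjacencies[0].shape[0]
    unfolded = np.hstack(adjacencies)                 # n x (n * T)
    U, s, Vt = np.linalg.svd(unfolded, full_matrices=False)
    scale = np.sqrt(s[:d])
    anchor = U[:, :d] * scale
    dynamic = (Vt[:d].T * scale).reshape(len(adjacencies), n, d)
    return anchor, dynamic

rng = np.random.default_rng(1)

def random_graph(n, p):
    """Undirected random graph without self-loops (toy data)."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

B = random_graph(30, 0.3)
C = random_graph(30, 0.3)

# The first two time points share the same graph, the third differs.
anchor, dynamic = uase([B, B, C], d=2)
```

Because all time points are embedded through one shared decomposition, repeating a graph at two times gives matching positions for every node at those times, which is a noise-free caricature of the longitudinal stability described above.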

Preprint, 2021, arXiv

Contrastive latent variable modeling with application to case-control sequencing experiments

High-throughput RNA-sequencing (RNA-seq) technologies are powerful tools for understanding cellular state. Often it is of interest to quantify and summarize changes in cell state that occur between experimental or biological conditions. Differential expression is typically assessed using univariate tests to measure gene-wise shifts in expression. However, these methods largely ignore changes in transcriptional correlation. Furthermore, there is a need to identify the low-dimensional structure of the gene expression shift to identify collections of genes that change between conditions. Here, we propose contrastive latent variable models designed for count data to create a richer portrait of differential expression in sequencing data. These models disentangle the sources of transcriptional variation in different conditions, in the context of an explicit model of variation at baseline. Moreover, we develop a model-based hypothesis testing framework that can test for global and gene subset-specific changes in expression. We test our model through extensive simulations and analyses with count-based gene expression data from perturbation and observational sequencing experiments. We find that our methods can effectively summarize and quantify complex transcriptional changes in case-control experimental sequencing data.

Preprint, 2021, arXiv

The multilayer random dot product graph

We present a comprehensive extension of the latent position network model known as the random dot product graph to accommodate multiple graphs -- both undirected and directed -- which share a common subset of nodes, and propose a method for jointly embedding the associated adjacency matrices, or submatrices thereof, into a suitable latent space. Theoretical results concerning the asymptotic behaviour of the node representations thus obtained are established, showing that after the application of a linear transformation these converge uniformly in the Euclidean norm to the latent positions with Gaussian error. Within this framework, we present a generalisation of the stochastic block model to a number of different multiple graph settings, and demonstrate the effectiveness of our joint embedding method through several statistical inference tasks in which we achieve comparable or better results than rival spectral methods. Empirical improvements in link prediction over single graph embeddings are exhibited in a cyber-security example.

Preprint, 2020, arXiv

CME Acceleration as a Probe of the Coronal Magnetic Field

By 2050, we expect that CME models will accurately describe, and ideally predict, observed solar eruptions and the propagation of the CMEs through the corona. We describe some of the present known unknowns in observations and models that would need to be addressed in order to reach this goal. We also describe how we might prepare for some of the unknown unknowns that will surely become challenges.

Preprint, 2015, arXiv

Miniature X-Ray Solar Spectrometer (MinXSS) - A Science-Oriented, University 3U CubeSat

The Miniature X-ray Solar Spectrometer (MinXSS) is a 3-Unit (3U) CubeSat developed at the Laboratory for Atmospheric and Space Physics (LASP) at the University of Colorado, Boulder (CU). Over 40 students contributed to the project with professional mentorship and technical contributions from professors in the Aerospace Engineering Sciences Department at CU and from LASP scientists and engineers. The scientific objective of MinXSS is to study processes in the dynamic Sun, from quiet-Sun to solar flares, and to further understand how these changes in the Sun influence the Earth's atmosphere by providing unique spectral measurements of solar soft x-rays (SXRs). The enabling technology providing the advanced solar SXR spectral measurements is the Amptek X123, a commercial-off-the-shelf (COTS) silicon drift detector (SDD). The Amptek X123 has a low mass (~324 g after modification), modest power consumption (~2.50 W), and small volume (6.86 cm x 9.91 cm x 2.54 cm), making it ideal for a CubeSat. This paper provides an overview of the MinXSS mission: the science objectives, project history, subsystems, and lessons learned that can be useful for the small-satellite community.

Preprint, 2014, arXiv

The Adaptive Buffered Force QM/MM method in the CP2K and AMBER software packages

The implementation and validation of the adaptive buffered force QM/MM method in two popular packages, CP2K and AMBER, are presented. The implementations build on the existing QM/MM functionality in each code, extending it to allow for redefinition of the QM and MM regions during the simulation and reducing QM-MM interface errors by discarding forces near the boundary according to the buffered force-mixing approach. New adaptive thermostats, needed by force-mixing methods, are also implemented. Different variants of the method are benchmarked by simulating the structure of bulk water, water autoprotolysis in the presence of zinc and dimethyl-phosphate hydrolysis using various semiempirical Hamiltonians and density functional theory as the QM model. It is shown that with suitable parameters, based on force convergence tests, the adaptive buffered-force QM/MM scheme can provide an accurate approximation of the structure in the dynamical QM region matching the corresponding fully QM simulations, as well as reproducing the correct energetics in all cases. Adaptive unbuffered force-mixing and adaptive conventional QM/MM methods also provide reasonable results for some systems, but are more likely to suffer from instabilities and inaccuracies.
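
The force-selection step of buffered force mixing can be sketched in a few lines. This is an assumed, simplified picture and not the CP2K or AMBER implementation: atoms are labelled by distance from a centre point, QM forces are kept only for core atoms, and QM forces on buffer atoms are discarded in favour of MM forces. The radii, labels, function names, and force values are all hypothetical.

```python
import math

def classify_atoms(positions, centre, r_core, r_buffer):
    """Label each atom 'core', 'buffer' or 'mm' by distance from a centre point.

    Purely distance-based selection is an assumption of this sketch.
    """
    labels = []
    for pos in positions:
        dist = math.dist(pos, centre)
        if dist <= r_core:
            labels.append("core")
        elif dist <= r_buffer:
            labels.append("buffer")
        else:
            labels.append("mm")
    return labels

def mix_forces(labels, qm_forces, mm_forces):
    """Keep QM forces on core atoms only; use MM forces everywhere else.

    qm_forces[i] is None for atoms outside the enlarged QM calculation.
    Buffer atoms take part in that calculation, but their QM forces are
    discarded because they sit near the QM-MM boundary.
    """
    return [qm if label == "core" else mm
            for label, qm, mm in zip(labels, qm_forces, mm_forces)]

# Toy one-dimensional forces for three atoms (made-up numbers).
positions = [(0.5, 0.0, 0.0), (3.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
labels = classify_atoms(positions, centre=(0.0, 0.0, 0.0), r_core=2.0, r_buffer=5.0)
mixed = mix_forces(labels, qm_forces=[1.0, 2.0, None], mm_forces=[1.5, 2.5, 3.5])
```

In an adaptive scheme the labels would be recomputed as atoms move during the simulation, which is the "redefinition of the QM and MM regions" the abstract refers to.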

Preprint, 2010, arXiv

The Atacama Cosmology Telescope: Cosmology from Galaxy Clusters Detected via the Sunyaev-Zel'dovich Effect

We present constraints on cosmological parameters based on a sample of Sunyaev-Zel'dovich-selected galaxy clusters detected in a millimeter-wave survey by the Atacama Cosmology Telescope. The cluster sample used in this analysis consists of 9 optically-confirmed high-mass clusters comprising the high-significance end of the total cluster sample identified in 455 square degrees of sky surveyed during 2008 at 148 GHz. We focus on the most massive systems to reduce the degeneracy between unknown cluster astrophysics and cosmology derived from SZ surveys. We describe the scaling relation between cluster mass and SZ signal with a 4-parameter fit. Marginalizing over the values of the parameters in this fit with conservative priors gives sigma_8 = 0.851 +/- 0.115 and w = -1.14 +/- 0.35 for a spatially-flat wCDM cosmological model with WMAP 7-year priors on cosmological parameters. This gives a modest improvement in statistical uncertainty over WMAP 7-year constraints alone. Fixing the scaling relation between cluster mass and SZ signal to a fiducial relation obtained from numerical simulations and calibrated by X-ray observations, we find sigma_8 = 0.821 +/- 0.044 and w = -1.05 +/- 0.20. These results are consistent with constraints from WMAP 7 plus baryon acoustic oscillations plus type Ia supernovae, which give sigma_8 = 0.802 +/- 0.038 and w = -0.98 +/- 0.053. A stacking analysis of the clusters in this sample compared to clusters simulated assuming the fiducial model also shows good agreement. These results suggest that, given the sample of clusters used here, both the astrophysics of massive clusters and the cosmological parameters derived from them are broadly consistent with current models.
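
The "consistent with" statement in this abstract can be checked with back-of-envelope arithmetic on the quoted numbers. The sketch below treats the two sets of constraints as independent Gaussian measurements, which is an assumption and only a rough one, since both analyses use WMAP 7-year information. The helper name is hypothetical.

```python
import math

def tension(value_a, err_a, value_b, err_b):
    """Difference between two measurements in units of their combined 1-sigma error.

    Assumes independent Gaussian uncertainties (a simplification here).
    """
    return abs(value_a - value_b) / math.hypot(err_a, err_b)

# Fixed-scaling-relation cluster result vs WMAP 7 + BAO + SNe, as quoted in the abstract.
sigma8_tension = tension(0.821, 0.044, 0.802, 0.038)
w_tension = tension(-1.05, 0.20, -0.98, 0.053)

print(f"sigma_8 differs by {sigma8_tension:.2f} combined sigma")
print(f"w differs by {w_tension:.2f} combined sigma")
```

Under that simplification both differences are well below one combined standard deviation, in line with the abstract's wording.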

Preprint, 2009, arXiv

EUV SpectroPhotometer (ESP) in Extreme Ultraviolet Variability Experiment (EVE): Algorithms and Calibrations

The Extreme ultraviolet SpectroPhotometer (ESP) is one of five channels of the Extreme ultraviolet Variability Experiment (EVE) onboard the NASA Solar Dynamics Observatory (SDO). The ESP channel design is based on a highly stable diffraction transmission grating and is an advanced version of the Solar Extreme ultraviolet Monitor (SEM), which has been successfully observing solar irradiance onboard the Solar and Heliospheric Observatory (SOHO) since December 1995. ESP is designed to measure solar Extreme UltraViolet (EUV) irradiance in four first order bands of the diffraction grating centered around 19 nm, 25 nm, 30 nm, and 36 nm, and in a soft X-ray band from 0.1 to 7.0 nm in the zeroth order of the grating. Each band's detector system converts the photo-current into a count rate (frequency). The count rates are integrated over 0.25 s increments and transmitted to the EVE Science and Operations Center for data processing. An algorithm for converting the measured count rates into solar irradiance and the ESP calibration parameters are described. The ESP pre-flight calibration was performed at the Synchrotron Ultraviolet Radiation Facility of the National Institute of Standards and Technology. Calibration parameters were used to calculate absolute solar irradiance from the Sounding Rocket flight measurements on 14 April 2008. These irradiances for the ESP bands closely match the irradiance determined for two other EUV channels flown simultaneously, EVE's Multiple EUV Grating Spectrograph (MEGS) and SOHO's Charge, Element and Isotope Analysis System / Solar EUV Monitor (CELIAS/SEM).