Researcher profile

Andrew J. Connolly

Andrew J. Connolly contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
10 works
0 followers
10 topics
4 close collaborators

Published work

10 published items

preprint | 2026 | arXiv

Hyrax: An Extensible Framework for Rapid ML Experimentation and Unsupervised Discovery in the Era of Rubin, Roman, and Euclid

The NSF-DOE Vera C. Rubin Observatory, Roman Space Telescope, Euclid, and other next-generation surveys will deliver imaging, spectroscopic, and time-domain data at scales that increasingly shift the bottleneck in astronomical machine learning (ML) projects from model design to infrastructure. We present Hyrax, an open-source, modular, GPU-enabled Python framework that supports the full ML lifecycle in astronomy: from data acquisition and training to inference and experiment comparison, with capabilities including multimodal dataset support, integrated vector databases for similarity search, and interactive two- and three-dimensional latent-space exploration for unsupervised discovery. We demonstrate Hyrax's versatility through five representative applications on real survey data: (i) unsupervised representation learning on $\sim 4\times10^5$ Rubin Legacy Survey of Space and Time (LSST) Data Preview 1 (DP1) galaxies, surfacing new merger and low-surface-brightness candidates missing from reference Euclid and Dark Energy Survey catalogs, while also isolating imaging artifacts -- all without labeled training data; (ii) hybrid density-based clustering for identifying cluster-scale gravitational lens candidates in DP1 data; (iii) multimodal early-time transient classification in the Zwicky Transient Facility leveraging light curves, spectra, images, and metadata; (iv) supervised false-positive filtering in shift-and-stack searches for distant solar system objects in the Dark Energy Camera Ecliptic Exploration Project survey; and (v) supervised detection of semi-resolved dwarf galaxies in Hyper Suprime-Cam and LSST-like imaging using synthetic source injection. Together, these results demonstrate that Hyrax provides astronomy-specific ML infrastructure that enables systematic discovery and rapid methodological iteration across next-generation astronomical surveys.

preprint | 2026 | arXiv

You Only Stack Once (YOSO): A Motion-Filtered, Deep-Learning Framework for Detecting Faint Moving Sources

We present You Only Stack Once (YOSO), an automated pipeline designed to detect faint, slow-moving Solar System objects in wide-field astronomical surveys. The pipeline integrates a novel Gaussian Motion Filter (GMoF) that operates at the pixel level to enhance signal-to-noise for objects exhibiting a range of apparent rates of motion. Unlike conventional shift-and-stack methods, which rely on discrete velocity trials, GMoF amplifies trails while suppressing random noise and static background features. Applied to a subset of DEEP observations from the Dark Energy Camera, YOSO recovered 45 out of 73 previously detected objects, as well as 11 new TNOs. It also discovered 216 objects in the near Solar System. Although alternative shift-and-stack methods are sensitive to objects about 0.88 magnitudes fainter, YOSO's false positive rate is extremely low, since it detects only sources that exhibit a trail and are consistent with a point source when shifted at the right rate. We show how this method can be deployed on large surveys like LSST, and adapted for other domains that require motion-based signal enhancement, including exoplanet imaging through Angular Differential Imaging (ADI), and near-Earth object (NEO) detection for missions like NEO Surveyor. YOSO thus provides a versatile, scalable approach for extracting faint, motion-dependent signals in the era of data-intensive astronomy.
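The conventional shift-and-stack approach that the abstract contrasts with YOSO can be illustrated with a toy sketch. This is not YOSO, its Gaussian Motion Filter, or any survey pipeline; the function names (`make_frames`, `shift_and_stack`, `best_trial_velocity`), the integer pixel velocities, and the synthetic noise model are all assumptions made for illustration.

```python
import random

def make_frames(n_frames, size, start, velocity, amplitude, noise_sigma, seed=0):
    """Simulate noisy frames containing one point source moving at constant velocity.

    Positions and velocities are (row, column) in whole pixels per frame (an assumption).
    """
    rng = random.Random(seed)
    frames = []
    for t in range(n_frames):
        frame = [[rng.gauss(0.0, noise_sigma) for _ in range(size)] for _ in range(size)]
        y = start[0] + velocity[0] * t
        x = start[1] + velocity[1] * t
        frame[y][x] += amplitude
        frames.append(frame)
    return frames

def shift_and_stack(frames, velocity):
    """Co-add frames after undoing the motion implied by one trial velocity."""
    size = len(frames[0])
    stack = [[0.0] * size for _ in range(size)]
    for t, frame in enumerate(frames):
        dy, dx = velocity[0] * t, velocity[1] * t
        for y in range(size):
            sy = y + dy
            if not 0 <= sy < size:
                continue  # shifted row falls outside the frame
            for x in range(size):
                sx = x + dx
                if 0 <= sx < size:
                    stack[y][x] += frame[sy][sx]
    return stack

def best_trial_velocity(frames, trial_velocities):
    """Return the trial velocity whose stacked image has the brightest pixel."""
    def peak(v):
        return max(max(row) for row in shift_and_stack(frames, v))
    return max(trial_velocities, key=peak)

TRUE_VELOCITY = (1, 2)
frames = make_frames(n_frames=10, size=32, start=(4, 3),
                     velocity=TRUE_VELOCITY, amplitude=5.0, noise_sigma=1.0)
# Discrete grid of trial velocities, the step the abstract describes as conventional.
trials = [(vy, vx) for vy in range(-2, 3) for vx in range(-2, 3)]
recovered = best_trial_velocity(frames, trials)
print("recovered velocity:", recovered)
```

Only the trial that matches the source's motion adds its flux coherently in one pixel, which is why the search has to step through a grid of discrete velocities.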

preprint | 2022 | arXiv

From Data to Software to Science with the Rubin Observatory LSST

The Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) dataset will dramatically alter our understanding of the Universe, from the origins of the Solar System to the nature of dark matter and dark energy. Much of this research will depend on the existence of robust, tested, and scalable algorithms, software, and services. Identifying and developing such tools ahead of time has the potential to significantly accelerate the delivery of early science from LSST. Developing these collaboratively, and making them broadly available, can enable more inclusive and equitable collaboration on LSST science. To facilitate such opportunities, a community workshop entitled "From Data to Software to Science with the Rubin Observatory LSST" was organized by the LSST Interdisciplinary Network for Collaboration and Computing (LINCC) and partners, and held at the Flatiron Institute in New York, March 28-30th 2022. The workshop included over 50 in-person attendees invited from over 300 applications. It identified seven key software areas of need: (i) scalable cross-matching and distributed joining of catalogs, (ii) robust photometric redshift determination, (iii) software for determination of selection functions, (iv) frameworks for scalable time-series analyses, (v) services for image access and reprocessing at scale, (vi) object image access (cutouts) and analysis at scale, and (vii) scalable job execution systems. This white paper summarizes the discussions of this workshop. It considers the motivating science use cases, identified cross-cutting algorithms, software, and services, their high-level technical specifications, and the principles of inclusive collaborations needed to develop them. We provide it as a useful roadmap of needs, as well as to spur action and collaboration between groups and individuals looking to develop reusable software for early LSST science.

preprint | 2022 | arXiv

Machine Learning and Cosmology

Methods based on machine learning have recently made substantial inroads in many corners of cosmology. Through this process, new computational tools, new perspectives on data collection, model development, analysis, and discovery, as well as new communities and educational pathways have emerged. Despite rapid progress, substantial potential at the intersection of cosmology and machine learning remains untapped. In this white paper, we summarize current and ongoing developments relating to the application of machine learning within cosmology and provide a set of recommendations aimed at maximizing the scientific impact of these burgeoning tools over the coming decade through both technical development as well as the fostering of emerging communities.

preprint | 2022 | arXiv

MUSSES2020J: The Earliest Discovery of a Fast Blue Ultraluminous Transient at Redshift 1.063

In this Letter, we report the discovery of an ultraluminous fast-evolving transient in rest-frame UV wavelengths, MUSSES2020J, soon after its occurrence by using the Hyper Suprime-Cam (HSC) mounted on the 8.2 m Subaru telescope. The rise time of about 5 days with an extremely high UV peak luminosity shares similarities to a handful of fast blue optical transients whose peak luminosities are comparable with the most luminous supernovae while their timescales are significantly shorter (hereafter "fast blue ultraluminous transient," FBUT). In addition, MUSSES2020J is located near the center of a normal low-mass galaxy at a redshift of 1.063, suggesting a possible connection between the energy source of MUSSES2020J and the central part of the host galaxy. Possible physical mechanisms powering this extreme transient such as a wind-driven tidal disruption event and an interaction between supernova and circumstellar material are qualitatively discussed based on the first multiband early-phase light curve of FBUTs, although whether the scenarios can quantitatively explain the early photometric behavior of MUSSES2020J requires systematic theoretical investigations. Thanks to the ultrahigh luminosity in UV and blue optical wavelengths of these extreme transients, a promising number of FBUTs from the local to the high-z universe can be discovered through deep wide-field optical surveys in the near future.

preprint | 2021 | arXiv

Optimization of the Observing Cadence for the Rubin Observatory Legacy Survey of Space and Time: a pioneering process of community-focused experimental design

Vera C. Rubin Observatory is a ground-based astronomical facility under construction, a joint project of the National Science Foundation and the U.S. Department of Energy, designed to conduct a multi-purpose 10-year optical survey of the southern hemisphere sky: the Legacy Survey of Space and Time. Significant flexibility in survey strategy remains within the constraints imposed by the core science goals of probing dark energy and dark matter, cataloging the Solar System, exploring the transient optical sky, and mapping the Milky Way. The survey's massive data throughput will be transformational for many other astrophysics domains and Rubin's data access policy sets the stage for a huge potential users' community. To ensure that the survey science potential is maximized while serving as broad a community as possible, Rubin Observatory has involved the scientific community at large in the process of setting and refining the details of the observing strategy. The motivation, history, and decision-making process of this strategy optimization are detailed in this paper, giving context to the science-driven proposals and recommendations for the survey strategy included in this Focus Issue.

preprint | 2021 | arXiv

The LSST DESC DC2 Simulated Sky Survey

We describe the simulated sky survey underlying the second data challenge (DC2) carried out in preparation for analysis of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) by the LSST Dark Energy Science Collaboration (LSST DESC). Significant connections across multiple science domains will be a hallmark of LSST; the DC2 program represents a unique modeling effort that stresses this interconnectivity in a way that has not been attempted before. This effort encompasses a full end-to-end approach: starting from a large N-body simulation, through setting up LSST-like observations including realistic cadences, through image simulations, and finally processing with Rubin's LSST Science Pipelines. This last step ensures that we generate data products resembling those to be delivered by the Rubin Observatory as closely as is currently possible. The simulated DC2 sky survey covers six optical bands in a wide-fast-deep (WFD) area of approximately 300 deg^2 as well as a deep drilling field (DDF) of approximately 1 deg^2. We simulate 5 years of the planned 10-year survey. The DC2 sky survey has multiple purposes. First, the LSST DESC working groups can use the dataset to develop a range of DESC analysis pipelines to prepare for the advent of actual data. Second, it serves as a realistic testbed for the image processing software under development for LSST by the Rubin Observatory. In particular, simulated data provide a controlled way to investigate certain image-level systematic effects. Finally, the DC2 sky survey enables the exploration of new scientific ideas in both static and time-domain cosmology.

preprint | 2020 | arXiv

Applying Information Theory to Design Optimal Filters for Photometric Redshifts

In this paper we apply ideas from information theory to create a method for the design of optimal filters for photometric redshift estimation. We show the method applied to a series of simple example filters in order to motivate an intuition for how photometric redshift estimators respond to the properties of photometric passbands. We then design a realistic set of six filters covering optical wavelengths that optimize photometric redshifts for $z \le 2.3$ and $i < 25.3$. We create a simulated catalog for these optimal filters and use our filters with a photometric redshift estimation code to show that we can improve the standard deviation of the photometric redshift error by 7.1% overall and improve the outlier fraction by 9.9% over the standard filters proposed for the Large Synoptic Survey Telescope (LSST). We compare features of our optimal filters to LSST and find that the LSST filters incorporate key features for optimal photometric redshift estimation. Finally, we describe how information theory can be applied to a range of optimization problems in astronomy.

preprint | 2020 | arXiv

Dimensionality Reduction of SDSS Spectra with Variational Autoencoders

High resolution galaxy spectra contain much information about galactic physics, but the high dimensionality of these spectra makes it difficult to fully utilize the information they contain. We apply variational autoencoders (VAEs), a non-linear dimensionality reduction technique, to a sample of spectra from the Sloan Digital Sky Survey. In contrast to Principal Component Analysis (PCA), a widely used technique, VAEs can capture non-linear relationships between latent parameters and the data. We find that a VAE can reconstruct the SDSS spectra well with only six latent parameters, outperforming PCA with the same number of components. Different galaxy classes are naturally separated in this latent space, without class labels having been given to the VAE. The VAE latent space is interpretable because the VAE can be used to make synthetic spectra at any point in latent space. For example, making synthetic spectra along tracks in latent space yields sequences of realistic spectra that interpolate between two different types of galaxies. Using the latent space to find outliers may yield interesting spectra: in our small sample, we immediately find unusual data artifacts and stars misclassified as galaxies. In this exploratory work, we show that VAEs create compact, interpretable latent spaces that capture non-linear features of the data. While a VAE takes substantial time to train (~1 day for 48000 spectra), once trained, VAEs can enable the fast exploration of large astronomical data sets.

preprint | 2020 | arXiv

Photometric Redshifts with the LSST II: The Impact of Near-Infrared and Near-Ultraviolet Photometry

Accurate photometric redshift (photo-$z$) estimates are essential to the cosmological science goals of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST). In this work we use simulated photometry for mock galaxy catalogs to explore how LSST photo-$z$ estimates can be improved by the addition of near-infrared (NIR) and/or ultraviolet (UV) photometry from the Euclid, WFIRST, and/or CASTOR space telescopes. Generally, we find that deeper optical photometry can reduce the standard deviation of the photo-$z$ estimates more than adding NIR or UV filters, but that additional filters are the only way to significantly lower the fraction of galaxies with catastrophically under- or over-estimated photo-$z$. For Euclid, we find that the addition of ${JH}$ $5\sigma$ photometric detections can reduce the standard deviation for galaxies with $z>1$ ($z>0.3$) by ${\sim}20\%$ (${\sim}10\%$), and the fraction of outliers by ${\sim}40\%$ (${\sim}25\%$). For WFIRST, we show how the addition of deep ${YJHK}$ photometry could reduce the standard deviation by ${\gtrsim}50\%$ at $z>1.5$ and drastically reduce the fraction of outliers to just ${\sim}2\%$ overall. For CASTOR, we find that the addition of its ${UV}$ and $u$-band photometry could reduce the standard deviation by ${\sim}30\%$ and the fraction of outliers by ${\sim}50\%$ for galaxies with $z<0.5$. We also evaluate the photo-$z$ results within sky areas that overlap with both the NIR and UV surveys, and when spectroscopic training sets built from the surveys' small-area deep fields are used.
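The two statistics quoted in this abstract, the standard deviation of the photo-$z$ error and the outlier fraction, can be sketched as follows. The abstract does not define them, so the normalized error $(z_{phot} - z_{true})/(1 + z_{true})$ and the 0.15 outlier threshold used here are assumed conventions rather than the paper's definitions, and the function names and sample redshifts are hypothetical.

```python
import statistics

def photoz_errors(z_true, z_phot):
    """Normalized photo-z errors (z_phot - z_true) / (1 + z_true); an assumed convention."""
    return [(zp - zt) / (1.0 + zt) for zt, zp in zip(z_true, z_phot)]

def photoz_summary(z_true, z_phot, outlier_threshold=0.15):
    """Return (standard deviation of the errors, fraction of outliers).

    The threshold of 0.15 is an illustrative assumption, not a value from the abstract.
    """
    errors = photoz_errors(z_true, z_phot)
    sigma = statistics.pstdev(errors)
    n_outliers = sum(abs(e) > outlier_threshold for e in errors)
    return sigma, n_outliers / len(errors)

# Hypothetical toy catalog: the last galaxy has a catastrophically wrong estimate.
z_true = [0.5, 1.0, 1.5, 2.0]
z_phot = [0.5, 1.1, 1.5, 0.8]
sigma, outlier_fraction = photoz_summary(z_true, z_phot)
print(f"sigma = {sigma:.3f}, outlier fraction = {outlier_fraction:.2f}")
```

In this toy catalog a single catastrophic estimate dominates the standard deviation, which is one reason the two statistics are usually reported separately.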