Source author record

Michelle Ntampaka

Michelle Ntampaka appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

astro-ph.CO astro-ph.IM astro-ph.EP astro-ph.GA astro-ph.HE astro-ph.SR

Catalog footprint

What is connected

9 works

6 topics

4 close collaborators

preprint, 2022, arXiv

A Referee Primer for Early Career Astronomers

Refereeing is a crucial component of publishing astronomical research, but few professional astronomers receive formal training on how to effectively referee a manuscript. In this article, we lay out considerations and best practices for referees. This document is intended as a tool for early career researchers to develop a fair, effective, and efficient approach to refereeing.

preprint, 2022, arXiv

Emulating Sunyaev-Zeldovich Images of Galaxy Clusters using Auto-Encoders

We develop a machine learning algorithm that generates high-resolution thermal Sunyaev-Zeldovich (SZ) maps of novel galaxy clusters given only halo mass and mass accretion rate. The algorithm uses a conditional variational autoencoder (CVAE) in the form of a convolutional neural network and is trained with SZ maps generated from the IllustrisTNG simulation. Our method can reproduce many of the details of galaxy clusters that analytical models usually lack, such as internal structure and aspherical distribution of gas created by mergers, while achieving the same computational feasibility, allowing us to generate mock SZ maps for over $10^5$ clusters in 30 seconds on a laptop. We show that the model is capable of generating novel clusters (i.e. not found in the training set) and that the model accurately reproduces the effects of mass and mass accretion rate on the SZ images, such as scatter, asymmetry, and concentration, in addition to modeling merging sub-clusters. This work demonstrates the viability of machine-learning-based methods for producing the number of realistic, high-resolution maps of galaxy clusters necessary to achieve statistical constraints from future SZ surveys.

preprint, 2022, arXiv

R2-D2: Roman and Rubin -- From Data to Discovery

The NASA Nancy Grace Roman Space Telescope (Roman) and the Vera C. Rubin Observatory Legacy Survey of Space and Time (Rubin) will transform our view of the wide-field sky, with similar sensitivities, but complementary in wavelength, spatial resolution, and time domain coverage. Here we present findings from the AURA Roman+Rubin Synergy Working Group, charged by the STScI and NOIRLab Directors to identify frontier science questions in General Astrophysics, beyond the well-covered areas of Dark Energy and Cosmology, that can be uniquely addressed with Roman and Rubin synergies in observing strategy, data products and archiving, joint analysis, and community engagement. This analysis was conducted with input from the community in the form of brief (1-2 paragraph) "science pitches" (see Appendix), and testimony from "outside experts" (included as co-authors). We identify a rich and broad landscape of potential discoveries catalyzed by the combination of exceptional quality and quantity of Roman and Rubin data, and summarize implementation requirements that would facilitate this bounty of additional science with coordination of survey fields, joint coverage of the Galactic plane, bulge, and ecliptic, expansion of General Investigator and Target of Opportunity observing modes, co-location of Roman and Rubin data, and timely distribution of data, transient alerts, catalogs, value-added joint analysis products, and simulations to the broad astronomical community.

preprint, 2022, arXiv

The Dynamical Mass of the Coma Cluster from Deep Learning

In 1933, Fritz Zwicky's famous investigations of the mass of the Coma cluster led him to infer the existence of dark matter (Zwicky 1933). His fundamental discoveries have proven to be foundational to modern cosmology; as we now know, such dark matter makes up 85% of the matter and 25% of the mass-energy content in the universe. Galaxy clusters like Coma are massive, complex systems of dark matter in addition to hot ionized gas and thousands of galaxies, and serve as excellent probes of the dark matter distribution. However, empirical studies show that the total mass of such systems remains elusive and difficult to precisely constrain. Here, we present new estimates for the dynamical mass of the Coma cluster based on Bayesian deep learning methodologies developed in recent years. Using our novel data-driven approach, we predict Coma's $M_{200c}$ mass to be $10^{15.10 \pm 0.15}\ h^{-1}M_\odot$ within a radius of $1.78 \pm 0.03\ h^{-1}\mathrm{Mpc}$ of its center. We show that our predictions are rigorous across multiple training datasets and statistically consistent with historical estimates of Coma's mass. This measurement reinforces our understanding of the dynamical state of the Coma cluster and advances rigorous analyses and verification methods for empirical applications of machine learning in astronomy.
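
Because the Coma mass is quoted in log space, a symmetric uncertainty of 0.15 dex becomes an asymmetric interval in linear units. The short sketch below only does that arithmetic; the function name is an illustrative assumption and is not taken from the paper.

```python
def log_interval_to_linear(log_center, log_err):
    """Convert 10^(log_center +/- log_err) into (center, low, high) in linear units."""
    center = 10 ** log_center
    low = 10 ** (log_center - log_err)
    high = 10 ** (log_center + log_err)
    return center, low, high

# Values quoted in the abstract: 10^(15.10 +/- 0.15), in the abstract's mass units.
center, low, high = log_interval_to_linear(15.10, 0.15)
print(f"center = {center:.3e}")
print(f"lower bound = {low:.3e}, upper bound = {high:.3e}")
```

The upper error bar is larger than the lower one in linear units, which is the usual consequence of quoting a symmetric error on a logarithm.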

preprint, 2021, arXiv

A Machine Learning Approach for Dynamical Mass Measurements of Galaxy Clusters

We present a modern machine learning approach for cluster dynamical mass measurements that is a factor of two improvement over using a conventional scaling relation. Different methods are tested against a mock cluster catalog constructed using halos with mass >= 10^14 Msolar/h from Multidark's publicly available N-body MDPL halo catalog. In the conventional method, we use a standard M(sigma_v) power law scaling relation to infer cluster mass, M, from line-of-sight (LOS) galaxy velocity dispersion, sigma_v. The resulting fractional mass error distribution is broad, with width=0.87 (68% scatter), and has extended high-error tails. The standard scaling relation can be simply enhanced by including higher-order moments of the LOS velocity distribution. Applying the kurtosis as a correction term to log(sigma_v) reduces the width of the error distribution to 0.74 (16% improvement). Machine learning can be used to take full advantage of all the information in the velocity distribution. We employ the Support Distribution Machines (SDMs) algorithm that learns from distributions of data to predict single values. SDMs trained and tested on the distribution of LOS velocities yield width=0.46 (47% improvement). Furthermore, the problematic tails of the mass error distribution are effectively eliminated. Decreasing cluster mass errors will improve measurements of the growth of structure and lead to tighter constraints on cosmological parameters.
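
The conventional baseline described in this abstract, a power-law M(sigma_v) relation scored by the width of the fractional mass error distribution, can be sketched as follows. This is a minimal illustration under stated assumptions: the catalog is a synthetic toy (not the MDPL mock), the fit is an ordinary least-squares line in log space, and the width is taken as the 84th minus the 16th percentile of the fractional error, which may differ from the paper's exact definition. All function names are hypothetical.

```python
import math
import random

def fit_power_law(sigmas, masses):
    """Least-squares fit of log10(M) = intercept + slope * log10(sigma_v)."""
    xs = [math.log10(s) for s in sigmas]
    ys = [math.log10(m) for m in masses]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

def predict_mass(sigma, intercept, slope):
    """Mass inferred from a velocity dispersion via the fitted power law."""
    return 10 ** (intercept + slope * math.log10(sigma))

def fractional_errors(sigmas, masses, intercept, slope):
    """(M_predicted - M_true) / M_true for every cluster."""
    return [(predict_mass(s, intercept, slope) - m) / m
            for s, m in zip(sigmas, masses)]

def percentile(sorted_vals, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def error_width(errors):
    """Assumed width definition: 84th minus 16th percentile."""
    s = sorted(errors)
    return percentile(s, 84) - percentile(s, 16)

def make_toy_catalog(n, scatter_dex, seed=0):
    """Synthetic clusters with sigma_v proportional to M^(1/3) plus log-normal scatter."""
    rng = random.Random(seed)
    sigmas, masses = [], []
    for _ in range(n):
        log_m = rng.uniform(14.0, 15.5)
        log_sigma = 3.0 + (log_m - 15.0) / 3.0 + rng.gauss(0.0, scatter_dex)
        masses.append(10 ** log_m)
        sigmas.append(10 ** log_sigma)
    return sigmas, masses

sigmas, masses = make_toy_catalog(2000, scatter_dex=0.05)
intercept, slope = fit_power_law(sigmas, masses)
errors = fractional_errors(sigmas, masses, intercept, slope)
print(f"slope = {slope:.2f}, error width = {error_width(errors):.2f}")
```

Using a percentile-based width rather than a standard deviation keeps the score insensitive to the extended high-error tails the abstract mentions, so the tails can be examined separately.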

preprint, 2021, arXiv

The Importance of Being Interpretable: Toward An Understandable Machine Learning Encoder for Galaxy Cluster Cosmology

We present a deep machine learning (ML) approach to constraining cosmological parameters with multi-wavelength observations of galaxy clusters. The ML approach has two components: an encoder that builds a compressed representation of each galaxy cluster and a flexible CNN to estimate the cosmological model from a cluster sample. It is trained and tested on simulated cluster catalogs built from the Magneticum simulations. From the simulated catalogs, the ML method estimates the amplitude of matter fluctuations, sigma_8, at approximately the expected theoretical limit. More importantly, the deep ML approach can be interpreted. We lay out three schemes for interpreting the ML technique: a leave-one-out method for assessing cluster importance, an average saliency for evaluating feature importance, and correlations in the terse layer for understanding whether an ML technique can be safely applied to observational data. These interpretation schemes led to the discovery of a previously unknown self-calibration mode for flux- and volume-limited cluster surveys. We describe this new mode, which uses the amplitude and peak of the cluster mass PDF as anchors for mass calibration. We introduce the term "overspecialized" to describe a common pitfall in astronomical applications of machine learning in which the ML method learns simulation-specific details, and we show how a carefully constructed architecture can be used to check for this source of systematic error.
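
The leave-one-out idea named in this abstract can be illustrated generically: remove one cluster at a time, re-run the estimator, and record how much the estimate moves. The sketch below is not the paper's implementation; `toy_estimator` is a hypothetical stand-in (a plain mean) for a trained model, and the input values are made up.

```python
def leave_one_out_importance(sample, estimator):
    """Change in the estimate when each item is removed from the sample."""
    full_estimate = estimator(sample)
    importances = []
    for i in range(len(sample)):
        reduced = sample[:i] + sample[i + 1:]  # sample without item i
        importances.append(full_estimate - estimator(reduced))
    return importances

def toy_estimator(sample):
    """Hypothetical stand-in for a trained model: the sample mean."""
    return sum(sample) / len(sample)

# Made-up per-cluster values, used only to exercise the function.
cluster_signals = [1.0, 2.0, 6.0]
print(leave_one_out_importance(cluster_signals, toy_estimator))
```

A large absolute value marks an item whose removal shifts the estimate strongly, which is the sense of "importance" assumed here.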

preprint, 2021, arXiv

The Role of Machine Learning in the Next Decade of Cosmology

In recent years, machine learning (ML) methods have remarkably improved how cosmologists can interpret data. The next decade will bring new opportunities for data-driven cosmological discovery, but will also present new challenges for adopting ML methodologies and understanding the results. ML could transform our field, but this transformation will require the astronomy community to both foster and promote interdisciplinary research endeavors.

preprint, 2019, arXiv

A Hybrid Deep Learning Approach to Cosmological Constraints From Galaxy Redshift Surveys

We present a deep machine learning (ML)-based technique for accurately determining $\sigma_8$ and $\Omega_m$ from mock 3D galaxy surveys. The mock surveys are built from the AbacusCosmos suite of $N$-body simulations, which comprises 40 cosmological volume simulations spanning a range of cosmological models, and we account for uncertainties in galaxy formation scenarios through the use of generalized halo occupation distributions (HODs). We explore a trio of ML models: a 3D convolutional neural network (CNN), a power-spectrum-based fully connected network, and a hybrid approach that merges the two to combine physically motivated summary statistics with flexible CNNs. We describe best practices for training a deep model on a suite of matched-phase simulations and we test our model on a completely independent sample that uses previously unseen initial conditions, cosmological parameters, and HOD parameters. Despite the fact that the mock observations are quite small ($\sim0.07h^{-3}\,\mathrm{Gpc}^3$) and the training data span a large parameter space (6 cosmological and 6 HOD parameters), the CNN and hybrid CNN can constrain $\sigma_8$ and $\Omega_m$ to $\sim3\%$ and $\sim4\%$, respectively.

preprint, 2013, arXiv

A First Look at creating mock catalogs with machine learning techniques

We investigate machine learning (ML) techniques for predicting the number of galaxies (N_gal) that occupy a halo, given the halo's properties. These types of mappings are crucial for constructing the mock galaxy catalogs necessary for analyses of large-scale structure. The ML techniques proposed here distinguish themselves from traditional halo occupation distribution (HOD) modeling as they do not assume a prescribed relationship between halo properties and N_gal. In addition, our ML approaches are only dependent on parent halo properties (like HOD methods), which is advantageous over subhalo-based approaches as identifying subhalos correctly is difficult. We test 2 algorithms: support vector machines (SVM) and k-nearest-neighbour (kNN) regression. We take galaxies and halos from the Millennium simulation and predict N_gal by training our algorithms on the following 6 halo properties: number of particles, M_200, σ_v, v_max, half-mass radius and spin. For Millennium, our predicted N_gal values have a mean-squared-error (MSE) of ~0.16 for both SVM and kNN. Our predictions match the overall distribution of halos reasonably well and the galaxy correlation function at large scales to ~5-10%. In addition, we demonstrate a feature selection algorithm to isolate the halo parameters that are most predictive, a useful technique for understanding the mapping between halo properties and N_gal. Lastly, we investigate these ML-based approaches in making mock catalogs for different galaxy subpopulations (e.g. blue, red, high M_star, low M_star). Given its non-parametric nature as well as its powerful predictive and feature selection capabilities, machine learning offers an interesting alternative for creating mock catalogs.
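
As a rough illustration of the kNN regression mentioned in this abstract, the sketch below predicts N_gal for a halo by averaging the N_gal of its k nearest neighbours in a standardized feature space, and scores predictions with the mean squared error. It is a toy under stated assumptions: the halos are invented values (not Millennium data), only two of the six halo properties are used, and all function names are hypothetical.

```python
import math

def fit_scaler(rows):
    """Return per-feature means and standard deviations."""
    n = len(rows)
    n_features = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_features)]
    stds = []
    for j in range(n_features):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        stds.append(math.sqrt(var) or 1.0)  # guard against constant features
    return means, stds

def apply_scaler(row, means, stds):
    """Standardize one feature vector so no single property dominates distances."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the targets of the k nearest training points."""
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mean_squared_error(predicted, actual):
    """Average squared difference between predictions and true values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Invented toy halos: [log10 mass, v_max in km/s] with an invented N_gal each.
halos = [[12.0, 150.0], [12.5, 200.0], [13.0, 300.0],
         [13.5, 450.0], [14.0, 650.0], [14.5, 900.0]]
n_gal = [1, 2, 5, 12, 30, 80]

means, stds = fit_scaler(halos)
train_x = [apply_scaler(h, means, stds) for h in halos]
query = apply_scaler([13.2, 350.0], means, stds)
print(knn_predict(train_x, n_gal, query, k=2))
```

Standardizing the features first matters for kNN because the distance would otherwise be dominated by whichever halo property has the largest numerical range.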