Researcher profile

Michelle Ntampaka

Michelle Ntampaka contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
6 topics
4 close collaborators

Published work

8 published item(s)

preprint | 2022 | arXiv

Emulating Sunyaev-Zeldovich Images of Galaxy Clusters using Auto-Encoders

We develop a machine learning algorithm that generates high-resolution thermal Sunyaev-Zeldovich (SZ) maps of novel galaxy clusters given only halo mass and mass accretion rate. The algorithm uses a conditional variational autoencoder (CVAE) in the form of a convolutional neural network and is trained with SZ maps generated from the IllustrisTNG simulation. Our method can reproduce many of the details of galaxy clusters that analytical models usually lack, such as internal structure and aspherical distribution of gas created by mergers, while achieving the same computational feasibility, allowing us to generate mock SZ maps for over $10^5$ clusters in 30 seconds on a laptop. We show that the model is capable of generating novel clusters (i.e., not found in the training set) and that the model accurately reproduces the effects of mass and mass accretion rate on the SZ images, such as scatter, asymmetry, and concentration, in addition to modeling merging sub-clusters. This work demonstrates the viability of machine-learning-based methods for producing the number of realistic, high-resolution maps of galaxy clusters necessary to achieve statistical constraints from future SZ surveys.

preprint | 2022 | arXiv

R2-D2: Roman and Rubin -- From Data to Discovery

The NASA Nancy Grace Roman Space Telescope (Roman) and the Vera C. Rubin Observatory Legacy Survey of Space and Time (Rubin) will transform our view of the wide-field sky, with similar sensitivities, but complementary in wavelength, spatial resolution, and time domain coverage. Here we present findings from the AURA Roman+Rubin Synergy Working Group, charged by the STScI and NOIRLab Directors to identify frontier science questions in General Astrophysics, beyond the well-covered areas of Dark Energy and Cosmology, that can be uniquely addressed with Roman and Rubin synergies in observing strategy, data products and archiving, joint analysis, and community engagement. This analysis was conducted with input from the community in the form of brief (1-2 paragraph) "science pitches" (see Appendix), and testimony from "outside experts" (included as co-authors). We identify a rich and broad landscape of potential discoveries catalyzed by the combination of exceptional quality and quantity of Roman and Rubin data, and summarize implementation requirements that would facilitate this bounty of additional science with coordination of survey fields, joint coverage of the Galactic plane, bulge, and ecliptic, expansion of General Investigator and Target of Opportunity observing modes, co-location of Roman and Rubin data, and timely distribution of data, transient alerts, catalogs, value-added joint analysis products, and simulations to the broad astronomical community.

preprint | 2022 | arXiv

The Dynamical Mass of the Coma Cluster from Deep Learning

In 1933, Fritz Zwicky's famous investigations of the mass of the Coma cluster led him to infer the existence of dark matter (Zwicky 1933). His fundamental discoveries have proven to be foundational to modern cosmology; as we now know, such dark matter makes up 85% of the matter and 25% of the mass-energy content in the universe. Galaxy clusters like Coma are massive, complex systems of dark matter in addition to hot ionized gas and thousands of galaxies, and serve as excellent probes of the dark matter distribution. However, empirical studies show that the total mass of such systems remains elusive and difficult to precisely constrain. Here, we present new estimates for the dynamical mass of the Coma cluster based on Bayesian deep learning methodologies developed in recent years. Using our novel data-driven approach, we predict Coma's $M_{200c}$ mass to be $10^{15.10 \pm 0.15}\ h^{-1}\,M_\odot$ within a radius of $1.78 \pm 0.03\ h^{-1}\mathrm{Mpc}$ of its center. We show that our predictions are rigorous across multiple training datasets and statistically consistent with historical estimates of Coma's mass. This measurement reinforces our understanding of the dynamical state of the Coma cluster and advances rigorous analyses and verification methods for empirical applications of machine learning in astronomy.
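
The mass quoted in this abstract carries its uncertainty in the exponent, so the error bar is symmetric in log space and asymmetric in linear space. The short Python sketch below works through that conversion for the quoted value, $10^{15.10 \pm 0.15}$. It assumes the mass is expressed in units of $h^{-1}\,M_\odot$ (our reading of the abstract's unit macro); the function name `log_mass_interval` is hypothetical and is not taken from the paper.

```python
def log_mass_interval(log10_mass, log10_err):
    """Convert a mass quoted as 10**(log10_mass +/- log10_err) to linear units.

    Returns (lower, central, upper). The units are whatever the quoted mass
    uses (assumed here to be h^-1 solar masses).
    """
    central = 10.0 ** log10_mass
    lower = 10.0 ** (log10_mass - log10_err)
    upper = 10.0 ** (log10_mass + log10_err)
    return lower, central, upper


# Values quoted in the abstract: exponent 15.10 with uncertainty 0.15.
lower, central, upper = log_mass_interval(15.10, 0.15)
print(f"lower={lower:.3e}  central={central:.3e}  upper={upper:.3e}")
```

Because the interval is symmetric in the exponent, the upper and lower bounds differ from the central value by the same multiplicative factor, $10^{0.15}$, and not by the same additive amount.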

preprint | 2021 | arXiv

A Machine Learning Approach for Dynamical Mass Measurements of Galaxy Clusters

We present a modern machine learning approach for cluster dynamical mass measurements that is a factor of two improvement over using a conventional scaling relation. Different methods are tested against a mock cluster catalog constructed using halos with mass >= 10^14 Msolar/h from Multidark's publicly-available N-body MDPL halo catalog. In the conventional method, we use a standard M(sigma_v) power law scaling relation to infer cluster mass, M, from line-of-sight (LOS) galaxy velocity dispersion, sigma_v. The resulting fractional mass error distribution is broad, with width=0.87 (68% scatter), and has extended high-error tails. The standard scaling relation can be simply enhanced by including higher-order moments of the LOS velocity distribution. Applying the kurtosis as a correction term to log(sigma_v) reduces the width of the error distribution to 0.74 (16% improvement). Machine learning can be used to take full advantage of all the information in the velocity distribution. We employ the Support Distribution Machines (SDMs) algorithm that learns from distributions of data to predict single values. SDMs trained and tested on the distribution of LOS velocities yield width=0.46 (47% improvement). Furthermore, the problematic tails of the mass error distribution are effectively eliminated. Decreasing cluster mass errors will improve measurements of the growth of structure and lead to tighter constraints on cosmological parameters.
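
As a rough illustration of the conventional method described in this abstract, the Python sketch below fits a power-law $M(\sigma_v)$ scaling relation in log space and measures the width of the resulting fractional mass error distribution. Everything in it is an assumption made for illustration: the synthetic data (a toy $\sigma_v \propto M^{1/3}$ relation with log-normal scatter), the function names, and the reading of "width (68% scatter)" as the span between the 16th and 84th percentiles. It does not use the MDPL catalog and is not the authors' pipeline.

```python
import numpy as np


def fit_power_law(sigma_v, mass):
    """Least-squares fit of log10(M) = slope * log10(sigma_v) + intercept."""
    slope, intercept = np.polyfit(np.log10(sigma_v), np.log10(mass), 1)
    return slope, intercept


def predict_mass(sigma_v, slope, intercept):
    """Apply the fitted power law to velocity dispersions."""
    return 10.0 ** (slope * np.log10(sigma_v) + intercept)


def error_width_68(mass_pred, mass_true):
    """Span of the central 68% of the fractional mass error distribution."""
    eps = (mass_pred - mass_true) / mass_true
    low, high = np.percentile(eps, [16.0, 84.0])
    return high - low


# Toy data (assumption): masses between 1e14 and 1e15, and velocity
# dispersions following sigma_v = 1000 * (M / 1e15)**(1/3) with 0.05 dex
# of log-normal scatter.
rng = np.random.default_rng(0)
true_mass = 10.0 ** rng.uniform(14.0, 15.0, size=2000)
scatter = 10.0 ** rng.normal(0.0, 0.05, size=2000)
sigma_v = 1000.0 * (true_mass / 1e15) ** (1.0 / 3.0) * scatter

slope, intercept = fit_power_law(sigma_v, true_mass)
width = error_width_68(predict_mass(sigma_v, slope, intercept), true_mass)
print(f"fitted slope = {slope:.2f}, 68% error width = {width:.2f}")
```

Because the scatter in this toy setup sits on $\sigma_v$, the regressor, the fitted slope comes out somewhat below the input value of 3. This is ordinary regression attenuation in the synthetic data and says nothing about the paper's results.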

preprint | 2021 | arXiv

The Importance of Being Interpretable: Toward An Understandable Machine Learning Encoder for Galaxy Cluster Cosmology

We present a deep machine learning (ML) approach to constraining cosmological parameters with multi-wavelength observations of galaxy clusters. The ML approach has two components: an encoder that builds a compressed representation of each galaxy cluster and a flexible CNN to estimate the cosmological model from a cluster sample. It is trained and tested on simulated cluster catalogs built from the Magneticum simulations. From the simulated catalogs, the ML method estimates the amplitude of matter fluctuations, sigma_8, at approximately the expected theoretical limit. More importantly, the deep ML approach can be interpreted. We lay out three schemes for interpreting the ML technique: a leave-one-out method for assessing cluster importance, an average saliency for evaluating feature importance, and correlations in the terse layer for understanding whether an ML technique can be safely applied to observational data. These interpretation schemes led to the discovery of a previously unknown self-calibration mode for flux- and volume-limited cluster surveys. We describe this new mode, which uses the amplitude and peak of the cluster mass PDF as anchors for mass calibration. We introduce the term "overspecialized" to describe a common pitfall in astronomical applications of machine learning in which the ML method learns simulation-specific details, and we show how a carefully constructed architecture can be used to check for this source of systematic error.

preprint | 2021 | arXiv

The Role of Machine Learning in the Next Decade of Cosmology

In recent years, machine learning (ML) methods have remarkably improved how cosmologists can interpret data. The next decade will bring new opportunities for data-driven cosmological discovery, but will also present new challenges for adopting ML methodologies and understanding the results. ML could transform our field, but this transformation will require the astronomy community to both foster and promote interdisciplinary research endeavors.

preprint | 2019 | arXiv

A Hybrid Deep Learning Approach to Cosmological Constraints From Galaxy Redshift Surveys

We present a deep machine learning (ML)-based technique for accurately determining $σ_8$ and $Ω_m$ from mock 3D galaxy surveys. The mock surveys are built from the AbacusCosmos suite of $N$-body simulations, which comprises 40 cosmological volume simulations spanning a range of cosmological models, and we account for uncertainties in galaxy formation scenarios through the use of generalized halo occupation distributions (HODs). We explore a trio of ML models: a 3D convolutional neural network (CNN), a power-spectrum-based fully connected network, and a hybrid approach that merges the two to combine physically motivated summary statistics with flexible CNNs. We describe best practices for training a deep model on a suite of matched-phase simulations and we test our model on a completely independent sample that uses previously unseen initial conditions, cosmological parameters, and HOD parameters. Despite the fact that the mock observations are quite small ($\sim0.07h^{-3}\,\mathrm{Gpc}^3$) and the training data span a large parameter space (6 cosmological and 6 HOD parameters), the CNN and hybrid CNN can constrain $σ_8$ and $Ω_m$ to $\sim3\%$ and $\sim4\%$, respectively.