Researcher profile

Yao-Yuan Mao

Yao-Yuan Mao contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
16 works
0 followers
5 topics
4 close collaborators

Published work

16 published items

preprint, 2022, arXiv

Extending the SAGA Survey (xSAGA) I: Satellite Radial Profiles as a Function of Host Galaxy Properties

We present "Extending the Satellites Around Galactic Analogs Survey" (xSAGA), a method for identifying low-$z$ galaxies on the basis of optical imaging, and results on the spatial distributions of xSAGA satellites around host galaxies. Using spectroscopic redshift catalogs from the SAGA Survey as a training data set, we have optimized a convolutional neural network (CNN) to identify $z < 0.03$ galaxies from more distant objects using image cutouts from the DESI Legacy Imaging Surveys. From the sample of $> 100,000$ CNN-selected low-$z$ galaxies, we identify $>20,000$ probable satellites located between 36-300 projected kpc from NASA-Sloan Atlas central galaxies in the stellar mass range $9.5 < \log(M_\star/M_\odot) < 11$. We characterize the incompleteness and contamination for CNN-selected samples, and apply corrections in order to estimate the true number of satellites as a function of projected radial distance from their hosts. Satellite richness depends strongly on host stellar mass, such that more massive host galaxies have more satellites, and on host morphology, such that elliptical hosts have more satellites than disky hosts with comparable stellar masses. We also find a strong inverse correlation between satellite richness and the magnitude gap between a host and its brightest satellite. The normalized satellite radial distribution between 36-300 kpc does not depend strongly on host stellar mass, morphology, or magnitude gap. The satellite abundances and radial distributions we measure are in reasonable agreement with predictions from hydrodynamic simulations. Our results deliver unprecedented statistical power for studying satellite galaxy populations, and highlight the promise of using machine learning for extending galaxy samples of wide-area surveys.

preprint, 2022, arXiv

From Data to Software to Science with the Rubin Observatory LSST

The Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) dataset will dramatically alter our understanding of the Universe, from the origins of the Solar System to the nature of dark matter and dark energy. Much of this research will depend on the existence of robust, tested, and scalable algorithms, software, and services. Identifying and developing such tools ahead of time has the potential to significantly accelerate the delivery of early science from LSST. Developing these collaboratively, and making them broadly available, can enable more inclusive and equitable collaboration on LSST science. To facilitate such opportunities, a community workshop entitled "From Data to Software to Science with the Rubin Observatory LSST" was organized by the LSST Interdisciplinary Network for Collaboration and Computing (LINCC) and partners, and held at the Flatiron Institute in New York, March 28-30th 2022. The workshop included over 50 in-person attendees invited from over 300 applications. It identified seven key software areas of need: (i) scalable cross-matching and distributed joining of catalogs, (ii) robust photometric redshift determination, (iii) software for determination of selection functions, (iv) frameworks for scalable time-series analyses, (v) services for image access and reprocessing at scale, (vi) object image access (cutouts) and analysis at scale, and (vii) scalable job execution systems. This white paper summarizes the discussions of this workshop. It considers the motivating science use cases, identified cross-cutting algorithms, software, and services, their high-level technical specifications, and the principles of inclusive collaborations needed to develop them. We provide it as a useful roadmap of needs, as well as to spur action and collaboration between groups and individuals looking to develop reusable software for early LSST science.

preprint, 2022, arXiv

Photometric Redshifts from SDSS Images with an Interpretable Deep Capsule Network

Studies of cosmology, galaxy evolution, and astronomical transients with current and next-generation wide-field imaging surveys like the Rubin Observatory Legacy Survey of Space and Time (LSST) are all critically dependent on estimates of photometric redshifts. Capsule networks are a new type of neural network architecture that is better suited for identifying morphological features of the input images than traditional convolutional neural networks. We use a deep capsule network trained on $ugriz$ images, spectroscopic redshifts, and Galaxy Zoo spiral/elliptical classifications of $\sim$400,000 Sloan Digital Sky Survey (SDSS) galaxies to do photometric redshift estimation. We achieve a photometric redshift prediction accuracy and a fraction of catastrophic outliers that are comparable to or better than current methods for SDSS main galaxy sample-like data sets ($r\leq17.8$ and $z_{\mathrm{spec}}\leq0.4$) while requiring less data and fewer trainable parameters. Furthermore, the decision-making of our capsule network is much more easily interpretable as capsules act as a low-dimensional encoding of the image. When the capsules are projected on a 2-dimensional manifold, they form a single redshift sequence with the fraction of spirals in a region exhibiting a gradient roughly perpendicular to the redshift sequence. We perturb encodings of real galaxy images in this low-dimensional space to create synthetic galaxy images that demonstrate the image properties (e.g., size, orientation, and surface brightness) encoded by each dimension. We also measure correlations between galaxy properties (e.g., magnitudes, colours, and stellar mass) and each capsule dimension. We publicly release our code, estimated redshifts, and additional catalogues at https://biprateep.github.io/encapZulate-1 .
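
The two quality measures named in this abstract, prediction accuracy and the fraction of catastrophic outliers, can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so the ones below are assumptions: the normalized residual dz = (z_phot - z_spec) / (1 + z_spec), a normalized median absolute deviation for the scatter, and a hypothetical outlier threshold of |dz| > 0.05. The function and parameter names are illustrative only and are not taken from the released code.

```python
import statistics


def photoz_metrics(z_spec, z_phot, outlier_cut=0.05):
    """Return (sigma_nmad, outlier_fraction) for paired redshift lists.

    Assumed conventions (not stated in the abstract):
      dz          = (z_phot - z_spec) / (1 + z_spec)
      sigma_nmad  = 1.4826 * median(|dz - median(dz)|)
      outlier     = |dz| > outlier_cut
    """
    if len(z_spec) != len(z_phot) or len(z_spec) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    # Normalized residual for every galaxy.
    dz = [(zp - zs) / (1.0 + zs) for zs, zp in zip(z_spec, z_phot)]

    # Robust scatter: scaled median absolute deviation about the median.
    med = statistics.median(dz)
    sigma_nmad = 1.4826 * statistics.median([abs(d - med) for d in dz])

    # Fraction of galaxies whose residual exceeds the chosen threshold.
    outlier_fraction = sum(abs(d) > outlier_cut for d in dz) / len(dz)
    return sigma_nmad, outlier_fraction
```

For example, `photoz_metrics([0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.9])` flags one of the four galaxies as an outlier. A different threshold or scatter estimator would change the numbers but not the structure of the calculation.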

preprint, 2022, arXiv

Snowmass2021 Cosmic Frontier White Paper: Cosmological Simulations for Dark Matter Physics

Over the past several decades, unexpected astronomical discoveries have been fueling a new wave of particle model building and are inspiring the next generation of ever-more-sophisticated simulations to reveal the nature of Dark Matter (DM). This coincides with the advent of new observing facilities coming online, including JWST, the Rubin Observatory, the Nancy Grace Roman Space Telescope, and CMB-S4. The time is now to build a novel simulation program to interpret observations so that we can identify novel signatures of DM microphysics across a large dynamic range of length scales and cosmic time. This white paper identifies the key elements that are needed for such a simulation program. We identify areas of growth on both the particle theory side as well as the simulation algorithm and implementation side, so that we can robustly simulate the cosmic evolution of DM for well-motivated models. We recommend that simulations include a fully calibrated and well-tested treatment of baryonic physics, and that outputs should connect with observations in the space of observables. We identify the tools and methods currently available to make predictions and the path forward for building more of these tools. A strong cosmic DM simulation program is key to translating cosmological observations to robust constraints on DM fundamental physics, and provides a connection to lab-based probes of DM physics.

preprint, 2022, arXiv

Snowmass2021 Cosmic Frontier White Paper: Prospects for obtaining Dark Matter Constraints with DESI

Despite efforts over several decades, direct-detection experiments have not yet led to the discovery of the dark matter (DM) particle. This has led to increasing interest in alternatives to the Lambda CDM (LCDM) paradigm and alternative DM scenarios (including fuzzy DM, warm DM, self-interacting DM, etc.). In many of these scenarios, DM particles cannot be detected directly and constraints on their properties can ONLY be arrived at using astrophysical observations. The Dark Energy Spectroscopic Instrument (DESI) is currently one of the most powerful instruments for wide-field surveys. The synergy of DESI with ESA's Gaia satellite and future observing facilities will yield datasets of unprecedented size and coverage that will enable constraints on DM over a wide range of physical and mass scales and across redshifts. DESI will obtain spectra of the Lyman-alpha forest out to z~5 by detecting about 1 million QSO spectra that will put constraints on clustering of the low-density intergalactic gas and DM halos at high redshift. DESI will obtain radial velocities of 10 million stars in the Milky Way (MW) and Local Group satellites enabling us to constrain their global DM distributions, as well as the DM distribution on smaller scales. The paradigm of cosmological structure formation has been extensively tested with simulations. However, the majority of simulations to date have focused on collisionless CDM. Simulations with alternatives to CDM have recently been gaining ground but are still in their infancy. While there are numerous publicly available large-box and zoom-in simulations in the LCDM framework, there are no comparable publicly available WDM, SIDM, FDM simulations. DOE support for a public simulation suite will enable a more cohesive community effort to compare observations from DESI (and other surveys) with numerical predictions and will greatly impact DM science.

preprint, 2022, arXiv

Snowmass2021: Vera C. Rubin Observatory as a Flagship Dark Matter Experiment

Establishing that Vera C. Rubin Observatory is a flagship dark matter experiment is an essential pathway toward understanding the physical nature of dark matter. In the past two decades, wide-field astronomical surveys and terrestrial laboratories have jointly created a phase transition in the ecosystem of dark matter models and probes. Going forward, any robust understanding of dark matter requires astronomical observations, which still provide the only empirical evidence for dark matter to date. We have a unique opportunity right now to create a dark matter experiment with Rubin Observatory Legacy Survey of Space and Time (LSST). This experiment will be a coordinated effort to perform dark matter research, and provide a large collaborative team of scientists with the necessary organizational and funding supports. This approach leverages existing investments in Rubin. Studies of dark matter with Rubin LSST will also guide the design of, and confirm the results from, other dark matter experiments. Supporting a collaborative team to carry out a dark matter experiment with Rubin LSST is the key to achieving the dark matter science goals that have already been identified as high priority by the high-energy physics and astronomy communities.

preprint, 2022, arXiv

Validating Synthetic Galaxy Catalogs for Dark Energy Science in the LSST Era

Large simulation efforts are required to provide synthetic galaxy catalogs for ongoing and upcoming cosmology surveys. These extragalactic catalogs are being used for many diverse purposes covering a wide range of scientific topics. In order to be useful, they must offer realistically complex information about the galaxies they contain. Hence, it is critical to implement a rigorous validation procedure that ensures that the simulated galaxy properties faithfully capture observations and delivers an assessment of the level of realism attained by the catalog. We present here a suite of validation tests that have been developed by the Rubin Observatory Legacy Survey of Space and Time (LSST) Dark Energy Science Collaboration (DESC). We discuss how the inclusion of each test is driven by the scientific targets for static ground-based dark energy science and by the availability of suitable validation data. The validation criteria that are used to assess the performance of a catalog are flexible and depend on the science goals. We illustrate the utility of this suite by showing examples for the validation of cosmoDC2, the extragalactic catalog recently released for the LSST DESC second Data Challenge.

preprint, 2021, arXiv

The LSST DESC DC2 Simulated Sky Survey

We describe the simulated sky survey underlying the second data challenge (DC2) carried out in preparation for analysis of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) by the LSST Dark Energy Science Collaboration (LSST DESC). Significant connections across multiple science domains will be a hallmark of LSST; the DC2 program represents a unique modeling effort that stresses this interconnectivity in a way that has not been attempted before. This effort encompasses a full end-to-end approach: starting from a large N-body simulation, through setting up LSST-like observations including realistic cadences, through image simulations, and finally processing with Rubin's LSST Science Pipelines. This last step ensures that we generate data products resembling those to be delivered by the Rubin Observatory as closely as is currently possible. The simulated DC2 sky survey covers six optical bands in a wide-fast-deep (WFD) area of approximately 300 deg^2 as well as a deep drilling field (DDF) of approximately 1 deg^2. We simulate 5 years of the planned 10-year survey. The DC2 sky survey has multiple purposes. First, the LSST DESC working groups can use the dataset to develop a range of DESC analysis pipelines to prepare for the advent of actual data. Second, it serves as a realistic testbed for the image processing software under development for LSST by the Rubin Observatory. In particular, simulated data provide a controlled way to investigate certain image-level systematic effects. Finally, the DC2 sky survey enables the exploration of new scientific ideas in both static and time-domain cosmology.

preprint, 2020, arXiv

Signatures of Velocity-Dependent Dark Matter Self-Interactions in Milky Way-mass Halos

We explore the impact of elastic, anisotropic, velocity-dependent dark matter (DM) self-interactions on the host halo and subhalos of Milky Way (MW)--mass systems. We consider a generic self-interacting dark matter (SIDM) model parameterized by the masses of a light mediator and the DM particle. The ratio of these masses, $w$, sets the velocity scale above which momentum transfer due to DM self-interactions becomes inefficient. We perform high-resolution zoom-in simulations of an MW-mass halo for values of $w$ that span scenarios in which self-interactions either between the host and its subhalos or only within subhalos efficiently transfer momentum, and we study the effects of self-interactions on the host halo and on the abundance, radial distribution, orbital dynamics, and density profiles of subhalos in each case. The abundance and properties of surviving subhalos are consistent with being determined primarily by subhalo--host halo interactions. In particular, subhalos on radial orbits in models with larger values of the cross section at the host halo velocity scale are more susceptible to tidal disruption owing to mass loss from ram pressure stripping caused by self-interactions with the host. This mechanism suppresses the abundance of surviving subhalos relative to collisionless DM simulations, with stronger suppression for larger values of $w$. Thus, probes of subhalo abundance around MW-mass hosts can be used to place upper limits on the self-interaction cross section at velocity scales of $\sim 200\ \rm{km\ s}^{-1}$, and combining these measurements with the orbital properties and internal dynamics of subhalos may break degeneracies among velocity-dependent SIDM models.

preprint, 2020, arXiv

The SAGA Survey. II. Building a Statistical Sample of Satellite Systems around Milky Way-like Galaxies

We present the Stage II results from the ongoing Satellites Around Galactic Analogs (SAGA) Survey. Upon completion, the SAGA Survey will spectroscopically identify satellite galaxies brighter than $ M_{r,o} = -12.3 $ around 100 Milky Way (MW) analogs at $ z \sim 0.01 $. In Stage II, we have more than quadrupled the sample size of Stage I, delivering results from 127 satellites around 36 MW analogs with an improved target selection strategy and deep photometric imaging catalogs from the Dark Energy Survey and the Legacy Surveys. We have obtained 25,372 galaxy redshifts, peaking around $ z = 0.2 $. These data significantly increase spectroscopic coverage for very low redshift objects in $ 17 < r_o < 20.75 $ around SAGA hosts, creating a unique data set that places the Local Group in a wider context. The number of confirmed satellites per system ranges from zero to nine, and correlates with host galaxy and brightest satellite luminosities. We find that the number and the luminosities of MW satellites are consistent with being drawn from the same underlying distribution as SAGA systems. The majority of confirmed SAGA satellites are star forming, and the quenched fraction increases as satellite stellar mass and projected radius from the host galaxy decrease. Overall, the satellite quenched fraction among SAGA systems is lower than that in the Local Group. We compare the luminosity functions and radial distributions of SAGA satellites with theoretical predictions based on cold dark matter simulations and an empirical galaxy-halo connection model and find that the results are broadly in agreement.

preprint, 2019, arXiv

Constraining the scatter in the galaxy-halo connection at Milky Way masses

We develop and implement two new methods for constraining the scatter in the relationship between galaxies and dark matter halos. These new techniques are sensitive to the scatter at low halo masses, making them complementary to previous constraints that are dependent on clustering amplitudes or rich galaxy groups, both of which are only sensitive to more massive halos. In both of our methods, we use a galaxy group finder to locate central galaxies in the SDSS main galaxy sample. Our first technique uses the small-scale cross-correlation of central galaxies with all lower mass galaxies. This quantity is sensitive to the satellite fraction of low-mass galaxies, which is in turn driven by the scatter between halos and galaxies. The second technique uses the kurtosis of the distribution of line-of-sight velocities between central galaxies and neighboring galaxies. This quantity is sensitive to the distribution of halo masses that contain the central galaxies at fixed stellar mass. Theoretical models are constructed using peak halo circular velocity, $V_{\rm peak}$, as our property to connect galaxies to halos. The cross-correlation technique yields a constraint of $σ[ M_\ast|V_{\rm peak}]=0.27\pm 0.05$ dex, corresponding to a scatter in $\log M_\ast$ at fixed $M_h$ of $σ[ M_\ast|M_h]=0.38\pm 0.06$ dex at $M_h=10^{11.8}$ Msun. The kurtosis technique yields $σ[ M_\ast|V_{\rm peak}]=0.30\pm0.03$, corresponding to $σ[ M_\ast|M_h]=0.34\pm 0.04$ at $M_h=10^{12.2}$ Msun. The values of $σ[ M_\ast|M_h]$ are significantly larger than the constraints at higher masses, in agreement with the results of hydrodynamic simulations. This increase is only partly due to the scatter between $V_{\rm peak}$ and $M_h$, and it represents an increase of nearly a factor of two relative to the values inferred from clustering and group studies at high masses.
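
The second technique in this abstract relies on the kurtosis of a line-of-sight velocity distribution. As a rough illustration, the sketch below computes the population excess kurtosis of a list of velocity offsets, which is zero for a Gaussian and positive when the distribution has heavier tails (as one might expect when haloes of several different masses are mixed). Which estimator the authors use is not stated in the abstract, so this simple moment-based form is an assumption, and the function name is hypothetical.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3.

    m2 and m4 are the second and fourth central moments of the sample.
    This moment-based estimator is an illustrative choice, not
    necessarily the one used in the paper.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")

    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; kurtosis is undefined")

    return m4 / m2 ** 2 - 3.0
```

In practice the input would be the velocity differences between each central galaxy and its neighbours; interlopers and measurement errors would need separate treatment that this sketch does not attempt.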

preprint, 2019, arXiv

CosmoDC2: A Synthetic Sky Catalog for Dark Energy Science with LSST

This paper introduces cosmoDC2, a large synthetic galaxy catalog designed to support precision dark energy science with the Large Synoptic Survey Telescope (LSST). CosmoDC2 is the starting point for the second data challenge (DC2) carried out by the LSST Dark Energy Science Collaboration (LSST DESC). The catalog is based on a trillion-particle, 4.225 Gpc^3 box cosmological N-body simulation, the 'Outer Rim' run. It covers 440 deg^2 of sky area to a redshift of z=3 and is complete to a magnitude depth of 28 in the r-band. Each galaxy is characterized by a multitude of properties including stellar mass, morphology, spectral energy distributions, broadband filter magnitudes, host halo information and weak lensing shear. The size and complexity of cosmoDC2 requires an efficient catalog generation methodology; our approach is based on a new hybrid technique that combines data-driven empirical approaches with semi-analytic galaxy modeling. A wide range of observation-based validation tests has been implemented to ensure that cosmoDC2 enables the science goals of the planned LSST DESC DC2 analyses. This paper also represents the official release of the cosmoDC2 data set, including an efficient reader that facilitates interaction with the data.

preprint, 2019, arXiv

How to Optimally Constrain Galaxy Assembly Bias: Supplement Projected Correlation Functions with Count-in-cells Statistics

Most models for the connection between galaxies and their haloes ignore the possibility that galaxy properties may be correlated with halo properties other than mass, a phenomenon known as galaxy assembly bias. Yet, it is known that such correlations can lead to systematic errors in the interpretation of survey data. At present, the degree to which galaxy assembly bias may be present in the real Universe, and the best strategies for constraining it remain uncertain. We study the ability of several observables to constrain galaxy assembly bias from redshift survey data using the decorated halo occupation distribution (dHOD), an empirical model of the galaxy--halo connection that incorporates assembly bias. We cover an expansive set of observables, including the projected two-point correlation function $w_{\mathrm{p}}(r_{\mathrm{p}})$, the galaxy--galaxy lensing signal $ΔΣ(r_{\mathrm{p}})$, the void probability function $\mathrm{VPF}(r)$, the distributions of counts-in-cylinders $P(N_{\mathrm{CIC}})$, and counts-in-annuli $P(N_{\mathrm{CIA}})$, and the distribution of the ratio of counts in cylinders of different sizes $P(N_2/N_5)$. We find that despite the frequent use of the combination $w_{\mathrm{p}}(r_{\mathrm{p}})+ΔΣ(r_{\mathrm{p}})$ in interpreting galaxy data, the count statistics, $P(N_{\mathrm{CIC}})$ and $P(N_{\mathrm{CIA}})$, are generally more efficient in constraining galaxy assembly bias when combined with $w_{\mathrm{p}}(r_{\mathrm{p}})$. Constraints based upon $w_{\mathrm{p}}(r_{\mathrm{p}})$ and $ΔΣ(r_{\mathrm{p}})$ share common degeneracy directions in the parameter space, while combinations of $w_{\mathrm{p}}(r_{\mathrm{p}})$ with the count statistics are more complementary. Therefore, we strongly suggest that count statistics should be used to complement the canonical observables in future studies of the galaxy--halo connection.
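
To make the counts-in-cylinders statistic concrete, here is a minimal brute-force sketch. It assumes Cartesian positions with the third coordinate as the line of sight (a plane-parallel approximation), no periodic boundaries, and a cylinder centred on each galaxy; the abstract does not specify these details, and the function and parameter names are hypothetical. The histogram of the returned counts would then approximate $P(N_{\mathrm{CIC}})$.

```python
def counts_in_cylinders(positions, radius, half_length):
    """Count companions inside a cylinder centred on each galaxy.

    positions   : list of (x, y, z) tuples, z being the line of sight
    radius      : projected radius of the cylinder
    half_length : half of the cylinder length along the line of sight

    Brute-force O(N^2) loop, intended only as an illustration.
    """
    counts = []
    for i, (xi, yi, zi) in enumerate(positions):
        n = 0
        for j, (xj, yj, zj) in enumerate(positions):
            if i == j:
                continue  # do not count the central galaxy itself
            projected_sq = (xi - xj) ** 2 + (yi - yj) ** 2
            if projected_sq <= radius ** 2 and abs(zi - zj) <= half_length:
                n += 1
        counts.append(n)
    return counts
```

A production analysis would use a spatial tree or grid to avoid the quadratic cost, and would account for survey geometry and redshift-space effects.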

preprint, 2017, arXiv

Brightest galaxies as halo centre tracers in SDSS DR7

Determining the positions of halo centres in large-scale structure surveys is crucial for many cosmological studies. A common assumption is that halo centres correspond to the location of their brightest member galaxies. In this paper, we study the dynamics of brightest galaxies with respect to other halo members in the Sloan Digital Sky Survey DR7. Specifically, we look at the line-of-sight velocity and spatial offsets between brightest galaxies and their neighbours. We compare those to detailed mock catalogues, constructed from high-resolution, dark-matter-only $N$-body simulations, in which it is assumed that satellite galaxies trace dark matter subhaloes. This allows us to place constraints on the fraction $f_{\rm BNC}$ of haloes in which the brightest galaxy is not the central. Compared to previous studies we explicitly take into account the unrelaxed state of the host haloes, velocity offsets of halo cores and correlations between $f_{\rm BNC}$ and the satellite occupation. We find that $f_{\rm BNC}$ strongly decreases with the luminosity of the brightest galaxy and increases with the mass of the host halo. Overall, in the halo mass range $10^{13} - 10^{14.5} h^{-1} M_\odot$ we find $f_{\rm BNC} \sim 30\%$, in good agreement with a previous study by Skibba et al. We discuss the implications of these findings for studies inferring the galaxy--halo connection from satellite kinematics, models of the conditional luminosity function and galaxy formation in general.

preprint, 2017, arXiv

The Galaxy Clustering Crisis in Abundance Matching

Galaxy clustering on small scales is significantly under-predicted by sub-halo abundance matching (SHAM) models that populate (sub-)haloes with galaxies based on peak halo mass, $M_{\rm peak}$. SHAM models based on the peak maximum circular velocity, $V_{\rm peak}$, have had much better success. The primary reason $M_{\rm peak}$ based models fail is the relatively low abundance of satellite galaxies produced in these models compared to those based on $V_{\rm peak}$. Despite success in predicting clustering, a simple $V_{\rm peak}$ based SHAM model results in predictions for galaxy growth that are at odds with observations. We evaluate three possible remedies that could "save" mass-based SHAM: (1) SHAM models require a significant population of "orphan" galaxies as a result of artificial disruption/merging of sub-haloes in modern high resolution dark matter simulations; (2) satellites must grow significantly after their accretion; and (3) stellar mass is significantly affected by halo assembly history. No solution is entirely satisfactory. However, regardless of the particulars, we show that popular SHAM models based on $M_{\rm peak}$ cannot be complete physical models as presented. Either $V_{\rm peak}$ truly is a better predictor of stellar mass at $z\sim 0$ and it remains to be seen how the correlation between stellar mass and $V_{\rm peak}$ comes about, or SHAM models are missing vital component(s) that significantly affect galaxy clustering.
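
The core idea of abundance matching can be sketched as a rank-ordering step: the halo with the largest value of the chosen property (for instance $V_{\rm peak}$ or $M_{\rm peak}$) receives the largest stellar mass, and so on down the list. The version below is a deliberately simplified illustration that assumes equal numbers of haloes and galaxies in the same volume and no scatter; real SHAM implementations match cumulative number densities and usually add scatter. The function name is hypothetical.

```python
def abundance_match(halo_property, stellar_masses):
    """Assign stellar masses to haloes by rank order, without scatter.

    halo_property  : one value per (sub-)halo, e.g. V_peak or M_peak
    stellar_masses : galaxy stellar masses to distribute (same length)

    Returns a list aligned with halo_property: the i-th entry is the
    stellar mass assigned to the i-th halo.
    """
    if len(halo_property) != len(stellar_masses):
        raise ValueError("need the same number of haloes and galaxies")

    # Halo indices sorted from the largest property value to the smallest.
    order = sorted(range(len(halo_property)),
                   key=lambda i: halo_property[i], reverse=True)
    # Stellar masses sorted from the largest to the smallest.
    ranked_masses = sorted(stellar_masses, reverse=True)

    assigned = [0.0] * len(halo_property)
    for rank, halo_index in enumerate(order):
        assigned[halo_index] = ranked_masses[rank]
    return assigned
```

Swapping the halo property passed in changes which haloes host the most massive galaxies, which is the kind of choice the abstract compares.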

preprint, 2017, arXiv

The Immitigable Nature of Assembly Bias: The Impact of Halo Definition on Assembly Bias

Dark matter halo clustering depends not only on halo mass, but also on other properties such as concentration and shape. This phenomenon is known broadly as assembly bias. We explore the dependence of assembly bias on halo definition, parametrized by spherical overdensity parameter, $Δ$. We summarize the strength of concentration-, shape-, and spin-dependent halo clustering as a function of halo mass and halo definition. Concentration-dependent clustering depends strongly on mass at all $Δ$. For conventional halo definitions ($Δ\sim 200\mathrm{m}-600\mathrm{m}$), concentration-dependent clustering at low mass is driven by a population of haloes that is altered through interactions with neighbouring haloes. Concentration-dependent clustering can be greatly reduced through a mass-dependent halo definition with $Δ\sim 20\mathrm{m}-40\mathrm{m}$ for haloes with $M_{200\mathrm{m}} \lesssim 10^{12}\, h^{-1}\mathrm{M}_{\odot}$. Smaller $Δ$ implies larger radii and mitigates assembly bias at low mass by subsuming altered, so-called backsplash haloes into now larger host haloes. At higher masses ($M_{200\mathrm{m}} \gtrsim 10^{13}\, h^{-1}\mathrm{M}_{\odot}$) larger overdensities, $Δ\gtrsim 600\mathrm{m}$, are necessary. Shape- and spin-dependent clustering are significant for all halo definitions that we explore and exhibit a relatively weaker mass dependence. Generally, both the strength and the sense of assembly bias depend on halo definition, varying significantly even among common definitions. We identify no halo definition that mitigates all manifestations of assembly bias. A halo definition that mitigates assembly bias based on one halo property (e.g., concentration) must be mass dependent. The halo definitions that best mitigate concentration-dependent halo clustering do not coincide with the expected average splashback radii at fixed halo mass.
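
The statement that smaller $Δ$ implies larger radii follows from the standard spherical overdensity relation $M = \frac{4}{3}\pi R^3 Δ \bar{\rho}_{\rm m}$, where $\bar{\rho}_{\rm m}$ is the mean matter density. The sketch below evaluates that relation at fixed mass. The matter density parameter of 0.3 is an assumed illustrative value, not one taken from the paper, and holding the mass fixed is itself a simplification, since the enclosed mass of a real halo also changes with $Δ$.

```python
import math

# Critical density today in units of h^2 Msun / Mpc^3 (standard value).
RHO_CRIT = 2.775e11


def so_radius(mass, delta, omega_m=0.3):
    """Spherical-overdensity radius in Mpc/h for a mass in Msun/h.

    delta is the overdensity relative to the mean matter density
    (the 'm' convention, as in 200m). omega_m = 0.3 is an assumed,
    illustrative cosmological parameter.
    """
    if mass <= 0 or delta <= 0:
        raise ValueError("mass and delta must be positive")
    rho_mean = omega_m * RHO_CRIT
    return (3.0 * mass / (4.0 * math.pi * delta * rho_mean)) ** (1.0 / 3.0)
```

At fixed mass the radius scales as $Δ^{-1/3}$, so moving from $Δ = 200\mathrm{m}$ to $Δ = 20\mathrm{m}$ enlarges the radius by a factor of $10^{1/3}$, roughly 2.15.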