Researcher profile

Sihan Yuan

Sihan Yuan contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
6 works
0 followers
1 topic
4 close collaborators

Published work

6 published items

preprint | 2022 | arXiv

Constructing high-fidelity halo merger trees in AbacusSummit

Tracking the formation and evolution of dark matter haloes is a critical aspect of any analysis of cosmological $N$-body simulations. In particular, the mass assembly of a halo and its progenitors, encapsulated in the form of its merger tree, serves as a fundamental input for constructing semi-analytic models of galaxy formation and, more generally, for building mock catalogues that emulate galaxy surveys. We present an algorithm for constructing halo merger trees from AbacusSummit, the largest suite of cosmological $N$-body simulations performed to date consisting of nearly 60 trillion particles, and which has been designed to meet the Cosmological Simulation Requirements of the Dark Energy Spectroscopic Instrument (DESI) survey. Our method tracks the cores of haloes to determine associations between objects across multiple timeslices, yielding lists of halo progenitors and descendants for the several tens of billions of haloes identified across the entire suite. We present an application of these merger trees as a means to enhance the fidelity of AbacusSummit halo catalogues by flagging and "merging" haloes deemed to exhibit non-monotonic past merger histories. We show that this cleaning technique identifies portions of the halo population that have been deblended due to choices made by the halo finder, but which could have feasibly been part of larger aggregate systems. We demonstrate that by cleaning halo catalogues in this post-processing step, we remove potentially unphysical features in the default halo catalogues, leaving behind a more robust halo population that can be used to create highly-accurate mock galaxy realisations from AbacusSummit.

preprint | 2022 | arXiv

Illustrating galaxy-halo connection in the DESI era with IllustrisTNG

We employ the hydrodynamical simulation IllustrisTNG to inform the galaxy-halo connection of the Luminous Red Galaxy (LRG) and Emission Line Galaxy (ELG) samples of the Dark Energy Spectroscopic Instrument (DESI) survey at redshift z ~ 0.8. Specifically, we model the galaxy colors of IllustrisTNG and apply sliding DESI color-magnitude cuts, matching the DESI target densities. We study the halo occupation distribution model (HOD) of the selected samples by matching them to their corresponding dark matter halos in the IllustrisTNG dark matter run. We find the HOD of both the LRG and ELG samples to be consistent with their respective baseline models, but we also find important deviations from common assumptions about the satellite distribution, velocity bias, and galaxy secondary biases. We identify strong evidence for concentration-based and environment-based occupational variance in both samples, an effect known as "galaxy assembly bias". The central and satellite galaxies have distinct dependencies on secondary halo properties, showing that centrals and satellites have distinct evolutionary trajectories and should be modelled separately. These results serve to inform the necessary complexities in modeling galaxy-halo connection for DESI analyses and also prepare for building high-fidelity mock galaxies. Finally, we present a shuffling-based clustering analysis that reveals a 10-15% excess in the LRG clustering of modest statistical significance due to secondary galaxy biases. We also find a similar excess signature for the ELGs, but with much lower statistical significance. When a larger hydrodynamical simulation volume becomes available, we expect our analysis pipeline to pinpoint the exact sources of such excess clustering signatures.
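
The shuffling-based analysis mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline; it assumes the commonly used procedure of randomly reassigning galaxy occupations among halos of similar mass, which keeps the mass dependence of the occupation and erases any dependence on secondary halo properties. The function name `shuffle_occupations`, the bin width, and the input lists are hypothetical.

```python
import math
import random
from collections import defaultdict


def shuffle_occupations(halo_masses, galaxy_counts, bin_width_dex=0.5, seed=0):
    """Shuffle galaxy counts among halos that fall in the same log-mass bin.

    Illustrative only: names, bin width, and inputs are assumptions.
    """
    rng = random.Random(seed)

    # Group halo indices by logarithmic mass bin.
    bins = defaultdict(list)
    for index, mass in enumerate(halo_masses):
        bins[math.floor(math.log10(mass) / bin_width_dex)].append(index)

    # Permute the occupation numbers within each bin.
    shuffled = list(galaxy_counts)
    for indices in bins.values():
        counts = [galaxy_counts[i] for i in indices]
        rng.shuffle(counts)
        for i, count in zip(indices, counts):
            shuffled[i] = count
    return shuffled
```

Comparing the clustering of the original and shuffled catalogues then isolates the contribution of secondary biases, since the two share the same occupation as a function of halo mass.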

preprint | 2022 | arXiv

Stringent $σ_8$ constraints from small-scale galaxy clustering using a hybrid MCMC+emulator framework

We present a novel simulation-based hybrid emulator approach that maximally derives cosmological and Halo Occupation Distribution (HOD) information from non-linear galaxy clustering, with sufficient precision for DESI Year 1 (Y1) analysis. Our hybrid approach first samples the HOD space on a fixed cosmological simulation grid to constrain the high-likelihood region of cosmology+HOD parameter space, and then constructs the emulator within this constrained region. This approach significantly reduces the parameter volume emulated over, thus achieving much smaller emulator errors with a fixed number of training points. We demonstrate that this, combined with state-of-the-art simulations, results in tight emulator errors comparable to expected DESI Y1 LRG sample variance. We leverage the new AbacusSummit simulations and apply our hybrid approach to CMASS non-linear galaxy clustering data. We infer constraints on $σ_8 = 0.762\pm0.024$ and $fσ_8 (z_{eff} = 0.52) = 0.444\pm0.016$, the tightest among contemporary galaxy clustering studies. We also demonstrate that our $fσ_8$ constraint is robust against secondary biases and other HOD model choices, a critical first step towards showcasing the robust cosmology information accessible in non-linear scales. We speculate that the additional statistical power of DESI Y1 should tighten the growth rate constraints by at least another 50-60%, significantly elucidating any potential tension with Planck. We also address the "lensing is low" tension, where we find that the combined effect of a lower $fσ_8$ and environment-based bias lowers the predicted lensing signal by 15%, accounting for approximately 50% of the discrepancy between the lensing measurement and clustering-based predictions.

preprint | 2021 | arXiv

Evidence for galaxy assembly bias in BOSS CMASS redshift-space galaxy correlation function

Building accurate and flexible galaxy-halo connection models is crucial in modeling galaxy clustering on non-linear scales. Recent studies have found that halo concentration by itself cannot capture the full galaxy assembly bias effect and that the local environment of the halo can be an excellent indicator of galaxy assembly bias. In this paper, we propose an extended halo occupation distribution model (HOD) that includes both a concentration-based assembly bias term and an environment-based assembly bias term. We use this model to achieve a good fit ($\chi^2/\mathrm{DoF} = 1.35$) on the 2D redshift-space 2-point correlation function (2PCF) of the Baryon Oscillation Spectroscopic Survey (BOSS) CMASS galaxy sample. We find that the inclusion of both assembly bias terms is strongly favored by the data and the standard 5-parameter HOD is strongly rejected. More interestingly, the redshift-space 2PCF drives the assembly bias parameters in a way that preferentially assigns galaxies to lower mass halos. This results in galaxy-galaxy lensing predictions that are within $1σ$ agreement with the observation, alleviating the perceived tension between galaxy clustering and lensing. We also showcase a consistent 3-5$σ$ preference for a positive environment-based assembly bias that persists over variations in the fit. We speculate that the environmental dependence might be driven by underlying processes such as mergers and feedback, but might also be indicative of a larger halo boundary such as the splashback radius. Regardless, this work highlights the importance of building flexible galaxy-halo connection models and demonstrates the extra constraining power of the redshift-space 2PCF.
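
For readers unfamiliar with the "standard 5-parameter HOD" this abstract refers to, the sketch below shows one common parameterization of the mean central and satellite occupation as a function of halo mass. It is an assumption that this matches the exact form used in the paper; the assembly bias extensions are not included. The function names are illustrative, and the five parameters are taken to be a central mass cutoff, a cutoff width, a satellite mass scale, a satellite cutoff factor, and a power-law slope.

```python
import math


def mean_centrals(mass, log_m_cut, sigma):
    """Mean number of central galaxies in a halo of the given mass.

    A smoothed step in log10(mass): 0.5 at the cutoff mass, approaching 1 above it.
    """
    return 0.5 * math.erfc(
        (log_m_cut - math.log10(mass)) / (math.sqrt(2.0) * sigma)
    )


def mean_satellites(mass, log_m_cut, sigma, log_m1, kappa, alpha):
    """Mean number of satellite galaxies: a power law above kappa * M_cut,
    modulated by the central occupation."""
    m_cut = 10.0 ** log_m_cut
    m1 = 10.0 ** log_m1
    excess = mass - kappa * m_cut
    if excess <= 0.0:
        return 0.0  # halos below the satellite cutoff host no satellites
    return mean_centrals(mass, log_m_cut, sigma) * (excess / m1) ** alpha
```

With these definitions a halo exactly at the cutoff mass hosts a central galaxy half of the time, and the satellite count grows as a power law in mass for halos well above the cutoff. Extended models of the kind described in the abstract add further terms that make these occupations depend on halo concentration and local environment.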

preprint | 2020 | arXiv

Can Assembly Bias Explain the Lensing Amplitude of the BOSS CMASS Sample in a Planck Cosmology?

In this paper, we investigate whether galaxy assembly bias can reconcile the 20-40% disagreement between the observed galaxy projected clustering signal and the galaxy-galaxy lensing signal in the BOSS CMASS galaxy sample reported in Leauthaud et al. (2017). We use the suite of AbacusCosmos Lambda-CDM simulations at Planck best-fit cosmology and two flexible implementations of extended halo occupation distribution (HOD) models that incorporate galaxy assembly bias to build forward models and produce joint fits of the observed galaxy clustering signal and the galaxy-galaxy lensing signal. We find that our models using the standard HODs without any assembly bias generalizations continue to show a 20-40% over-prediction of the observed galaxy-galaxy lensing signal. We find that our implementations of galaxy assembly bias do not reconcile the two measurements at Planck best-fit cosmology. In fact, despite incorporating galaxy assembly bias, the satellite distribution parameter, and the satellite velocity bias parameter into our extended HOD model, our fits still strongly suggest a 31-34% discrepancy between the observed projected clustering and galaxy-galaxy lensing measurements. It remains to be seen whether a combination of other galaxy assembly bias models, alternative cosmological parameters, or baryonic effects can explain the amplitude difference between the two signals.

preprint | 2019 | arXiv

A Hybrid Deep Learning Approach to Cosmological Constraints From Galaxy Redshift Surveys

We present a deep machine learning (ML)-based technique for accurately determining $σ_8$ and $Ω_m$ from mock 3D galaxy surveys. The mock surveys are built from the AbacusCosmos suite of $N$-body simulations, which comprises 40 cosmological volume simulations spanning a range of cosmological models, and we account for uncertainties in galaxy formation scenarios through the use of generalized halo occupation distributions (HODs). We explore a trio of ML models: a 3D convolutional neural network (CNN), a power-spectrum-based fully connected network, and a hybrid approach that merges the two to combine physically motivated summary statistics with flexible CNNs. We describe best practices for training a deep model on a suite of matched-phase simulations and we test our model on a completely independent sample that uses previously unseen initial conditions, cosmological parameters, and HOD parameters. Despite the fact that the mock observations are quite small ($\sim0.07h^{-3}\,\mathrm{Gpc}^3$) and the training data span a large parameter space (6 cosmological and 6 HOD parameters), the CNN and hybrid CNN can constrain $σ_8$ and $Ω_m$ to $\sim3\%$ and $\sim4\%$, respectively.