Source author record

Lehman H. Garrison

Lehman H. Garrison appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

astro-ph.CO; astro-ph.GA; astro-ph.IM; Machine Learning; Artificial Intelligence; astro-ph.EP; Distributed, Parallel, and Cluster Computing; hep-ph; physics.comp-ph

Catalog footprint

What is connected

12 works

9 topics

4 close collaborators


preprint 2022 arXiv

Accuracy of power spectra in dissipationless cosmological simulations

We exploit a suite of large $N$-body simulations (up to $N=4096^3$) performed with Abacus, of scale-free models with a range of spectral indices $n$, to better understand and quantify convergence of the matter power spectrum. Using self-similarity to identify converged regions, we show that the maximal wavenumber resolved at a given level of accuracy increases monotonically as a function of time. At the 1\% level it starts at early times from a fraction of $k_Λ$, the Nyquist wavenumber of the initial grid, and reaches at most, if the force softening is sufficiently small, $\sim 2-3 k_Λ$ at the very latest times we evolve to. At the $5\%$ level, accuracy extends up to wavenumbers of order $5k_Λ$ at late times. Expressed as a suitable function of the scale-factor, accuracy shows a very simple $n$-dependence, allowing an extrapolation to place conservative bounds on the accuracy of $N$-body simulations of non-scale-free models like LCDM. We note that deviations due to discretization in the converged range are not well modelled by shot noise, and subtracting it in fact degrades accuracy. Quantitatively our findings are broadly in line with the conservative assumptions about resolution adopted by recent studies using large cosmological simulations (e.g. Euclid Flagship) aiming to constrain the mildly non-linear regime. On the other hand, we remark that conclusions about small scale clustering (e.g. concerning the validity of stable clustering) obtained using PS data at wavenumbers larger than a few $k_Λ$ may need revision in light of our convergence analysis.

preprint 2022 arXiv

Constructing high-fidelity halo merger trees in AbacusSummit

Tracking the formation and evolution of dark matter haloes is a critical aspect of any analysis of cosmological $N$-body simulations. In particular, the mass assembly of a halo and its progenitors, encapsulated in the form of its merger tree, serves as a fundamental input for constructing semi-analytic models of galaxy formation and, more generally, for building mock catalogues that emulate galaxy surveys. We present an algorithm for constructing halo merger trees from AbacusSummit, the largest suite of cosmological $N$-body simulations performed to date consisting of nearly 60 trillion particles, and which has been designed to meet the Cosmological Simulation Requirements of the Dark Energy Spectroscopic Instrument (DESI) survey. Our method tracks the cores of haloes to determine associations between objects across multiple timeslices, yielding lists of halo progenitors and descendants for the several tens of billions of haloes identified across the entire suite. We present an application of these merger trees as a means to enhance the fidelity of AbacusSummit halo catalogues by flagging and "merging" haloes deemed to exhibit non-monotonic past merger histories. We show that this cleaning technique identifies portions of the halo population that have been deblended due to choices made by the halo finder, but which could have feasibly been part of larger aggregate systems. We demonstrate that by cleaning halo catalogues in this post-processing step, we remove potentially unphysical features in the default halo catalogues, leaving behind a more robust halo population that can be used to create highly-accurate mock galaxy realisations from AbacusSummit.
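The idea of associating haloes across timeslices by tracking their cores can be illustrated with a toy sketch. This is not the AbacusSummit algorithm: the representation of a halo as a set of core particle IDs, the function names, and the `min_fraction` threshold are all assumptions made for illustration.

```python
def link_descendants(earlier, later, min_fraction=0.5):
    """Map each earlier halo to the later halo holding most of its core.

    `earlier` maps halo ID -> set of core particle IDs (hypothetical layout).
    `later` maps halo ID -> set of all member particle IDs.
    A link is kept only if at least `min_fraction` of the core is found.
    """
    # Invert the later catalogue: which halo owns each particle?
    owner = {}
    for halo_id, particles in later.items():
        for pid in particles:
            owner[pid] = halo_id

    links = {}
    for halo_id, core in earlier.items():
        tally = {}
        for pid in core:
            if pid in owner:
                tally[owner[pid]] = tally.get(owner[pid], 0) + 1
        if not tally:
            links[halo_id] = None  # core particles not found in any later halo
            continue
        best = max(tally, key=tally.get)
        links[halo_id] = best if tally[best] >= min_fraction * len(core) else None
    return links


def progenitors(links):
    """Invert descendant links into progenitor lists per later halo."""
    result = {}
    for early_id, late_id in links.items():
        if late_id is not None:
            result.setdefault(late_id, []).append(early_id)
    return {late_id: sorted(ids) for late_id, ids in result.items()}
```

Applying `progenitors(link_descendants(earlier, later))` to consecutive pairs of timeslices would give the progenitor and descendant lists from which a tree can be assembled.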

preprint 2022 arXiv

Machine Learning and Cosmology

Methods based on machine learning have recently made substantial inroads in many corners of cosmology. Through this process, new computational tools, new perspectives on data collection, model development, analysis, and discovery, as well as new communities and educational pathways have emerged. Despite rapid progress, substantial potential at the intersection of cosmology and machine learning remains untapped. In this white paper, we summarize current and ongoing developments relating to the application of machine learning within cosmology and provide a set of recommendations aimed at maximizing the scientific impact of these burgeoning tools over the coming decade through both technical development as well as the fostering of emerging communities.

preprint 2022 arXiv

Robust field-level inference with dark matter halos

We train graph neural networks on halo catalogues from Gadget N-body simulations to perform field-level likelihood-free inference of cosmological parameters. The catalogues contain $\lesssim$5,000 halos with masses $\gtrsim 10^{10}~h^{-1}M_\odot$ in a periodic volume of $(25~h^{-1}{\rm Mpc})^3$; every halo in the catalogue is characterized by several properties such as position, mass, velocity, concentration, and maximum circular velocity. Our models, built to be permutationally, translationally, and rotationally invariant, do not impose a minimum scale on which to extract information and are able to infer the values of $Ω_{\rm m}$ and $σ_8$ with a mean relative error of $\sim6\%$, when using positions plus velocities and positions plus masses, respectively. More importantly, we find that our models are very robust: they can infer the value of $Ω_{\rm m}$ and $σ_8$ when tested using halo catalogues from thousands of N-body simulations run with five different N-body codes: Abacus, CUBEP$^3$M, Enzo, PKDGrav3, and Ramses. Surprisingly, the model trained to infer $Ω_{\rm m}$ also works when tested on thousands of state-of-the-art CAMELS hydrodynamic simulations run with four different codes and subgrid physics implementations. Using halo properties such as concentration and maximum circular velocity allows our models to extract more information, at the expense of breaking the robustness of the models. This may happen because the different N-body codes are not converged on the relevant scales corresponding to these parameters.

preprint 2022 arXiv

Stringent $σ_8$ constraints from small-scale galaxy clustering using a hybrid MCMC+emulator framework

We present a novel simulation-based hybrid emulator approach that maximally derives cosmological and Halo Occupation Distribution (HOD) information from non-linear galaxy clustering, with sufficient precision for DESI Year 1 (Y1) analysis. Our hybrid approach first samples the HOD space on a fixed cosmological simulation grid to constrain the high-likelihood region of cosmology+HOD parameter space, and then constructs the emulator within this constrained region. This approach significantly reduces the parameter volume emulated over, thus achieving much smaller emulator errors with a fixed number of training points. We demonstrate that this, combined with state-of-the-art simulations, results in tight emulator errors comparable to expected DESI Y1 LRG sample variance. We leverage the new AbacusSummit simulations and apply our hybrid approach to CMASS non-linear galaxy clustering data. We infer constraints on $σ_8 = 0.762\pm0.024$ and $fσ_8 (z_{eff} = 0.52) = 0.444\pm0.016$, the tightest among contemporary galaxy clustering studies. We also demonstrate that our $fσ_8$ constraint is robust against secondary biases and other HOD model choices, a critical first step towards showcasing the robust cosmology information accessible in non-linear scales. We speculate that the additional statistical power of DESI Y1 should tighten the growth rate constraints by at least another 50-60%, significantly elucidating any potential tension with Planck. We also address the "lensing is low" tension, where we find that the combined effect of a lower $fσ_8$ and environment-based bias lowers the predicted lensing signal by 15%, accounting for approximately 50% of the discrepancy between the lensing measurement and clustering-based predictions.

preprint 2022 arXiv

The DESI $N$-body Simulation Project -- II. Suppressing sample variance with fast simulations

The Dark Energy Spectroscopic Instrument (DESI) will construct a large and precise three-dimensional map of our Universe. The survey effective volume reaches $\sim20\,h^{-3}\,\mathrm{Gpc}^3$. It is a great challenge to prepare high-resolution simulations with a much larger volume for validating the DESI analysis pipelines. AbacusSummit is a suite of high-resolution dark-matter-only simulations designed for this purpose, with $200\,h^{-3}\,\mathrm{Gpc}^3$ (10 times DESI volume) for the base cosmology. However, further effort is needed to provide a more precise analysis of the data and to cover other cosmologies as well. Recently, the CARPool method was proposed to use paired accurate and approximate simulations to achieve high statistical precision with a limited number of high-resolution simulations. Relying on this technique, we propose to use fast quasi-$N$-body solvers combined with accurate simulations to produce accurate summary statistics. This enables us to obtain 100 times smaller variance than the expected DESI statistical variance at the scales we are interested in, e.g. $k < 0.3\,h\,\mathrm{Mpc}^{-1}$ for the halo power spectrum. In addition, it can significantly suppress the sample variance of the halo bispectrum. We further generalize the method for other cosmologies with only one realization in the AbacusSummit suite to extend the effective volume $\sim 20$ times. In summary, our proposed strategy of combining high-fidelity simulations with fast approximate gravity solvers and a series of variance suppression techniques sets the path for a robust cosmological analysis of galaxy survey data.
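Pairing accurate and approximate simulations to reduce variance can be illustrated with a generic control-variate estimator. The sketch below is a toy under stated assumptions, not the paper's pipeline: the "simulations" are scalar Gaussian draws, the pairing is modelled as a shared random term, and the names `carpool_mean` and `toy_experiment` are hypothetical.

```python
import random
import statistics


def carpool_mean(accurate, surrogate, surrogate_mean):
    """Control-variate estimate of the mean of `accurate`.

    `accurate` and `surrogate` are paired samples (same random phases);
    `surrogate_mean` is the surrogate mean estimated from many cheap runs.
    """
    mean_y = statistics.fmean(accurate)
    mean_c = statistics.fmean(surrogate)
    cov = sum((y - mean_y) * (c - mean_c) for y, c in zip(accurate, surrogate))
    var = sum((c - mean_c) ** 2 for c in surrogate)
    beta = cov / var  # regression coefficient estimated from the pairs
    return mean_y - beta * (mean_c - surrogate_mean)


def toy_experiment(n_pairs=10, n_cheap=2000, n_trials=300, seed=1):
    """Compare the scatter of the plain mean and the control-variate mean."""
    rng = random.Random(seed)
    plain, reduced = [], []
    for _ in range(n_trials):
        # Shared term stands in for matched initial conditions (assumption).
        shared = [rng.gauss(0.0, 1.0) for _ in range(n_pairs)]
        accurate = [1.0 + s + rng.gauss(0.0, 0.1) for s in shared]
        surrogate = [0.5 + s for s in shared]  # biased but correlated
        cheap = [0.5 + rng.gauss(0.0, 1.0) for _ in range(n_cheap)]
        plain.append(statistics.fmean(accurate))
        reduced.append(carpool_mean(accurate, surrogate, statistics.fmean(cheap)))
    return statistics.variance(plain), statistics.variance(reduced)
```

In this toy the surrogate is biased, yet the estimator stays centred on the accurate mean because only the surrogate's fluctuation about its own well-measured mean is subtracted.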

preprint 2021 arXiv

Checkpointing with cp: the POSIX Shared Memory System

We present the checkpointing scheme of Abacus, an $N$-body simulation code that allocates all persistent state in POSIX shared memory, or ramdisk. Checkpointing becomes as simple as copying files from ramdisk to external storage. The main simulation executable is invoked once per time step, memory mapping the input state, computing the output state directly into ramdisk, and unmapping the input state. The main executable remains unaware of the concept of checkpointing, with the top-level driver code launching a file-system copy between executable invocations when a checkpoint is needed. Since the only information flow is through files on ramdisk, the checkpoint must be correct so long as the simulation is correct. However, we find that with multi-GB of state, there is a significant overhead to unmapping the shared memory. This can be partially mitigated with multithreading, but ultimately, we do not recommend shared memory for use with a large state.
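The control flow described in this abstract (one executable invocation per time step, with the driver copying files between invocations) can be sketched as follows. This is a minimal illustration, not Abacus code: temporary directories stand in for the ramdisk and external storage, the state is a small array of doubles, and the names `step`, `run`, and `demo` are hypothetical.

```python
import mmap
import shutil
import struct
import tempfile
from pathlib import Path


def step(state_dir: Path) -> None:
    """Stand-in for one executable invocation: map input, write output, unmap."""
    src = state_dir / "state.bin"
    dst = state_dir / "state.next"
    with open(src, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        values = struct.unpack(f"{len(m) // 8}d", m[:])
    # Toy "physics": advance every value by one. The input is unmapped by now.
    dst.write_bytes(struct.pack(f"{len(values)}d", *(v + 1.0 for v in values)))
    dst.replace(src)  # the output state becomes the next step's input


def run(state_dir: Path, checkpoint_dir: Path, n_steps: int, checkpoint_every: int) -> None:
    """Driver: the step code never knows about checkpoints."""
    for i in range(1, n_steps + 1):
        step(state_dir)
        if i % checkpoint_every == 0:
            # A checkpoint is just a file-system copy between invocations.
            shutil.copytree(state_dir, checkpoint_dir / f"step{i:04d}")


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        ramdisk = Path(tmp) / "ramdisk"   # would be a ramdisk mount in practice
        storage = Path(tmp) / "storage"   # would be external storage
        ramdisk.mkdir()
        storage.mkdir()
        (ramdisk / "state.bin").write_bytes(struct.pack("3d", 0.0, 1.0, 2.0))
        run(ramdisk, storage, n_steps=4, checkpoint_every=2)
        saved = sorted(p.name for p in storage.iterdir())
        final = struct.unpack("3d", (ramdisk / "state.bin").read_bytes())
        ckpt = struct.unpack("3d", (storage / "step0002" / "state.bin").read_bytes())
    return saved, final, ckpt
```

Restarting from a checkpoint in this sketch would amount to copying a saved directory back into the state directory before calling `run` again.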

preprint 2020 arXiv

Measured Lightcurves and Rotational Periods of 3122 Florence, 3830 Trelleborg, and (131077) 2000 YH105

We determined the rotational periods of 3122 Florence, 3830 Trelleborg, and (131077) 2000 YH105 with the Harvard Clay Telescope and KeplerCam at the Fred L. Whipple Observatory. We found the rotational periods to be 2.3580 $\pm$ 0.0015 h, 17.059 $\pm$ 0.017 h, and 1.813 $\pm$ 0.00003 h, respectively. Our measurement of 3122 Florence's period agrees with Warner (2016), who reported 2.3580 $\pm$ 0.0002 h.

preprint 2019 arXiv

A Hybrid Deep Learning Approach to Cosmological Constraints From Galaxy Redshift Surveys

We present a deep machine learning (ML)-based technique for accurately determining $σ_8$ and $Ω_m$ from mock 3D galaxy surveys. The mock surveys are built from the AbacusCosmos suite of $N$-body simulations, which comprises 40 cosmological volume simulations spanning a range of cosmological models, and we account for uncertainties in galaxy formation scenarios through the use of generalized halo occupation distributions (HODs). We explore a trio of ML models: a 3D convolutional neural network (CNN), a power-spectrum-based fully connected network, and a hybrid approach that merges the two to combine physically motivated summary statistics with flexible CNNs. We describe best practices for training a deep model on a suite of matched-phase simulations and we test our model on a completely independent sample that uses previously unseen initial conditions, cosmological parameters, and HOD parameters. Despite the fact that the mock observations are quite small ($\sim0.07h^{-3}\,\mathrm{Gpc}^3$) and the training data span a large parameter space (6 cosmological and 6 HOD parameters), the CNN and hybrid CNN can constrain $σ_8$ and $Ω_m$ to $\sim3\%$ and $\sim4\%$, respectively.

preprint 2019 arXiv

Corrfunc --- A Suite of Blazing Fast Correlation Functions on the CPU

The two-point correlation function (2PCF) is the most widely used tool for quantifying the spatial distribution of galaxies. Since the distribution of galaxies is determined by galaxy formation physics as well as the underlying cosmology, fitting an observed correlation function yields valuable insights into both. The calculation for a 2PCF involves computing pair-wise separations and consequently, the computing time scales quadratically with the number of galaxies. The next-generation galaxy surveys are slated to observe many millions of galaxies, and computing the 2PCF for such surveys would be prohibitively time-consuming. Additionally, modern modelling techniques require the 2PCF to be calculated thousands of times on simulated galaxy catalogues of at least equal size to the data and would be completely unfeasible for the next generation surveys. Thus, calculating the 2PCF forms a substantial bottleneck in improving our understanding of the fundamental physics of the universe, and we need high-performance software to compute the correlation function. In this paper, we present Corrfunc --- a suite of highly optimised, OpenMP parallel clustering codes. The improved performance of Corrfunc arises from both efficient algorithms as well as software design that suits the underlying hardware of modern CPUs. Corrfunc can compute a wide range of 2-D and 3-D correlation functions in either simulation (Cartesian) space or on-sky coordinates. Corrfunc runs efficiently in both single- and multi-threaded modes and can compute a typical 2-point projected correlation function ($w_p(r_p)$) for ~1 million galaxies within a few seconds on a single thread. Corrfunc is designed to be both user-friendly and fast and is publicly available at https://github.com/manodeep/Corrfunc.
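The quadratic scaling mentioned in this abstract is easiest to see in a naive pair counter, which visits every one of the N(N-1)/2 pairs. The sketch below is a pure-Python baseline for illustration only; it does not use Corrfunc or reflect its algorithms, and the function name and the periodic-box convention are assumptions.

```python
import math
from itertools import combinations


def pair_counts(points, bin_edges, box_size):
    """Count pairs per separation bin in a periodic cubic box (brute force)."""
    counts = [0] * (len(bin_edges) - 1)
    for p, q in combinations(points, 2):  # N(N-1)/2 pairs: quadratic cost
        d2 = 0.0
        for a, b in zip(p, q):
            delta = abs(a - b)
            delta = min(delta, box_size - delta)  # minimum-image separation
            d2 += delta * delta
        r = math.sqrt(d2)
        for i in range(len(counts)):
            if bin_edges[i] <= r < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts
```

For example, `pair_counts([(0, 0, 0), (1, 0, 0), (9.5, 0, 0)], [0.0, 1.2, 2.0], 10.0)` places two pairs in the first bin and one in the second, because the point at 9.5 wraps around to lie 0.5 from the origin.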

preprint 2019 arXiv

Cosmology with galaxy-galaxy lensing on non-perturbative scales: Emulation method and application to BOSS LOWZ

We describe our nonlinear emulation (i.e., interpolation) framework that combines the halo occupation distribution (HOD) galaxy bias model with $N$-body simulations of nonlinear structure formation, designed to accurately predict the projected clustering and galaxy-galaxy lensing signals from luminous red galaxies (LRGs) in the redshift range $0.16 < z < 0.36$ on comoving scales $0.6 < r_p < 30$ $h^{-1}\,\mathrm{Mpc}$. The interpolation accuracy is $\lesssim 1-2$ per cent across the entire physically plausible range of parameters for all scales considered. We correctly recover the true value of the cosmological parameter $S_8 = ({σ_8}/{0.8228}) ({Ω_{\text{m}}}/{0.3107})^{0.6}$ from mock measurements produced via subhalo abundance matching (SHAM)-based lightcones designed to approximately match the properties of the SDSS LOWZ galaxy sample. Applying our model to Baryon Oscillation Spectroscopic Survey (BOSS) Data Release 14 (DR14) LOWZ galaxy clustering and galaxy-shear cross-correlation measurements made with Sloan Digital Sky Survey (SDSS) Data Release 8 (DR8) imaging, we perform a prototype cosmological analysis marginalizing over $w$CDM cosmological parameters and galaxy HOD parameters. We obtain a 4.4 per cent measurement of $S_8 = 0.847 \pm 0.037$, in $3.5σ$ tension with the Planck cosmological results of $1.00 \pm 0.02$. We discuss the possibility of underestimated systematic uncertainties or astrophysical effects that could explain this discrepancy.
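The arithmetic behind the quoted numbers can be checked in a few lines. The $S_8$ definition is taken from the abstract; the tension estimate below assumes independent Gaussian errors added in quadrature, which is a simplification and not necessarily how the authors computed their figure. The function names are hypothetical.

```python
import math


def s8(sigma8, omega_m):
    """S_8 as defined in the abstract, normalised to 1 at the fiducial values."""
    return (sigma8 / 0.8228) * (omega_m / 0.3107) ** 0.6


def fractional_error(value, sigma):
    """Relative uncertainty of a measurement."""
    return sigma / value


def gaussian_tension(a, sigma_a, b, sigma_b):
    """Difference in units of the quadrature-combined error (assumption)."""
    return abs(a - b) / math.hypot(sigma_a, sigma_b)


precision = 100 * fractional_error(0.847, 0.037)      # about 4.4 per cent
tension = gaussian_tension(0.847, 0.037, 1.00, 0.02)  # about 3.6 sigma
```

The fractional error reproduces the quoted 4.4 per cent, and the simple quadrature estimate of about 3.6 is close to the quoted $3.5σ$.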

preprint 2016 arXiv

Improving Initial Conditions for Cosmological $N$-Body Simulations

In cosmological $N$-body simulations, the representation of dark matter as discrete "macroparticles" suppresses the growth of structure, such that simulations no longer reproduce linear theory on small scales near $k_{\rm Nyquist}$. Marcos et al. demonstrate that this is due to sparse sampling of modes near $k_{\rm Nyquist}$ and that the often-assumed continuum growing modes are not proper growing modes of the particle system. We develop initial conditions that respect the particle linear theory growing modes and then rescale the mode amplitudes to account for growth suppression. These ICs also allow us to take advantage of our very accurate $N$-body code Abacus to implement 2LPT in configuration space. The combination of 2LPT and rescaling improves the accuracy of the late-time power spectra, halo mass functions, and halo clustering. In particular, we achieve 1% accuracy in the power spectrum down to $k_{\rm Nyquist}$, versus $k_{\rm Nyquist}/4$ without rescaling or $k_{\rm Nyquist}/13$ without 2LPT, relative to an oversampled reference simulation. We anticipate that our 2LPT will be useful for large simulations where FFTs are expensive and that rescaling will be useful for suites of medium-resolution simulations used in cosmic emulators and galaxy survey mock catalogs. Code to generate initial conditions is available at https://github.com/lgarrison/zeldovich-PLT
