Researcher profile

Salman Habib

Salman Habib contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
17 works
0 followers
13 topics
4 close collaborators

Published work

17 published items

preprint · 2026 · arXiv

Emulator-Based Inference of Cosmological Subgrid Models

The formation of structure in the Universe at large scales is dominated by gravity, with baryonic physics becoming significant at $\sim{\rm Mpc}$ scales. To capture the impact of baryonic physics, cosmological simulations must model gas dynamics and a host of relevant astrophysical processes. A recent extension of the Hardware/Hybrid Accelerated Cosmology Code (HACC) couples its gravity solver with a modern smoothed particle hydrodynamics method. This extension incorporates sub-resolution models for chemical enrichment, black hole and star formation, AGN kinetic and thermal feedback, supernova-driven feedback, galactic winds, and metal-line cooling. We present an inference framework based on high-fidelity emulators to aid in model calibration against observational targets, e.g., the galaxy stellar mass function, radial gas density profiles, and the cluster gas fraction. The emulators are trained on simulation suites comprising 64 boxes with side-length $128\,h^{-1}$Mpc and 16 boxes with side-length $256\,h^{-1}$Mpc with $2\times 512^3$ and $2\times 1024^3$ particles, respectively. Our analysis reveals two distinct AGN kinetic feedback modes -- a low-feedback mode yielding strong agreement with the observed radial gas density profiles of massive X-ray clusters, and a high-feedback mode providing a better fit to cluster gas fraction data, but systematically underestimating gas densities in inner regions.

preprint · 2022 · arXiv

Cosmo-Paleontology: Statistics of Fossil Groups in a Gravity-Only Simulation

We present a detailed study of fossil group candidates identified in "Last Journey", a gravity-only cosmological simulation covering a $(3.4\, h^{-1}\mathrm{Gpc})^3$ volume with a particle mass resolution of $m_p \approx 2.7 \times 10^9\, h^{-1}\mathrm{M}_\odot$. The simulation allows us to simultaneously capture a large number of group-scale halos and to resolve their internal structure. Historically, fossil groups have been characterized by high X-ray brightness and a large luminosity gap between the brightest and second brightest galaxy in the group. In order to identify candidate halos that host fossil groups, we use halo merger tree information to introduce two parameters: a luminous merger mass threshold ($M_\mathrm{LM}$) and a last luminous merger redshift cut-off ($z_\mathrm{LLM}$). The final parameter choices are informed by observational data and allow us to identify a plausible fossil group sample from the simulation. The candidate halos are characterized by reduced substructure and are therefore less likely to host bright galaxies beyond the brightest central galaxy. We carry out detailed studies of this sample, including analysis of halo properties and clustering. We find that our simple assumptions lead to fossil group candidates that form early, have higher concentrations, and are more relaxed compared to other halos in the same mass range.

preprint · 2022 · arXiv

Farpoint: A High-Resolution Cosmology Simulation at the Gigaparsec Scale

In this paper we introduce the Farpoint simulation, the latest member of the Hardware/Hybrid Accelerated Cosmology Code (HACC) gravity-only simulation family. The domain covers a volume of (1000$h^{-1}$Mpc)$^3$ and evolves close to two trillion particles, corresponding to a mass resolution of $m_p\sim 4.6\cdot 10^7 h^{-1}$M$_\odot$. These specifications enable comprehensive investigations of the galaxy-halo connection, capturing halos down to small masses. Further, the large volume resolves scales typical of modern surveys with good statistical coverage of high mass halos. The simulation was carried out on the GPU-accelerated system Summit, one of the fastest supercomputers currently available. We provide specifics about the Farpoint run and present an initial set of results. The high mass resolution facilitates precise measurements of important global statistics, such as the halo concentration-mass relation and the correlation function down to small scales. Selected subsets of the simulation data products are publicly available via the HACC Simulation Data Portal.

preprint · 2022 · arXiv

Portability: A Necessary Approach for Future Scientific Software

Today's world of scientific software for High Energy Physics (HEP) is powered by x86 code, while the future will be much more reliant on accelerators like GPUs and FPGAs. The portable parallelization strategies (PPS) project of the High Energy Physics Center for Computational Excellence (HEP/CCE) is investigating solutions for portability techniques that will allow the coding of an algorithm once, and the ability to execute it on a variety of hardware products from many vendors, especially including accelerators. We think without these solutions, the scientific success of our experiments and endeavors is in danger, as software development could be expert driven and costly to be able to run on available hardware infrastructure. We think the best solution for the community would be an extension to the C++ standard with a very low entry bar for users, supporting all hardware forms and vendors. We are very far from that ideal though. We argue that in the future, as a community, we need to request and work on portability solutions and strive to reach this ideal.

preprint · 2022 · arXiv

Snowmass2021 Computational Frontier White Paper: Cosmological Simulations and Modeling

Powerful new observational facilities will come online over the next decade, enabling a number of discovery opportunities in the "Cosmic Frontier", which targets understanding of the physics of the early universe, dark matter and dark energy, and cosmological probes of fundamental physics, such as neutrino masses and modifications of Einstein gravity. Synergies between different experiments will be leveraged to present new classes of cosmic probes as well as to minimize systematic biases present in individual surveys. Success of this observational program requires actively pairing it with a well-matched state-of-the-art simulation and modeling effort. Next-generation cosmological modeling will increasingly focus on physically rich simulations able to model outputs of sky surveys spanning multiple wavebands. These simulations will have unprecedented resolution, volume coverage, and must deliver guaranteed high-fidelity results for individual surveys as well as for the cross-correlations across different surveys. The needed advances are as follows: (1) Development of scientifically rich and broadly-scoped simulations, which capture the relevant physics and correlations between probes; (2) Accurate translation of simulation results into realistic image or spectral data to be directly compared with observations; (3) Improved emulators and/or data-driven methods serving as surrogates for expensive simulations, constructed from a finite set of full-physics simulations; (4) Detailed and transparent verification and validation programs for both simulations and analysis tools. (Abridged)

preprint · 2022 · arXiv

Snowmass2021 Cosmic Frontier White Paper: Rubin Observatory after LSST

The Vera C. Rubin Observatory will begin the Legacy Survey of Space and Time (LSST) in 2024, spanning an area of 18,000 square degrees in six bands, with more than 800 observations of each field over ten years. The unprecedented data set will enable great advances in the study of the formation and evolution of structure and exploration of physics of the dark universe. The observations will hold clues about the cause for the accelerated expansion of the universe and possibly the nature of dark matter. During the next decade, LSST will be able to confirm or dispute if tensions seen today in cosmological data are due to new physics. New and unexpected phenomena could confirm or disrupt our current understanding of the universe. Findings from LSST will guide the path forward post-LSST. The Rubin Observatory will still be a uniquely powerful facility even then, capable of revealing further insights into the physics of the dark universe. These could be obtained via innovative observing strategies, e.g., targeting new probes at shorter timescales than with LSST, or via modest instrumental changes, e.g., new filters, or through an entirely new instrument for the focal plane. This White Paper highlights some of the opportunities in each scenario from Rubin observations after LSST.

preprint · 2022 · arXiv

Why are we still using 3D masses for cluster cosmology?

The abundance of clusters of galaxies is highly sensitive to the late-time evolution of the matter distribution, since clusters form at the highest density peaks. However, the 3D cluster mass cannot be inferred without deprojecting the observations, introducing model-dependent biases and uncertainties due to the mismatch between the assumed and the true cluster density profile and the neglected matter along the sightline. Since projected aperture masses can be measured directly in simulations and observationally through weak lensing, we argue that they are better suited for cluster cosmology. Using the Mira-Titan suite of gravity-only simulations, we show that aperture masses correlate strongly with 3D halo masses, albeit with large intrinsic scatter due to the varying matter distribution along the sightline. Nonetheless, aperture masses can be measured $\approx 2-3$ times more precisely from observations, since they do not require assumptions about the density profile and are only affected by the shape noise in the weak lensing measurements. We emulate the cosmology dependence of the aperture mass function directly with a Gaussian process. Comparing the cosmology sensitivity of the aperture mass function and the 3D halo mass function for a fixed survey solid angle and redshift interval, we find the aperture mass sensitivity is higher for $\Omega_\mathrm{m}$ and $w_a$, similar for $\sigma_8$, $n_\mathrm{s}$, and $w_0$, and slightly lower for $h$. With a carefully calibrated aperture mass function emulator, cluster cosmology analyses can use cluster aperture masses directly, reducing the sensitivity to model-dependent mass calibration biases and uncertainties.

preprint · 2021 · arXiv

Machine learning synthetic spectra for probabilistic redshift estimation: SYTH-Z

Photometric redshift estimation algorithms are often based on representative data from observational campaigns. Data-driven methods of this type are subject to a number of potential deficiencies, such as sample bias and incompleteness. Motivated by these considerations, we propose using physically motivated synthetic spectral energy distributions in redshift estimation. In addition, the synthetic data would have to span a domain in colour-redshift space concordant with that of the targeted observational surveys. With a matched distribution and realistically modelled synthetic data in hand, a suitable regression algorithm can be appropriately trained; we use a mixture density network for this purpose. We also perform a zero-point re-calibration to reduce the systematic differences between noise-free synthetic data and the (unavoidably) noisy observational data sets. This new redshift estimation framework, SYTH-Z, demonstrates superior accuracy over a wide range of redshifts compared to baseline models trained on observational data alone. Approaches using realistic synthetic data sets can therefore greatly mitigate the reliance on expensive spectroscopic follow-up for the next generation of photometric surveys.

preprint · 2021 · arXiv

Nested Array-Based Spatially Coupled LDPC Codes

Linear nested codes, where two or more sub-codes are nested in a global code, have been proposed as candidates for reliable multi-terminal communication. In this paper, we consider nested array-based spatially coupled low-density parity-check (SC-LDPC) codes and propose a line-counting based optimization scheme for minimizing the number of dominant absorbing sets in order to improve its performance in the high signal-to-noise ratio regime. Since the parity-check matrices of different nested sub-codes partially overlap, the optimization of one nested sub-code imposes constraints on the optimization of the other sub-codes. To tackle these constraints, a multi-step optimization process is applied first to one of the nested codes, then sequential optimization of the remaining nested codes is carried out based on the constraints imposed by the previously optimized sub-codes. Results show that the order of optimization has a significant impact on the number of dominant absorbing sets in the Tanner graph of the code, resulting in a tradeoff between the performance of a nested code structure and its optimization sequence: the code which is optimized without constraints has fewer harmful structures than the code which is optimized with constraints. We also show that for certain code parameters, dominant absorbing sets in the Tanner graphs of all nested codes are completely removed using our proposed optimization strategy.

preprint · 2021 · arXiv

The LSST DESC DC2 Simulated Sky Survey

We describe the simulated sky survey underlying the second data challenge (DC2) carried out in preparation for analysis of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) by the LSST Dark Energy Science Collaboration (LSST DESC). Significant connections across multiple science domains will be a hallmark of LSST; the DC2 program represents a unique modeling effort that stresses this interconnectivity in a way that has not been attempted before. This effort encompasses a full end-to-end approach: starting from a large N-body simulation, through setting up LSST-like observations including realistic cadences, through image simulations, and finally processing with Rubin's LSST Science Pipelines. This last step ensures that we generate data products resembling those to be delivered by the Rubin Observatory as closely as is currently possible. The simulated DC2 sky survey covers six optical bands in a wide-fast-deep (WFD) area of approximately 300 deg^2 as well as a deep drilling field (DDF) of approximately 1 deg^2. We simulate 5 years of the planned 10-year survey. The DC2 sky survey has multiple purposes. First, the LSST DESC working groups can use the dataset to develop a range of DESC analysis pipelines to prepare for the advent of actual data. Second, it serves as a realistic testbed for the image processing software under development for LSST by the Rubin Observatory. In particular, simulated data provide a controlled way to investigate certain image-level systematic effects. Finally, the DC2 sky survey enables the exploration of new scientific ideas in both static and time-domain cosmology.

preprint · 2020 · arXiv

Building Halo Merger Trees from the Q Continuum Simulation

Cosmological N-body simulations rank among the most computationally intensive efforts today. A key challenge is the analysis of structure, substructure, and the merger history for many billions of compact particle clusters, called halos. Effectively representing the merging history of halos is essential for many galaxy formation models used to generate synthetic sky catalogs, an important application of modern cosmological simulations. Generating realistic mock catalogs requires computing the halo formation history from simulations with large volumes and billions of halos over many time steps, taking hundreds of terabytes of analysis data. We present fast parallel algorithms for producing halo merger trees and tracking halo substructure from a single-level, density-based clustering algorithm. Merger trees are created from analyzing the halo-particle membership function in adjacent snapshots, and substructure is identified by tracking the "cores" of merging halos -- sets of particles near the halo center. Core tracking is performed after creating merger trees and uses the relationships found during tree construction to associate substructures with hosts. The algorithms are implemented with MPI and evaluated on a Cray XK7 supercomputer using up to 16,384 processes on data from HACC, a modern cosmological simulation framework. We present results for creating merger trees from 101 analysis snapshots taken from the Q Continuum, a large volume, high mass resolution, cosmological simulation evolving half a trillion particles.

preprint · 2020 · arXiv

The Mira-Titan Universe. III. Emulation of the Halo Mass Function

We construct an emulator for the halo mass function over group and cluster mass scales for a range of cosmologies, including the effects of dynamical dark energy and massive neutrinos. The emulator is based on the recently completed Mira-Titan Universe suite of cosmological $N$-body simulations. The main set of simulations spans 111 cosmological models with 2.1 Gpc boxes. We extract halo catalogs in the redshift range $z=[0.0, 2.0]$ and for masses $M_{200\mathrm{c}}\geq 10^{13}M_\odot/h$. The emulator covers an 8-dimensional hypercube spanned by {$\Omega_\mathrm{m}h^2$, $\Omega_\mathrm{b}h^2$, $\Omega_\nu h^2$, $\sigma_8$, $h$, $n_s$, $w_0$, $w_a$}; spatial flatness is assumed. We obtain smooth halo mass functions by fitting piecewise second-order polynomials to the halo catalogs and employ Gaussian process regression to construct the emulator while keeping track of the statistical noise in the input halo catalogs and uncertainties in the regression process. For redshifts $z\lesssim1$, the typical emulator precision is better than $2\%$ for $10^{13}-10^{14} M_\odot/h$ and $<10\%$ for $M\simeq 10^{15}M_\odot/h$. For comparison, fitting functions using the traditional universal form for the halo mass function can be biased at up to 30% at $M\simeq 10^{14}M_\odot/h$ for $z=0$. Our emulator is publicly available at https://github.com/SebastianBocquet/MiraTitanHMFemulator.

preprint · 2019 · arXiv

CosmoDC2: A Synthetic Sky Catalog for Dark Energy Science with LSST

This paper introduces cosmoDC2, a large synthetic galaxy catalog designed to support precision dark energy science with the Large Synoptic Survey Telescope (LSST). CosmoDC2 is the starting point for the second data challenge (DC2) carried out by the LSST Dark Energy Science Collaboration (LSST DESC). The catalog is based on a trillion-particle, 4.225 Gpc^3 box cosmological N-body simulation, the 'Outer Rim' run. It covers 440 deg^2 of sky area to a redshift of z=3 and is complete to a magnitude depth of 28 in the r-band. Each galaxy is characterized by a multitude of properties including stellar mass, morphology, spectral energy distributions, broadband filter magnitudes, host halo information and weak lensing shear. The size and complexity of cosmoDC2 require an efficient catalog generation methodology; our approach is based on a new hybrid technique that combines data-driven empirical approaches with semi-analytic galaxy modeling. A wide range of observation-based validation tests has been implemented to ensure that cosmoDC2 enables the science goals of the planned LSST DESC DC2 analyses. This paper also represents the official release of the cosmoDC2 data set, including an efficient reader that facilitates interaction with the data.

preprint · 2019 · arXiv

The Outer Rim Simulation: A Path to Many-Core Supercomputers

We describe the Outer Rim cosmological simulation, one of the largest high-resolution N-body simulations performed to date, aimed at promoting science to be carried out with large-scale structure surveys. The simulation covers a volume of (4.225 Gpc)^3 and evolves more than one trillion particles. It was executed on Mira, a BlueGene/Q system at the Argonne Leadership Computing Facility. We discuss some of the computational challenges posed by a system like Mira, a many-core supercomputer, and how the simulation code, HACC, has been designed to overcome these challenges. We have carried out a large range of analyses on the simulation data and we report on the results as well as the data products that have been generated. The full data set generated by the simulation totals more than 5 PB of data, making data curation and data handling a large challenge in and of itself. The simulation results have been used to generate synthetic catalogs for large-scale structure surveys, including DESI and eBOSS, as well as CMB experiments. A detailed catalog for the LSST DESC data challenges has been created as well. We publicly release some of the Outer Rim halo catalogs, downsampled particle information, and lightcone data.

preprint · 2010 · arXiv

The Coyote Universe I: Precision Determination of the Nonlinear Matter Power Spectrum

Near-future cosmological observations targeted at investigations of dark energy pose stringent requirements on the accuracy of theoretical predictions for the clustering of matter. Currently, N-body simulations comprise the only viable approach to this problem. In this paper we demonstrate that N-body simulations can indeed be sufficiently controlled to fulfill these requirements for the needs of ongoing and near-future weak lensing surveys. By performing a large suite of cosmological simulation comparison and convergence tests we show that results for the nonlinear matter power spectrum can be obtained at 1% accuracy out to k~1 h/Mpc. The key components of these high accuracy simulations are: precise initial conditions, very large simulation volumes, sufficient mass resolution, and accurate time stepping. This paper is the first in a series of three, with the final aim to provide a high-accuracy prediction scheme for the nonlinear matter power spectrum.

preprint · 2010 · arXiv

The Coyote Universe II: Cosmological Models and Precision Emulation of the Nonlinear Matter Power Spectrum

The power spectrum of density fluctuations is a foundational source of cosmological information. Precision cosmological probes targeted primarily at investigations of dark energy require accurate theoretical determinations of the power spectrum in the nonlinear regime. To exploit the observational power of future cosmological surveys, accuracy demands on the theory are at the one percent level or better. Numerical simulations are currently the only way to produce sufficiently error-controlled predictions for the power spectrum. The very high computational cost of (precision) N-body simulations is a major obstacle to obtaining predictions in the nonlinear regime, while scanning over cosmological parameters. Near-future observations, however, are likely to provide a meaningful constraint only on constant dark energy equation of state 'wCDM' cosmologies. In this paper we demonstrate that a limited set of only 37 cosmological models -- the "Coyote Universe" suite -- can be used to predict the nonlinear matter power spectrum at the required accuracy over a prior parameter range set by cosmic microwave background observations. This paper is the second in a series of three, with the final aim to provide a high-accuracy prediction scheme for the nonlinear matter power spectrum for wCDM cosmologies.