Source author record

Yu Feng

Yu Feng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

67 works

34 topics

4 close collaborators

preprint, 2026, arXiv

Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis

An agent skill is a configuration package that equips an LLM-driven agent with a concrete capability, such as reading email, executing shell commands, or signing blockchain transactions. Each skill is a hybrid artifact: a structured half declares executable interfaces, while a prose half dictates when and how those interfaces fire, and the prose is reinterpreted probabilistically on every invocation. Conventional static analyzers parse the structured half but ignore the prose; LLM-based tools read the prose but cannot reproducibly prove that a tainted input reaches a high-impact sink. We present Semia, a static auditor for agent skills. Semia lifts each skill into the Skill Description Language (SDL), a Datalog fact base that captures LLM-triggered actions, prose-defined conditions, and human-in-the-loop checkpoints. Synthesizing a fact base that is both structurally sound and semantically faithful to the original prose is the central challenge; we address it with Constraint-Guided Representation Synthesis (CGRS), a propose-verify-evaluate loop that refines LLM candidates until convergence. Security properties over an agent skill (e.g., indirect injection, secret leakage, confused deputies, and unguarded sinks) can then be reduced to Datalog reachability queries. We evaluate Semia on 13,728 real-world skills from public marketplaces. Semia renders all of them auditable and finds that more than half carry at least one critical semantic risk. On a stratified sample of 541 expert-labeled skills, Semia achieves 97.7% recall and an F1 of 90.6%, substantially outperforming signature-based scanners and LLM baselines.
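To make the reduction to reachability concrete, the sketch below treats a toy fact base as a directed graph and asks whether a tainted source can reach a high-impact sink without crossing a human-in-the-loop checkpoint. All fact names (`untrusted_email`, `run_shell`, `confirm_dialog`, and so on) are hypothetical and are not taken from SDL; the abstract does not give Semia's actual schema or queries, so this is only an illustration of the general idea.

```python
from collections import deque

# Hypothetical facts: (src, dst) means data or control flows from src to dst.
FLOWS = {
    ("untrusted_email", "summarize"),
    ("summarize", "run_shell"),
    ("user_prompt", "confirm_dialog"),
    ("confirm_dialog", "sign_transaction"),
}
TAINTED_SOURCES = {"untrusted_email"}
HIGH_IMPACT_SINKS = {"run_shell", "sign_transaction"}
CHECKPOINTS = {"confirm_dialog"}  # human-in-the-loop nodes that stop propagation


def unguarded_reachable(flows, sources, checkpoints):
    """Return every node reachable from a source without crossing a checkpoint."""
    successors = {}
    for src, dst in flows:
        successors.setdefault(src, set()).add(dst)
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            if nxt in checkpoints or nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)
    return seen


def risky_sinks(flows, sources, sinks, checkpoints):
    """Sinks that a tainted source reaches with no checkpoint on the path."""
    return sorted(unguarded_reachable(flows, sources, checkpoints) & sinks)


findings = risky_sinks(FLOWS, TAINTED_SOURCES, HIGH_IMPACT_SINKS, CHECKPOINTS)
```

In this toy example the email path reaches the shell sink unguarded, while the transaction sink is only reachable through the confirmation checkpoint and is therefore not flagged.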

preprint, 2022, arXiv

Automated Transpilation of Imperative to Functional Code using Neural-Guided Program Synthesis (Extended Version)

While many mainstream languages such as Java, Python, and C# increasingly incorporate functional APIs to simplify programming and improve parallelization/performance, there are no effective techniques that can be used to automatically translate existing imperative code to functional variants using these APIs. Motivated by this problem, this paper presents a transpilation approach based on inductive program synthesis for modernizing existing code. Our method is based on the observation that the overwhelming majority of source/target programs in this setting satisfy an assumption that we call trace-compatibility: not only do the programs share syntactically identical low-level expressions, but these expressions also take the same values in corresponding execution traces. Our method leverages this observation to design a new neural-guided synthesis algorithm that (1) uses a novel neural architecture called cognate grammar network (CGN) and (2) leverages a form of concolic execution to prune partial programs based on intermediate values that arise during a computation. We have implemented our approach in a tool called NGST2 and use it to translate imperative Java and Python code to functional variants that use the Stream and functools APIs respectively. Our experiments show that NGST2 significantly outperforms several baselines and that our proposed neural architecture and pruning techniques are vital for achieving good results.
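As a small illustration of the kind of source/target pair the abstract describes, the sketch below shows an imperative Python loop and a functional variant built on `functools.reduce`. Both versions record the accumulator values they produce, which gives a rough feel for trace-compatibility: the same low-level expression (`acc + v * v`) takes the same values in both executions. The function names are hypothetical and the functional version is hand-written, not output from NGST2.

```python
from functools import reduce


def sum_of_even_squares_imperative(values):
    """Imperative version: explicit loop with a mutable accumulator."""
    total = 0
    trace = []
    for v in values:
        if v % 2 == 0:
            total = total + v * v
            trace.append(total)
    return total, trace


def sum_of_even_squares_functional(values):
    """Functional version built from filter and functools.reduce."""
    trace = []

    def step(acc, v):
        new_acc = acc + v * v  # same low-level expression as in the loop
        trace.append(new_acc)
        return new_acc

    total = reduce(step, filter(lambda v: v % 2 == 0, values), 0)
    return total, trace


data = [1, 2, 3, 4]
imperative_result = sum_of_even_squares_imperative(data)
functional_result = sum_of_even_squares_functional(data)
```

The two results agree both in the final value and in the recorded intermediate values, which is the property a pruning step based on intermediate values could exploit.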

preprint, 2022, arXiv

Crescent: Taming Memory Irregularities for Accelerating Deep Point Cloud Analytics

3D perception in point clouds is transforming the perception ability of future intelligent machines. Point cloud algorithms, however, are plagued by irregular memory accesses, leading to massive inefficiencies in the memory sub-system, which bottlenecks the overall efficiency. This paper proposes Crescent, an algorithm-hardware co-design system that tames the irregularities in deep point cloud analytics while achieving high accuracy. To that end, we introduce two approximation techniques, approximate neighbor search and selective bank conflict elision, that "regularize" the DRAM and SRAM memory accesses. Doing so, however, necessarily introduces accuracy loss, which we mitigate with a new network training procedure that integrates approximation into the network training process. In essence, our training procedure trains models that are conditioned upon a specific approximate setting and thus retain high accuracy. Experiments show that Crescent doubles the performance and halves the energy consumption compared to an optimized baseline accelerator with < 1% accuracy loss. The code of our paper is available at: https://github.com/horizon-research/crescent.

preprint, 2022, arXiv

Determination of building flood risk maps from LiDAR mobile mapping data

With increasing urbanization, flooding is a major challenge for many cities today. Based on forecast precipitation, topography, and pipe networks, flood simulations can provide early warnings for areas and buildings at risk of flooding. Basement windows, doors, and underground garage entrances are common places where floodwater can flow into a building. Some buildings have been prepared or designed considering the threat of flooding, but others have not. Therefore, knowing the heights of these facade openings helps to identify places that are more susceptible to water ingress. However, such data is not yet readily available in most cities. Traditional surveying of the desired targets may be used, but this is a very time-consuming and laborious process. This research presents a new process for the extraction of windows and doors from LiDAR mobile mapping data. Deep learning object detection models are trained to identify these objects. Usually, this requires large amounts of manual annotation. In this paper, we mitigate this problem by leveraging a rule-based method. In a first step, the rule-based method is used to generate pseudo-labels. A semi-supervised learning strategy is then applied with three different levels of supervision. The results show that, using only automatically generated pseudo-labels, the learning-based model outperforms the rule-based approach by 14.6% in terms of F1-score. After five hours of human supervision, it is possible to improve the model by another 6.2%. By comparing the detected facade openings' heights with the predicted water levels from a flood simulation model, a map can be produced which assigns per-building flood risk levels. This information can be combined with flood forecasting to provide a more targeted disaster prevention guide for the city's infrastructure and residential buildings.

preprint, 2022, arXiv

FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis

In recent years, the security of AI systems has drawn increasing research attention, especially in the medical imaging realm. To develop a secure medical image analysis (MIA) system, it is necessary to study possible backdoor attacks (BAs), which can embed hidden malicious behaviors into the system. However, designing a unified BA method that can be applied to various MIA systems is challenging due to the diversity of imaging modalities (e.g., X-Ray, CT, and MRI) and analysis tasks (e.g., classification, detection, and segmentation). Most existing BA methods are designed to attack natural image classification models; they apply spatial triggers to training images and inevitably corrupt the semantics of poisoned pixels, which causes attacks on dense prediction models to fail. To address this issue, we propose a novel Frequency-Injection based Backdoor Attack method (FIBA) that is capable of delivering attacks in various MIA tasks. Specifically, FIBA leverages a trigger function in the frequency domain that can inject the low-frequency information of a trigger image into the poisoned image by linearly combining the spectral amplitude of both images. Since it preserves the semantics of the poisoned image pixels, FIBA can perform attacks on both classification and dense prediction models. Experiments on three benchmarks in MIA (i.e., ISIC-2019 for skin lesion classification, KiTS-19 for kidney tumor segmentation, and EAD-2019 for endoscopic artifact detection) validate the effectiveness of FIBA and its superiority over state-of-the-art methods in attacking MIA models as well as bypassing backdoor defense. Source code will be available at https://github.com/HazardFY/FIBA.
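The frequency-domain trigger described above can be sketched in a few lines of NumPy, which may help researchers building defenses see what the perturbation looks like. The sketch blends the amplitude spectra of two grayscale images inside a centered low-frequency window and keeps the phase of the original image. The function name, the blend ratio `alpha`, and the window fraction `beta` are assumptions introduced for illustration; the abstract does not specify these details, and this is not the authors' released code.

```python
import numpy as np


def inject_low_frequency(image, trigger, alpha=0.15, beta=0.1):
    """Blend low-frequency amplitude of `trigger` into `image`, keeping image phase.

    alpha: blend ratio for the amplitude (hypothetical default).
    beta:  fraction of each axis covered by the low-frequency window (hypothetical).
    """
    if image.shape != trigger.shape or image.ndim != 2:
        raise ValueError("expected two 2-D arrays of the same shape")

    # Centered spectra, so low frequencies sit in the middle of the array.
    f_img = np.fft.fftshift(np.fft.fft2(image))
    f_trg = np.fft.fftshift(np.fft.fft2(trigger))
    amp_img, phase_img = np.abs(f_img), np.angle(f_img)
    amp_trg = np.abs(f_trg)

    # Symmetric window around the zero-frequency component.
    h, w = image.shape
    ch, cw = h // 2, w // 2
    rh, rw = max(1, int(beta * h / 2)), max(1, int(beta * w / 2))
    mask = np.zeros((h, w), dtype=bool)
    mask[ch - rh:ch + rh + 1, cw - rw:cw + rw + 1] = True

    # Linear combination of the two amplitude spectra inside the window only.
    amp_mix = amp_img.copy()
    amp_mix[mask] = (1.0 - alpha) * amp_img[mask] + alpha * amp_trg[mask]

    # Recombine with the original phase and return to the spatial domain.
    f_mix = amp_mix * np.exp(1j * phase_img)
    return np.real(np.fft.ifft2(np.fft.ifftshift(f_mix)))


rng = np.random.default_rng(0)
clean = rng.random((32, 32))
trigger_image = rng.random((32, 32))
poisoned = inject_low_frequency(clean, trigger_image)
```

Because the phase spectrum is untouched, the spatial structure of the original image is largely preserved, which matches the abstract's point about keeping pixel semantics intact.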

preprint, 2022, arXiv

Injecting Numerical Reasoning Skills into Knowledge Base Question Answering Models

Embedding-based methods are popular for Knowledge Base Question Answering (KBQA), but few current models have numerical reasoning skills and thus struggle to answer ordinal constrained questions. This paper proposes a new embedding-based KBQA framework which particularly takes numerical reasoning into account. We present NumericalTransformer on top of NSM, a state-of-the-art embedding-based KBQA model, to create NT-NSM. To enable better training, we propose two pre-training tasks with explicit numerical-oriented loss functions on two generated training datasets and a template-based data augmentation method for enriching ordinal constrained QA dataset. Extensive experiments on KBQA benchmarks demonstrate that with the help of our training algorithm, NT-NSM is empowered with numerical reasoning skills and substantially outperforms the baselines in answering ordinal constrained questions.

preprint, 2022, arXiv

Real-Time Gaze Tracking with Event-Driven Eye Segmentation

Gaze tracking is increasingly becoming an essential component in Augmented and Virtual Reality. Modern gaze tracking algorithms are heavyweight; they operate at no more than 5 Hz on mobile processors even though near-eye cameras comfortably operate at a real-time rate (> 30 Hz). This paper presents a real-time eye tracking algorithm that, on average, operates at 30 Hz on a mobile processor and achieves 0.1° to 0.5° gaze accuracies, all while requiring only 30K parameters, one to two orders of magnitude fewer than state-of-the-art eye tracking algorithms. The crux of our algorithm is an Auto ROI mode, which continuously predicts the Regions of Interest (ROIs) of near-eye images and judiciously processes only the ROIs for gaze estimation. To that end, we introduce a novel, lightweight ROI prediction algorithm by emulating an event camera. We discuss how a software emulation of events enables accurate ROI prediction without requiring special hardware. The code of our paper is available at https://github.com/horizon-research/edgaze.

preprint, 2022, arXiv

Storage capacity of networks with discrete synapses and sparsely encoded memories

Attractor neural networks (ANNs) are one of the leading theoretical frameworks for the formation and retrieval of memories in networks of biological neurons. In this framework, a pattern imposed by external inputs to the network is said to be learned when this pattern becomes a fixed point attractor of the network dynamics. The storage capacity is the maximum number of patterns that can be learned by the network. In this paper, we study the storage capacity of fully-connected and sparsely-connected networks with a binarized Hebbian rule, for arbitrary coding levels. Our results show that a network with discrete synapses has a storage capacity similar to that of the model with continuous synapses, and that this capacity tends asymptotically towards the optimal capacity, in the space of all possible binary connectivity matrices, in the sparse coding limit. We also derive finite coding level corrections for the asymptotic solution in the sparse coding limit. The result indicates that the capacity of networks with Hebbian learning rules converges to the optimal capacity extremely slowly as the coding level becomes small. Our results also show that in networks with sparse binary connectivity matrices, the information capacity per synapse is larger than in the fully connected case, and thus such networks store information more efficiently.

preprint, 2022, arXiv

The ASTRID Simulation: Galaxy Formation and Reionization

We introduce the ASTRID simulation, a large-scale cosmological hydrodynamic simulation in a $250$ Mpc/h box with $2\times 5500^3$ particles. ASTRID contains a large number of high redshift galaxies, which can be compared to future survey data, and resolves galaxies in halos more massive than $2\times 10^9 M_\odot$. ASTRID has been run from $z=99$ to $z=3$. As a particular focus is modelling the high redshift Universe, it contains models for inhomogeneous hydrogen and helium reionization, baryon relative velocities and massive neutrinos, as well as supernova and AGN feedback. The black hole model includes mergers driven by dynamical friction rather than repositioning. We briefly summarise the implemented models and the technical choices we made when developing the simulation code. We validate the model, showing good agreement with observed UV luminosity functions, galaxy stellar mass functions and specific star formation rates. We show that the redshift at which a given galaxy underwent hydrogen reionization has a large effect on the halo gas fraction. Finally, at $z=6$, halos with $M \sim 2\times 10^9 M_\odot$ which have been reionized have a star formation rate $1.5$ times greater than those which have not yet been reionized.

preprint, 2022, arXiv

The ASTRID simulation: the evolution of Supermassive Black Holes

We present the evolution of black holes (BHs) and their relationship with their host galaxies in Astrid, a large-volume cosmological hydrodynamical simulation with box size 250 $h^{-1} \rm Mpc$ containing $2\times5500^3$ particles evolved to z=3. Astrid statistically models BH gas accretion and AGN feedback to their environments, applies a power-law distribution for BH seed mass $M_{\rm sd}$, uses a dynamical friction model for BH dynamics and executes a physical treatment of BH mergers. The BH population is broadly consistent with empirical constraints on the BH mass function, the bright end of the luminosity functions, and the time evolution of BH mass and accretion rate density. The BH mass and accretion exhibit a tight correlation with host stellar mass and star formation rate. We trace BHs seeded at z>10 down to z=3, finding that BHs carry virtually no imprint of the initial $M_{\rm sd}$ except those with the smallest $M_{\rm sd}$, where less than 50% of them have doubled in mass. Gas accretion is the dominant channel for BH growth compared to BH mergers. With dynamical friction, Astrid predicts a significant delay for BH mergers after the first encounter of a BH pair, with a typical elapsed time of about 200 Myr. There are in total $4.5 \times 10^5$ BH mergers in Astrid at z>3, $\sim 10^3$ of which have X-ray detectable EM counterparts: a bright kpc scale dual AGN with $L_X>10^{43}$ erg/s. BHs with $M_{\rm BH} \sim 10^{7-8} M_{\odot}$ experience the most frequent mergers. Galaxies that host BH mergers are unbiased tracers of the overall $M_{\rm BH} - M_{*}$ relation. Massive ($>10^{11} M_{\odot}$) galaxies have a high occupation number (>10) of BHs, and hence host the majority of BH mergers.

preprint, 2022, arXiv

The BlueTides Mock Image Catalogue: Simulated observations of high-redshift galaxies and predictions for JWST imaging surveys

We present a mock image catalogue of ~100,000 MUV=-22.5 to -19.6 mag galaxies at z=7-12 from the BlueTides cosmological simulation. We create mock images of each galaxy with the James Webb (JWST), Hubble, Roman, and Euclid Space Telescopes, as well as Subaru and VISTA, with a range of near- and mid-infrared filters. We perform photometry on the mock images to estimate the success of these instruments for detecting high-z galaxies. We predict that JWST will have unprecedented power in detecting high-z galaxies, with a 95% completeness limit at least 2.5 magnitudes fainter than VISTA and Subaru, 1.1 magnitudes fainter than Hubble, and 0.9 magnitudes fainter than Roman, for the same wavelength and exposure time. Focusing on JWST, we consider a range of exposure times and filters, and find that the NIRCam F356W and F277W filters will detect the faintest galaxies, with 95% completeness at m=27.4 mag in 10ks exposures. We also predict the number of high-z galaxies that will be discovered by upcoming JWST imaging surveys. We predict that the COSMOS-Web survey will detect ~1000 MUV<-20.1 mag galaxies at 6.5<z<7.5, by virtue of its large survey area. JADES-Medium will detect almost 100% of MUV<-20 mag galaxies at z<8.5 due to its significant depth; however, with its smaller survey area it will detect only ~100 of these galaxies at 6.5<z<7.5. Cosmic variance results in a large range in the number of predicted galaxies each survey will detect, which is more evident in smaller surveys such as CEERS and the PEARLS NEP and GOODS-S fields.

preprint, 2022, arXiv

The DESI $N$-body Simulation Project -- II. Suppressing sample variance with fast simulations

The Dark Energy Spectroscopic Instrument (DESI) will construct a large and precise three-dimensional map of our Universe. The survey effective volume reaches $\sim 20\,h^{-3}\,\mathrm{Gpc}^3$. It is a great challenge to prepare high-resolution simulations with a much larger volume for validating the DESI analysis pipelines. AbacusSummit is a suite of high-resolution dark-matter-only simulations designed for this purpose, with $200\,h^{-3}\,\mathrm{Gpc}^3$ (10 times the DESI volume) for the base cosmology. However, further effort is needed to provide a more precise analysis of the data and to cover other cosmologies as well. Recently, the CARPool method was proposed to use paired accurate and approximate simulations to achieve high statistical precision with a limited number of high-resolution simulations. Relying on this technique, we propose to use fast quasi-$N$-body solvers combined with accurate simulations to produce accurate summary statistics. This enables us to obtain a variance 100 times smaller than the expected DESI statistical variance at the scales we are interested in, e.g. $k < 0.3\,h\,\mathrm{Mpc}^{-1}$ for the halo power spectrum. In addition, it can significantly suppress the sample variance of the halo bispectrum. We further generalize the method to other cosmologies with only one realization in the AbacusSummit suite, extending the effective volume by a factor of $\sim 20$. In summary, our proposed strategy of combining high-fidelity simulations with fast approximate gravity solvers and a series of variance suppression techniques sets the path for a robust cosmological analysis of galaxy survey data.
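Pairing accurate and approximate simulations to reduce variance is an instance of the control-variate technique. The sketch below applies the generic estimator $x_i = y_i - \beta\,(c_i - \mu_c)$ with $\beta = \mathrm{cov}(y, c)/\mathrm{var}(c)$ to toy Gaussian numbers. The data are synthetic stand-ins, not simulation outputs, and the assumption that the surrogate mean `mu_fast` is known exactly is a simplification made for illustration.

```python
import random
from statistics import fmean


def sample_variance(xs):
    """Unbiased sample variance."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def control_variate_estimates(y, c, mu_c):
    """Apply x_i = y_i - beta * (c_i - mu_c) with beta = cov(y, c) / var(c)."""
    y_bar, c_bar = fmean(y), fmean(c)
    cov_yc = sum((yi - y_bar) * (ci - c_bar) for yi, ci in zip(y, c)) / (len(y) - 1)
    beta = cov_yc / sample_variance(c)
    return [yi - beta * (ci - mu_c) for yi, ci in zip(y, c)]


rng = random.Random(42)
# A shared random component plays the role of matched initial conditions.
shared = [rng.gauss(0.0, 1.0) for _ in range(200)]
accurate = [10.0 + s + rng.gauss(0.0, 0.1) for s in shared]   # "expensive" statistic
fast = [9.0 + 0.9 * s + rng.gauss(0.0, 0.1) for s in shared]  # biased "cheap" surrogate
mu_fast = 9.0  # surrogate mean, assumed known from many cheap runs

corrected = control_variate_estimates(accurate, fast, mu_fast)
variance_ratio = sample_variance(corrected) / sample_variance(accurate)
```

The surrogate may be biased, as it is here; what matters is that it is strongly correlated with the accurate statistic, so that subtracting its fluctuation removes most of the shared scatter.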

preprint, 2022, arXiv

The Impact of Dust on the Sizes of Galaxies in the Epoch of Reionization

We study the sizes of galaxies in the Epoch of Reionization using a sample of ~100,000 galaxies from the BlueTides cosmological hydrodynamical simulation from z=7 to 11. We measure the galaxy sizes from stellar mass and luminosity maps, defining the effective radius as the minimum radius which could enclose the pixels containing 50% of the total mass/light in the image. We find an inverse relationship between stellar mass and effective half-mass radius, suggesting that the most massive galaxies are more compact and dense than lower mass galaxies, which have flatter mass distributions. We find a mildly negative relation between intrinsic far-ultraviolet luminosity and size, while we find a positive size-luminosity relation when measured from dust-attenuated images. This suggests that dust is the predominant cause of the observed positive size-luminosity relation, with dust preferentially attenuating bright sight lines resulting in a flatter emission profile and thus larger measured effective radii. We study the size-luminosity relation across the rest-frame ultraviolet and optical, and find that the slope decreases at longer wavelengths; this is a consequence of the relation being caused by dust, which produces less attenuation at longer wavelengths. We find that the far-ultraviolet size-luminosity relation shows mild evolution from z=7 to 11, and galaxy size evolves with redshift as $R\propto(1+z)^{-m}$, where $m=0.662\pm0.009$. Finally, we investigate the sizes of z=7 quasar host galaxies, and find that while the intrinsic sizes of quasar hosts are small relative to the overall galaxy sample, they have comparable sizes when measured from dust-attenuated images.
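The quoted redshift scaling $R\propto(1+z)^{-m}$ with $m=0.662$ can be turned into a quick size comparison between two redshifts. The helper below is a hypothetical convenience function that evaluates only the ratio, since the abstract does not give a normalization.

```python
def size_ratio(z_high, z_low, m=0.662):
    """Return R(z_high) / R(z_low) for R proportional to (1 + z)^(-m)."""
    return ((1.0 + z_high) / (1.0 + z_low)) ** (-m)


# Relative size of galaxies at z = 11 compared with z = 7.
ratio_11_to_7 = size_ratio(11.0, 7.0)
```

With the best-fit exponent, galaxies at z=11 come out at roughly three quarters of the size of those at z=7.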

preprint, 2021, arXiv

A fast particle-mesh simulation of non-linear cosmological structure formation with massive neutrinos

Quasi-N-body simulations, such as FastPM, provide a fast way to simulate cosmological structure formation, but have yet to adequately include the effects of massive neutrinos. We present a method to include neutrino particles in FastPM, enabling computation of the CDM and total matter power spectra to percent-level accuracy in the non-linear regime. The CDM-neutrino cross-power can also be computed at a sufficient accuracy to constrain cosmological observables. To avoid the shot noise that typically plagues neutrino particle simulations, we employ a quasi-random algorithm to sample the relevant Fermi-Dirac distribution when setting the initial neutrino thermal velocities. We additionally develop an effective distribution function to describe a set of non-degenerate neutrinos as a single particle to speed up non-degenerate simulations. The simulation is accurate for the full range of physical interest, $M_ν\lesssim 0.6$ eV, and applicable to redshifts $z\lesssim2$. Such accuracy can be achieved by initializing particles with the two-fluid approximation transfer functions (using the REPS package). Convergence can be reached in $\sim 25$ steps, with a starting redshift of $z=99$. Probing progressively smaller scales only requires an increase in the number of CDM particles being simulated, while the number of neutrino particles can remain fixed at a value less than or similar to the number of CDM particles. In turn, the percentage increase in runtime-per-step due to neutrino particles is between $\sim 5-20\%$ for runs with $1024^3$ CDM particles, and decreases as the number of CDM particles is increased. The code has been made publicly available, providing an invaluable resource for producing fast predictions for cosmological surveys and for studying reconstruction.
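To give a feel for quasi-random sampling of a Fermi-Dirac distribution, the sketch below draws dimensionless momentum magnitudes $x = p/T$ from the density proportional to $x^2/(e^x+1)$ by inverting a tabulated CDF at points of a van der Corput low-discrepancy sequence. This is one generic quasi-random scheme chosen for illustration and is not necessarily the algorithm used in the paper; the grid size and cutoff `x_max` are arbitrary assumptions.

```python
import math
from bisect import bisect_left
from statistics import fmean


def van_der_corput(n, base=2):
    """n-th element (n >= 1) of the base-b van der Corput sequence in (0, 1)."""
    q, denom = 0.0, 1.0
    while n > 0:
        n, rem = divmod(n, base)
        denom *= base
        q += rem / denom
    return q


def fermi_dirac_cdf_table(x_max=20.0, n_grid=4000):
    """Tabulate the normalized CDF of x^2 / (exp(x) + 1) with the trapezoid rule."""
    xs = [x_max * i / n_grid for i in range(n_grid + 1)]
    pdf = [x * x / (math.exp(x) + 1.0) for x in xs]
    cdf = [0.0]
    for i in range(1, len(xs)):
        cdf.append(cdf[-1] + 0.5 * (pdf[i] + pdf[i - 1]) * (xs[i] - xs[i - 1]))
    total = cdf[-1]
    return xs, [c / total for c in cdf]


def sample_momenta(n_samples):
    """Quasi-random samples of x = p / T via inverse-CDF lookup."""
    xs, cdf = fermi_dirac_cdf_table()
    samples = []
    for i in range(1, n_samples + 1):
        u = van_der_corput(i)
        j = bisect_left(cdf, u)  # first grid index with cdf[j] >= u
        x0, x1, c0, c1 = xs[j - 1], xs[j], cdf[j - 1], cdf[j]
        samples.append(x0 + (u - c0) * (x1 - x0) / (c1 - c0))
    return samples


momenta = sample_momenta(1024)
mean_momentum = fmean(momenta)
```

Because the low-discrepancy points cover the unit interval evenly, the sample mean lands close to the analytic value of about 3.15 with far fewer samples than pseudo-random draws would typically need.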

preprint, 2021, arXiv

Falx: Synthesis-Powered Visualization Authoring

Modern visualization tools aim to allow data analysts to easily create exploratory visualizations. When the input data layout conforms to the visualization design, users can easily specify visualizations by mapping data columns to visual channels of the design. However, when there is a mismatch between data layout and the design, users need to spend significant effort on data transformation. We propose Falx, a synthesis-powered visualization tool that allows users to specify visualizations in a similarly simple way but without needing to worry about data layout. In Falx, users specify visualizations using examples of how concrete values in the input are mapped to visual channels, and Falx automatically infers the visualization specification and transforms the data to match the design. In a study with 33 data analysts on four visualization tasks involving data transformation, we found that users can effectively adopt Falx to create visualizations they otherwise cannot implement.

preprint, 2021, arXiv

Massive Black Hole Mergers with Orbital Information: Predictions from the ASTRID Simulation

We examine massive black hole (MBH) mergers and their associated gravitational wave signals from the large-volume cosmological simulation Astrid. Astrid includes galaxy formation and black hole models recently updated with an MBH seed population between $3\times 10^4M_{\odot}/h$ and $3\times 10^5M_{\odot}/h$ and a sub-grid dynamical friction (DF) model to follow the MBH dynamics down to $1.5\;\text{ckpc}/h$. We calculate initial eccentricities of MBH orbits directly from the simulation at kpc-scales, and find orbital eccentricities above $0.7$ for most MBH pairs before the numerical merger. After approximating unresolved evolution on scales below ${\sim 200\,\text{pc}}$, we find that the in-simulation DF on large scales accounts for more than half of the total orbital decay time ($\sim 500\,\text{Myr}$) due to DF. The binary hardening time is an order of magnitude longer than the DF time, especially for the seed-mass binaries ($M_\text{BH}<2M_\text{seed}$). As a result, only $\lesssim20\%$ of seed MBH pairs merge at $z>3$ after considering both unresolved DF evolution and binary hardening. These $z>3$ seed-mass mergers are hosted in a biased population of galaxies with the highest stellar masses of $>10^9\,M_\odot$. With the higher initial eccentricity prediction from Astrid, we estimate an expected merger rate of $0.3-0.7$ per year from the $z>3$ MBH population. This is a factor of $\sim 7$ higher than the prediction using the circular orbit assumption. The LISA events are expected at a similar rate, and comprise $\gtrsim 60\%$ seed-seed mergers, $\sim 30\%$ involving only one seed-mass MBH, and $\sim 10\%$ mergers of non-seed MBHs.

preprint, 2021, arXiv

Phases of learning dynamics in artificial neural networks: with or without mislabeled data

Despite the tremendous success of deep neural networks in machine learning, the underlying reason for their superior learning capability remains unclear. Here, we present a framework based on statistical physics to study the dynamics of stochastic gradient descent (SGD) that drives learning in neural networks. By using the minibatch gradient ensemble, we construct order parameters to characterize the dynamics of weight updates in SGD. Without mislabeled data, we find that the SGD learning dynamics transitions from a fast learning phase to a slow exploration phase, which is associated with large changes in order parameters that characterize the alignment of SGD gradients and their mean amplitude. In the case with randomly mislabeled samples, SGD learning dynamics falls into four distinct phases. The system first finds solutions for the correctly labeled samples in phase I; it then wanders around these solutions in phase II until it finds a direction to learn the mislabeled samples during phase III, after which it finds solutions that satisfy all training samples during phase IV. Correspondingly, the test error decreases during phase I and remains low during phase II; however, it increases during phase III and reaches a high plateau during phase IV. The transitions between different phases can be understood by changes of order parameters that characterize the alignment of mean gradients for the correctly and incorrectly labeled samples and their (relative) strength during learning. We find that individual sample losses for the two datasets are most separated during phase II, which leads to a cleaning process to eliminate mislabeled samples for improving generalization.

preprint, 2021, arXiv

The DESI $N$-body Simulation Project I: Testing the Robustness of Simulations for the DESI Dark Time Survey

Analysis of large galaxy surveys requires confidence in the robustness of numerical simulation methods. The simulations are used to construct mock galaxy catalogs to validate data analysis pipelines and identify potential systematics. We compare three $N$-body simulation codes, ABACUS, GADGET, and SWIFT, to investigate the regimes in which their results agree. We run $N$-body simulations at three different mass resolutions, $6.25\times10^{8}$, $2.11\times10^{9}$, and $5.00\times10^{9}~h^{-1}$M$_{\odot}$, matching phases to reduce the noise within the comparisons. We find systematic errors in the halo clustering between different codes are smaller than the DESI statistical error for $s > 20\, h^{-1}$Mpc in the correlation function in redshift space. Through the resolution comparison we find that simulations run with a mass resolution of $2.1\times10^{9}~h^{-1}$M$_{\odot}$ are sufficiently converged for systematic effects in the halo clustering to be smaller than the DESI statistical error at scales larger than $20 \, h^{-1}$Mpc. These findings show that the simulations are robust for extracting cosmological information from large scales, which is the key goal of the DESI survey. Comparing matter power spectra, we find the codes agree to within 1% for $k \leq 10~h$Mpc$^{-1}$. We also run a comparison of three initial condition generation codes and find good agreement. In addition, we include a quasi-$N$-body code, FastPM, since we plan to use it for certain DESI analyses. The impact of the halo definition and galaxy-halo relation will be presented in a follow-up study.

preprint, 2020, arXiv

Cosmic variance of $z>7$ galaxies: Prediction from BlueTides

In the coming decade, a new generation of telescopes, including JWST and WFIRST, will probe the period of the formation of the first galaxies and quasars, and open up the last frontier for structure formation. Recent simulations as well as observations have suggested that these galaxies are strongly clustered (with large scale bias $\gtrsim6$), and therefore have significant cosmic variance. In this work, we use BlueTides, the largest volume cosmological simulation of galaxy formation, to directly estimate the cosmic variance for current and upcoming surveys. Given its resolution and volume, BlueTides can probe the bias and cosmic variance of $z>7$ galaxies between magnitudes $M_{UV}\sim-16$ and $M_{UV}\sim-22$ over survey areas from $\sim0.1\ \mathrm{arcmin}^2$ to $\sim 10~\mathrm{deg}^2$. Within this regime, the cosmic variance decreases with survey area/volume as a power law with exponents between $\sim-0.25$ and $\sim-0.45$. For the planned $10~\mathrm{deg}^2$ field of WFIRST, the cosmic variance is between $3\%$ and $10\%$. Upcoming JWST medium/deep surveys with areas up to $A\sim100\ \mathrm{arcmin}^2$ will have cosmic variance ranging from $\sim 20-50\%$. Lensed surveys have the highest cosmic variance $\gtrsim 40\%$; the cosmic variance of $M_{UV}\lesssim-16$ galaxies is $\lesssim100\%$ up to $z\sim11$. At higher redshifts such as $z\sim12~(14)$, effective volumes of $\gtrsim(8~\mathrm{Mpc}/h)^3$ ($\gtrsim(12\ \mathrm{Mpc}/h)^3$) are required to limit the cosmic variance to within $100\%$. Finally, we find that cosmic variance is larger than Poisson variance and forms the dominant component of the overall uncertainty in all current and upcoming surveys. We present our calculations in the form of simple fitting functions and an online cosmic variance calculator (CV_AT_COSMIC_DAWN) which we publicly release.

preprint, 2020, arXiv

Exclusive quarkonium production or decay in soft gluon factorization

In this paper, we study the application of the recently proposed soft gluon factorization (SGF) to exclusive quarkonium production or decay. We find that in the nonrelativistic QCD factorization framework there are too many nonperturbative parameters. Thanks to the factorization of kinematical physics from dynamical physics, the SGF significantly reduces the number of nonperturbative parameters. Therefore, the SGF can improve our predictive power for exclusive quarkonium production or decay. When applied to $η_c+γ$ production at B-factories, our result is the closest to the data among all theoretical calculations.

preprint, 2020, arXiv

How neural networks find generalizable solutions: Self-tuned annealing in deep learning

Despite the tremendous success of the Stochastic Gradient Descent (SGD) algorithm in deep learning, little is known about how SGD finds generalizable solutions in the high-dimensional weight space. By analyzing the learning dynamics and loss function landscape, we discover a robust inverse relation between the weight variance and the landscape flatness (inverse of curvature) for all SGD-based learning algorithms. To explain the inverse variance-flatness relation, we develop a random landscape theory, which shows that the SGD noise strength (effective temperature) depends inversely on the landscape flatness. Our study indicates that SGD attains a self-tuned landscape-dependent annealing strategy to find generalizable solutions at the flat minima of the landscape. Finally, we demonstrate how these new theoretical insights lead to more efficient algorithms, e.g., for avoiding catastrophic forgetting.

preprint2020arXiv

Imaging Systematics and Clustering of DESI Main Targets

We evaluate the impact of imaging systematics on the clustering of luminous red galaxies (LRG), emission-line galaxies (ELG) and quasars (QSO) targeted for the upcoming Dark Energy Spectroscopic Instrument (DESI) survey. Using Data Release 7 of the DECam Legacy Survey, we study the effects of astrophysical foregrounds, stellar contamination, differences between north galactic cap and south galactic cap measurements, and variations in imaging depth, stellar density, galactic extinction, seeing, airmass, sky brightness, and exposure time before presenting survey masks and weights to mitigate these effects. With our sanitized samples in hand, we conduct a preliminary analysis of the clustering amplitude and evolution of the DESI main targets. From measurements of the angular correlation functions, we determine power law fits $r_0 = 7.78 \pm 0.26$ $h^{-1}$Mpc, $γ= 1.98 \pm 0.02$ for LRGs and $r_0 = 5.45 \pm 0.1$ $h^{-1}$Mpc, $γ= 1.54 \pm 0.01$ for ELGs. Additionally, from the angular power spectra, we measure the linear biases and model the scale dependent biases in the weakly nonlinear regime. Both sets of clustering measurements show good agreement with survey requirements for LRGs and ELGs, attesting that these samples will enable DESI to achieve precise cosmological constraints. We also present clustering as a function of magnitude, use cross-correlations with external spectroscopy to infer $dN/dz$ and measure clustering as a function of luminosity, and probe higher order clustering statistics through counts-in-cells moments.

preprint2020arXiv

Large scale simulations of H and He reionization and heating driven by stars and more energetic sources

We present simulations of cosmic reionization and reheating from $z=18$ to $z=5$, investigating the role of stars (emitting soft UV-photons), nuclear black holes (BHs, with power-law spectra), X-ray binaries (XRBs, with hard X-ray dominated spectra), and the supernova-associated thermal bremsstrahlung of the diffuse interstellar medium (ISM, with soft X-ray spectra). We post-process the hydrodynamical simulation Massive-Black II (MBII) with multifrequency ionizing radiative transfer. The source properties are directly derived from the physical environment of MBII, and our only real free parameter is the ionizing escape fraction $f_{\rm esc}$. We find that, among the models explored here, the one with an escape fraction that decreases with decreasing redshift yields results most in line with observations, such as of the neutral hydrogen fraction and the Thomson scattering optical depth. Stars are the main driver of hydrogen reionization and consequently of the thermal history of the intergalactic medium (IGM). We obtain $\langle x_{\rm HII} \rangle = 0.99998$ at $z=6$ for all source types, with volume averaged temperatures $\langle T \rangle \sim 20,000~{\rm K}$. BHs are rare and negligible to hydrogen reionization, but conversely they are the only sources which can fully ionize helium, increasing local temperatures by $\sim 10^4~{\rm K}$. The thermal and ionization state of the neutral and lowly ionized hydrogen differs significantly with different source combinations, with ISM and (to a lesser extent) XRBs, playing a significant role and, as a consequence, determining the transition from absorption to emission of the 21 cm signal from neutral hydrogen.

preprint2020arXiv

Mesorasi: Architecture Support for Point Cloud Analytics via Delayed-Aggregation

Point cloud analytics is poised to become a key workload on battery-powered embedded and mobile platforms in a wide range of emerging application domains, such as autonomous driving, robotics, and augmented reality, where efficiency is paramount. This paper proposes Mesorasi, an algorithm-architecture co-designed system that simultaneously improves the performance and energy efficiency of point cloud analytics while retaining its accuracy. Our extensive characterizations of state-of-the-art point cloud algorithms show that, while structurally reminiscent of convolutional neural networks (CNNs), point cloud algorithms exhibit inherent compute and memory inefficiencies due to the unique characteristics of point cloud data. We propose delayed-aggregation, a new algorithmic primitive for building efficient point cloud algorithms. Delayed-aggregation hides the performance bottlenecks and reduces the compute and memory redundancies by exploiting the approximately distributive property of key operations in point cloud algorithms. Delayed-aggregation lets point cloud algorithms achieve 1.6x speedup and 51.1% energy reduction on a mobile GPU while retaining the accuracy (-0.9% loss to 1.2% gains). To maximize the algorithmic benefits, we propose minor extensions to contemporary CNN accelerators, which can be integrated into a mobile Systems-on-a-Chip (SoC) without modifying other SoC components. With additional hardware support, Mesorasi achieves up to 3.6x speedup.

preprint2020arXiv

More accurate simulations with separate initial conditions for baryons and dark matter

We revisit techniques for performing cosmological simulations with both baryons and cold dark matter when each fluid has different initial conditions, as is the case at the end of the radiation era. Most simulations do not reproduce the linear prediction for the difference between the cold dark matter and baryon perturbations. We show that this is due to the common use of offset regular grids when setting up the particle initial conditions. The correct behaviour can be obtained without any loss of simulation resolution by using a Lagrangian glass for the baryon particles. We further show that the difference between cold dark matter and baryons may affect predictions for the Lyman-alpha forest flux power spectrum at the 5% level, potentially impacting current cosmological constraints.

preprint2020arXiv

Nebular Line Emission During the Epoch of Reionization

Nebular emission lines associated with galactic HII regions carry information about both physical properties of the ionised gas and the source of ionising photons as well as providing the opportunity of measuring accurate redshifts and thus distances once a cosmological model is assumed. While nebular line emission has been extensively studied at lower redshift, there are currently only a few constraints within the epoch of reionisation (EoR, $z>6$), chiefly due to the lack of sensitive near-IR spectrographs. However, this will soon change with the arrival of the Webb Telescope providing sensitive near-IR spectroscopy covering the rest-frame UV and optical emission of galaxies in the EoR. In anticipation of Webb we combine the large cosmological hydrodynamical simulation Bluetides with photoionisation modelling to predict the nebular emission line properties of galaxies at $z=8\to 13$. We find good agreement with the, albeit limited, existing direct and indirect observational constraints on equivalent widths though poorer agreement with luminosity function constraints.

preprint2020arXiv

Phase separation in the advective Cahn-Hilliard equation

The Cahn--Hilliard equation is a classic model of phase separation in binary mixtures that exhibits spontaneous coarsening of the phases. We study the Cahn--Hilliard equation with an imposed advection term in order to model the stirring and eventual mixing of the phases. The main result is that if the imposed advection is sufficiently mixing then no phase separation occurs, and the solution instead converges exponentially to a homogeneous mixed state. The mixing effectiveness of the imposed drift is quantified in terms of the dissipation time of the associated advection-hyperdiffusion equation, and we produce examples of velocity fields with a small dissipation time. We also study the relationship between this quantity and the dissipation time of the standard advection-diffusion equation.

preprint2020arXiv

Real-Time Spatio-Temporal LiDAR Point Cloud Compression

Compressing massive LiDAR point clouds in real-time is critical to autonomous machines such as drones and self-driving cars. While most of the recent prior work has focused on compressing individual point cloud frames, this paper proposes a novel system that effectively compresses a sequence of point clouds. The idea is to exploit both the spatial and temporal redundancies in a sequence of point cloud frames. We first identify a key frame in a point cloud sequence and spatially encode the key frame by iterative plane fitting. We then exploit the fact that consecutive point clouds have large overlaps in the physical space, and thus spatially encoded data can be (re-)used to encode the temporal stream. Temporal encoding by reusing spatial encoding data not only improves the compression rate, but also avoids redundant computations, which significantly improves the compression speed. Experiments show that our compression system achieves 40x to 90x compression rate, significantly higher than MPEG's LiDAR point cloud compression standard, while retaining high end-to-end application accuracies. Meanwhile, our compression system has a compression speed that matches the point cloud generation rate of today's LiDARs and outperforms existing compression systems, enabling real-time point cloud transmission.

preprint2020arXiv

Singular hyperbolic metrics and negative subharmonic functions

We propose a conjecture that the monodromy group of a singular hyperbolic metric on a non-hyperbolic Riemann surface is {\it Zariski dense} in ${\rm PSL}(2,\,{\Bbb R})$. By using meromorphic differentials and affine connections, we obtain evidence for the conjecture: the monodromy group of the singular hyperbolic metric cannot be contained in four classes of one-dimensional Lie subgroups of ${\rm PSL}(2,\,{\Bbb R})$. Moreover, we confirm the conjecture if the Riemann surface is one of the once punctured Riemann sphere, the twice punctured Riemann sphere, a once punctured torus, or a compact Riemann surface.

preprint2020arXiv

The complete study on polarization of $Υ(nS)$ hadroproduction at QCD next-to-leading order

Applying the nonrelativistic quantum chromodynamics factorization formalism to the $Υ(1S,2S,3S)$ hadroproduction, a complete analysis of the polarization parameters $λ_θ$, $λ_{θϕ}$, $λ_ϕ$ for the production is presented at QCD next-to-leading order. With the long-distance matrix elements extracted from experimental data for the production rate and polarization parameter $λ_θ$ of $Υ$ hadroproduction, our results provide a good description of the measured parameters $λ_{θϕ}$ and $λ_ϕ$ in both the helicity and the Collins-Soper frames. In our calculations the frame invariant parameter $\tildeλ$ is consistent in the two frames. Finally, it is pointed out that there are discrepancies for $\tildeλ$ between available experimental data and corresponding theoretical predictions.

preprint2020arXiv

The early growth of supermassive black holes in cosmological hydrodynamic simulations with constrained Gaussian realizations

The paper examines the early growth of supermassive black holes (SMBHs) in cosmological hydrodynamic simulations with different BH seeding scenarios. Employing the constrained Gaussian realization, we reconstruct the initial conditions in the large-volume BlueTides simulation and run them to $z=6$ to cross-validate that the method reproduces the first quasars and their environments. Our constrained simulations in a volume of $(15\, h^{-1}{\rm Mpc})^3$ successfully recover the evolution of large-scale structure and the stellar and BH masses in the vicinity of a $\sim10^{12}\, M_{\odot}$ halo which we identified in BlueTides at $z\sim7$ hosting a $\sim10^9\, M_{\odot}$ SMBH. Among our constrained simulations, only the ones with a low-tidal field and high-density peak in the initial conditions induce the fastest BH growth required to explain the $z>6$ quasars. We run two sets of simulations with different BH seed masses of $5\times10^3$, $5\times10^4$, and $5\times10^5\, h^{-1}M_{\odot}$, (a) with the same ratio of halo to BH seed mass and (b) with the same halo threshold mass. At $z=6$, all the SMBHs converge in mass to $\sim10^9\, M_{\odot}$ except for the one with the smallest seed in (b) undergoing critical BH growth and reaching $10^8$ -- $10^9\, M_{\odot}$, albeit with most of the growth in (b) delayed compared to set (a). The finding of eight BH mergers in the small-seed scenario (four with masses $10^4$ -- $10^6\, M_{\odot}$ at $z>12$), six in the intermediate-seed scenario, and zero in the large-seed scenario suggests that the vast BHs in the small-seed scenario merge frequently during the early phases of the growth of SMBHs. The increased BH merger rate for the low-mass BH seed and halo threshold scenario provides an exciting prospect for discriminating BH formation mechanisms with the advent of multi-messenger astrophysics and next-generation gravitational wave facilities.

preprint2019arXiv

High mass and halo resolution from fast low resolution simulations

Generating mocks for future sky surveys requires large volumes and high resolutions, which is computationally expensive even for fast simulations. In this work we try to develop numerical schemes to calibrate various halo and matter statistics in fast low resolution simulations compared to high resolution N-body and hydrodynamic simulations. For the halos, we improve the initial condition accuracy and develop a halo finder "relaxed-FoF", where we allow different linking lengths for different halo masses and velocity dispersions. We show that our relaxed-FoF halo finder improves the common statistics, such as halo bias, halo mass function, halo auto power spectrum in real space and in redshift space, cross correlation coefficient with the reference halo catalog, and halo-matter cross power spectrum. We also incorporate the potential gradient descent (PGD) method into fast simulations to improve the matter distribution at nonlinear scales. By building a lightcone output, we show that the PGD method significantly improves the weak lensing convergence tomographic power spectrum. With these improvements FastPM is comparable to the high resolution full N-body simulation of the same mass resolution, with two orders of magnitude fewer time steps. These techniques can be used to improve the halo and matter statistics of FastPM simulations for mock catalogs of future surveys such as DESI and LSST.

preprint2019arXiv

Neutron Spin Resonance in the Heavily Hole-doped KFe$_{2}$As$_{2}$ Superconductor

We report high-resolution neutron scattering measurements of the low energy spin fluctuations of KFe$_{2}$As$_{2}$, the end member of the hole-doped Ba$_{1-x}$K$_x$Fe$_2$As$_2$ family with only hole pockets, above and below its superconducting transition temperature $T_c$ ($\sim$ 3.5 K). Our data reveals clear spin fluctuations at the incommensurate wave vector ($0.5\pmδ$, 0, $L$), ($δ$ = 0.2)(1-Fe unit cell), which exhibit $L$-modulation peaking at $L=0.5$. Upon cooling to the superconducting state, the incommensurate spin fluctuations gradually open a spin-gap and form a sharp spin resonance mode. The incommensurability ($2δ$ = 0.4) of the resonance mode ($\sim1.2$ meV) is considerably larger than the previously reported value ($2δ$ $\approx0.32$) at higher energies ($\ge\sim6$ meV). The determination of the momentum structure of spin fluctuation in the low energy limit allows a direct comparison with the realistic Fermi surface and superconducting gap structure. Our results point to an $s$-wave pairing with a reversed sign between the hole pockets near the zone center in KFe$_{2}$As$_{2}$.

preprint2019arXiv

QSO obscuration at high redshift ($z \gtrsim 7$): Predictions from the BlueTides simulation

High-$z$ AGNs hosted in gas rich galaxies are expected to grow through significantly obscured accretion phases. This may limit or bias their observability. In this work, we use \textsc{BlueTides}, a large volume cosmological simulation of galaxy formation, to examine quasar obscuration for the highest-redshift ($z \geq 7$) supermassive black holes residing in the center of galaxies. We find that for the bright quasars, most of the high column density gas ($>90\%$) resides in the innermost regions of the host galaxy (typically at $< 10$ ckpc), while the gas in the outskirts is a minor contributor to the $N_\mathrm H$. The brightest quasars can have large angular variations in galactic obscuration, over 2 orders of magnitude, where the lines of sight with the lowest obscuration are those formed via strong gas outflows driven by AGN feedback. We find that for the overall AGN population, the mean $N_\mathrm H$ is generally larger for high luminosity and BH mass, while the $N_\mathrm H$ distribution is significantly broadened, developing a low $N_\mathrm H $ wing due to the angular variations driven by the AGN outflows/feedback. The obscured fraction P($N_{\rm H} > 10^{23} {\rm cm}^{-2}$) typically ranges from 0.6 to 1.0 for increasing $L_{X}$ (with $L_X > 10^{43} \rm{ergs/s}$), with no clear trend of redshift evolution. With respect to the host galaxy properties, we find linear relations in log space between $N_{\rm H}$ and both $M_*$ and $M_{\rm H_2}$, with $\log N_{\rm H} = (0.24 \pm 0.03) \log M_{*} + (20.7 \pm 0.3)$ and $\log N_{\rm H} = (0.47 \pm 0.03) \log M_{\rm H_2} + (18.4 \pm 0.3)$. The dust optical depth in the UV band $τ_{\mathrm UV}$ has a tight positive correlation with $N_{\rm H}$. Our dust extincted UVLF is about 1.5 dex lower than the intrinsic UVLF, implying that more than 99\% of the $z \sim 7$ AGNs are heavily dust extincted and therefore would be missed by the UV band observation.
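The two fits quoted in this abstract can be evaluated directly. The sketch below uses only the central values of the slopes and intercepts and ignores the quoted uncertainties. It assumes base-10 logarithms, masses in solar masses and $N_{\rm H}$ in cm$^{-2}$; the abstract does not state these conventions, and the function names are hypothetical.

```python
import math

# Central values of the fits quoted in the abstract (uncertainties ignored).
SLOPE_MSTAR, INTERCEPT_MSTAR = 0.24, 20.7
SLOPE_MH2, INTERCEPT_MH2 = 0.47, 18.4


def log_nh_from_stellar_mass(m_star: float) -> float:
    """log10 of N_H predicted from stellar mass (assumed in solar masses)."""
    if m_star <= 0:
        raise ValueError("m_star must be positive")
    return SLOPE_MSTAR * math.log10(m_star) + INTERCEPT_MSTAR


def log_nh_from_h2_mass(m_h2: float) -> float:
    """log10 of N_H predicted from molecular gas mass (assumed in solar masses)."""
    if m_h2 <= 0:
        raise ValueError("m_h2 must be positive")
    return SLOPE_MH2 * math.log10(m_h2) + INTERCEPT_MH2


if __name__ == "__main__":
    for mass in (1e9, 1e10, 1e11):
        print(f"M = {mass:.0e}: "
              f"log N_H (M*) = {log_nh_from_stellar_mass(mass):.2f}, "
              f"log N_H (M_H2) = {log_nh_from_h2_mass(mass):.2f}")
```

Under these assumptions both relations give $\log N_{\rm H} \approx 23.1$ at a mass of $10^{10}$, right at the $10^{23}\,{\rm cm}^{-2}$ threshold used for the obscured fraction.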

preprint2019arXiv

SciPy 1.0--Fundamental Algorithms for Scientific Computing in Python

SciPy is an open source scientific computing library for the Python programming language. SciPy 1.0 was released in late 2017, about 16 years after the original version 0.1 release. SciPy has become a de facto standard for leveraging scientific algorithms in the Python programming language, with more than 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories, and millions of downloads per year. This includes usage of SciPy in almost half of all machine learning projects on GitHub, and usage by high profile projects including LIGO gravitational wave analysis and creation of the first-ever image of a black hole (M87). The library includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics. In this work, we provide an overview of the capabilities and development practices of the SciPy library and highlight some recent technical developments.

preprint2016arXiv

Component-based Synthesis of Table Consolidation and Transformation Tasks from Examples

This paper presents an example-driven synthesis technique for automating a large class of data preparation tasks that arise in data science. Given a set of input tables and an output table, our approach synthesizes a table transformation program that performs the desired task. Our approach is not restricted to a fixed set of DSL constructs and can synthesize programs from an arbitrary set of components, including higher-order combinators. At a high level, our approach performs type-directed enumerative search over partial programs but incorporates two key innovations that allow it to scale: First, our technique can utilize any first-order specification of the components and uses SMT-based deduction to reject partial programs. Second, our algorithm uses partial evaluation to increase the power of deduction and drive enumerative search. We have evaluated our synthesis algorithm on dozens of data preparation tasks obtained from on-line forums, and we show that our approach can automatically solve a large class of problems encountered by R users.

preprint2016arXiv

Forecasts for the WFIRST High Latitude Survey using the BlueTides Simulation

We use the BlueTides simulation to predict the properties of the high-$z$ galaxy and active galactic nuclei (AGN) populations for the planned 2200 deg$^2$ Wide-Field Infrared Survey Telescope's (WFIRST)-AFTA High Latitude Survey (HLS). BlueTides is a cosmological hydrodynamic simulation, which incorporates a variety of baryon physics in a $(400h^{-1} \mathrm{Mpc})^3$ volume evolved to $z=8$ with 0.7 trillion particles. The galaxy luminosity functions in the simulation show good agreement with all the current observational constraints (up to $z=11$) and predict an enhanced number of UV bright galaxies. At the proposed depth of the HLS ($m < 26.75$), BlueTides predicts $10^6$ galaxies at $z=8$ with a few up to $z\sim 15$ due to the enhanced bright end of the galaxy luminosity function. At $z=8$, galaxies in the mock HLS have specific star formation rates of $\sim 10 {\rm Gyr}^{-1}$ and ages of $\sim 80 {\rm Myr}$ (both evolving linearly with redshift) and a non-evolving mass-metallicity relation. BlueTides also predicts $\sim 10^4$ AGN in WFIRST HLS from $z=8$ out to $z\sim 14$. These AGN host black holes of $M\sim 10^6-10^8 M_\odot$ accreting close to their Eddington luminosity. Galaxies and AGN have host halo masses of $M_{halo}\sim 10^{11-12} M_\odot$ and a linear bias $b\approx 13-20$. Given the expected galaxy space densities, their high bias and the large volume probed, we speculate that it may be feasible for WFIRST HLS to detect the Baryon Acoustic Oscillation peak in the galaxy power spectrum out to $z=8-9$.

preprint2016arXiv

Monsters in the Dark: Predictions for Luminous Galaxies in the Early Universe from the BlueTides Simulation

Using deep Hubble and Spitzer observations Oesch et al. (2016) have identified a bright ($M_{\rm UV}\approx -22$) star forming galaxy candidate at $z \approx 11$. The presence of GN-$z11$ implies a number density $\sim 10^{-6}\,{\rm Mpc^{-3}}$, roughly an order of magnitude higher than the expected value based on extrapolations from lower redshift. Using the unprecedented volume and high resolution of the BlueTides cosmological hydrodynamical simulation, we study the population of luminous rare objects at $z > 10$. The luminosity function in BlueTides implies an enhanced number of massive galaxies, consistent with the observation of GN-$z11$. We find about 30 galaxies at $M_{\rm UV}\approx -22$ at $z = 11$ in the BlueTides volume, including a few objects about 1.5 magnitudes brighter. The probability of observing GN-$z11$ in the volume probed by Oesch et al. (2016) is $\sim 13$ per cent. The predicted properties of the rare bright galaxies at $z = 11$ in BlueTides closely match those inferred from the observations of GN-$z11$. BlueTides predicts a negligible contribution from faint AGN in the observed SED. The enormous increase in volume surveyed by WFIRST will provide observations of $\sim1000$ galaxies with $M_{\rm UV} < -22$ beyond $z = 11$ out to $z = 13.5$.

preprint2016arXiv

On Energy Efficiency of the Nearest-Neighbor Cooperative Communication in Heterogeneous Networks

In this paper, we consider a two-dimensional heterogeneous cellular network scenario consisting of one base station (BS) and some mobile stations (MSs) whose locations follow a Poisson point process (PPP). The MSs are equipped with multiple radio access interfaces including a cellular access interface and at least one short-range communication interface. We propose a nearest-neighbor cooperation communication (NNCC) scheme by exploiting the short-range communication between a MS and its nearest neighbor to collaborate on their uplink transmissions. In the proposed cooperation scheme, a MS and its nearest neighbor first exchange data by the short-range communication. Upon successful decoding of the data from each other, they proceed to send their own data, as well as the data received from the other to the BS respectively in orthogonal time slots. The energy efficiency analysis for the proposed scheme is presented based on the characteristics of the PPP and the Rayleigh fading channel. Numerical results show that the NNCC scheme significantly improves the energy efficiency compared to the conventional non-cooperative uplink transmissions.

preprint2016arXiv

Perturbation theory, effective field theory, and oscillations in the power spectrum

We explore the relationship between the nonlinear matter power spectrum and the various Lagrangian and Standard Perturbation Theories (LPT and SPT). We first look at it in the context of one dimensional (1-d) dynamics, where 1LPT is exact at the perturbative level and one can exactly resum the SPT series into the 1LPT power spectrum. Shell crossings lead to non-perturbative effects, and the PT ignorance can be quantified in terms of their ratio, which is also the transfer function squared in the absence of stochasticity. At the order of PT we work, this parametrization is equivalent to the results of effective field theory (EFT), and can thus be expanded in terms of the same parameters. We find that its radius of convergence is larger than the SPT loop expansion. The same EFT parametrization applies to all SPT loop terms and, if stochasticity can be ignored, to all N-point correlators. In 3-d, the LPT structure is considerably more complicated, and we find that LPT models with parametrization motivated by the EFT exhibit running with $k$ and that SPT is generally a better choice. Since these transfer function expansions contain free parameters that change with cosmological model their usefulness for broadband power is unclear. For this reason we test the predictions of these models on baryonic acoustic oscillations (BAO) and other primordial oscillations, including string monodromy models, for which we ran a series of simulations with and without oscillations. Most models are successful in predicting oscillations beyond their corresponding PT versions, confirming the basic validity of the model.

preprint2016arXiv

The DESI Experiment Part I: Science, Targeting, and Survey Design

DESI (Dark Energy Spectroscopic Instrument) is a Stage IV ground-based dark energy experiment that will study baryon acoustic oscillations (BAO) and the growth of structure through redshift-space distortions with a wide-area galaxy and quasar redshift survey. To trace the underlying dark matter distribution, spectroscopic targets will be selected in four classes from imaging data. We will measure luminous red galaxies up to $z=1.0$. To probe the Universe out to even higher redshift, DESI will target bright [O II] emission line galaxies up to $z=1.7$. Quasars will be targeted both as direct tracers of the underlying dark matter distribution and, at higher redshifts ($ 2.1 < z < 3.5$), for the Ly-$α$ forest absorption features in their spectra, which will be used to trace the distribution of neutral hydrogen. When moonlight prevents efficient observations of the faint targets of the baseline survey, DESI will conduct a magnitude-limited Bright Galaxy Survey comprising approximately 10 million galaxies with a median $z\approx 0.2$. In total, more than 30 million galaxy and quasar redshifts will be obtained to measure the BAO feature and determine the matter power spectrum, including redshift space distortions.

preprint2016arXiv

The DESI Experiment Part II: Instrument Design

DESI (Dark Energy Spectroscopic Instrument) is a Stage IV ground-based dark energy experiment that will study baryon acoustic oscillations and the growth of structure through redshift-space distortions with a wide-area galaxy and quasar redshift survey. The DESI instrument is a robotically-actuated, fiber-fed spectrograph capable of taking up to 5,000 simultaneous spectra over a wavelength range from 360 nm to 980 nm. The fibers feed ten three-arm spectrographs with resolution $R= λ/Δλ$ between 2000 and 5500, depending on wavelength. The DESI instrument will be used to conduct a five-year survey designed to cover 14,000 deg$^2$. This powerful instrument will be installed at prime focus on the 4-m Mayall telescope in Kitt Peak, Arizona, along with a new optical corrector, which will provide a three-degree diameter field of view. The DESI collaboration will also deliver a spectroscopic pipeline and data management system to reduce and archive all data for eventual public use.

preprint2016arXiv

The Lyman-continuum photon production efficiency in the high-redshift Universe

The Lyman Continuum photon production efficiency ($ξ_{\rm ion}$) is a critical ingredient for inferring the number of photons available to reionise the intergalactic medium. To estimate the theoretical production efficiency in the high-redshift Universe we couple the BlueTides cosmological hydrodynamical simulation with a range of stellar population synthesis models. We find Lyman Continuum photon production efficiencies of $\log_{10}(ξ_{\rm ion}/{\rm erg^{-1}\, Hz})\approx 25.1-25.5$ depending on the choice of stellar population synthesis model. These results are broadly consistent with recent observational constraints at high redshift, though they favour a model incorporating the effects of binary evolution.

preprint2016arXiv

The Photometric Properties of Galaxies in the Early Universe

We use the large cosmological hydrodynamic simulation BlueTides to predict the photometric properties of galaxies during the epoch of reionisation ($z=8-15$). These properties include the rest-frame UV to near-IR broadband spectral energy distributions, the Lyman continuum photon production, the UV star formation rate calibration, and intrinsic UV continuum slope. In particular we focus on exploring the effect of various modelling assumptions, including the assumed choice of stellar population synthesis model, initial mass function, and the escape fraction of Lyman continuum photons, upon these quantities. We find that these modelling assumptions can have a dramatic effect on photometric properties, with consequences for the accurate determination of physical properties from observations. For example, at $z=8$ we predict that nebular emission can account for up to $50\%$ of the rest-frame $R$-band luminosity, while the choice of stellar population synthesis model can change the Lyman continuum production rate by up to a factor of $2$.

preprint2016arXiv

Transition from sign-reversed to sign-preserved Cooper-pairing symmetry in sulfur-doped iron selenide superconductors

An essential step toward elucidating the mechanism of superconductivity is to determine the sign/phase of the superconducting order parameter, as it is closely related to the pairing interaction. In conventional superconductors, the electron-phonon interaction induces attraction between electrons near the Fermi energy and results in a sign-preserved $s$-wave pairing. For high-temperature superconductors, including cuprates and iron-based superconductors, prevalent weak coupling theories suggest that the electron pairing is mediated by spin fluctuations which lead to repulsive interactions, and therefore that a sign-reversed pairing with an $s^{\pm}$- or $d$-wave symmetry is favored. Here, by using magnetic neutron scattering, a phase sensitive probe of the superconducting gap, we report the observation of a transition from the sign-reversed to sign-preserved Cooper-pairing symmetry with insignificant changes in $T_c$ in the S-doped iron selenide superconductors K$_x$Fe$_{2-y}$(Se$_{1-z}$S$_z$)$_2$. We show that a rather sharp magnetic resonant mode well below the superconducting gap ($2Δ$) in the undoped sample ($z = 0$) is replaced by a broad hump structure above $2Δ$ under 50% S doping. These results cannot be readily explained by simple spin fluctuation-exchange pairing theories and, therefore, multiple pairing channels are required to describe superconductivity in this system. Our findings may also yield a simple explanation for the sometimes contradictory data on the sign of the superconducting order parameter in iron-based materials.

preprint2016arXiv

Type-Directed Code Reuse using Integer Linear Programming

In many common scenarios, programmers need to implement functionality that is already provided by some third party library. This paper presents a tool called Hunter that facilitates code reuse by finding relevant methods in large code bases and automatically synthesizing any necessary wrapper code. The key technical idea underlying our approach is to use types to both improve search results and guide synthesis. Specifically, our method computes similarity metrics between types and uses this information to solve an integer linear programming (ILP) problem in which the objective is to minimize the cost of synthesis. We have implemented Hunter as an Eclipse plug-in and evaluate it by (a) comparing it against S6, a state-of-the-art code reuse tool, and (b) performing a user study. Our evaluation shows that Hunter compares favorably with S6 and significantly increases programmer productivity.

preprint2015arXiv

An Updated Study for $Υ$ Production and Polarization at the Tevatron and LHC

Following the nonrelativistic QCD factorization scheme, and taking the latest available measurement on $χ_b(3P)$ into consideration, we present an updated study on the yield and polarization of $Υ(1S,2S,3S)$ hadroproduction, and the fractions of $χ_b(mP)$ feed-down in $Υ(nS)$ production at QCD next-to-leading order. In the fitting, three schemes are applied with different choices of $χ_b(mP)$ feed-down ratios and NRQCD factorization scale. The results can explain the measurements on yield very well as in our previous work. The polarization puzzle for $Υ(3S)$ is now solved by considering the $χ_b(3P)$ feed-down contributions. The ratio of $σ[χ_{b2}(1P)]/σ[χ_{b1}(1P)]$ measurements from CMS can also be reproduced in our prediction. Among the different schemes, the results show little difference, but there are sizeable differences for the fitted long-distance color-octet matrix elements. It may bring large uncertainty when the values are applied in other experiments such as in $ee,~ep$ colliders.

preprint2015arXiv

Energy Dependence of Direct-Quarkonium Production in pp Collisions from Fixed-Target to LHC Energies: Complete One-Loop Analysis

We compute the energy dependence of the P_T-integrated cross section of directly produced quarkonia in pp collisions at next-to-leading order (NLO), namely up to alpha_s^3, within nonrelativistic QCD (NRQCD). Our analysis is based on the idea that the P_T-integrated and the P_T-differential cross sections can be treated as two different observables. The colour-octet NRQCD parameters needed to predict the P_T-integrated yield can thus be extracted from the fits of the P_T-differential cross sections at mid and large P_T. For the first time, the total cross section is evaluated in NRQCD at full NLO accuracy using the recent NLO fits of the P_T-differential yields at RHIC, the Tevatron and the LHC. Both the normalisation and the energy dependence of the J/psi, psi' and Upsilon(1S), we obtained, are in disagreement with the data irrespective of the fit method. The same is true if one uses CEM-like colour-octet NRQCD parameters. If, on the contrary, one disregards the colour-octet contribution, the existing data in the TeV range are well described by the alpha_s^3 contribution in the colour-singlet model --which, at alpha_s^4, however shows an unphysical energy dependence. A similar observation is made for eta(c,b). This calls for a full NNLO or for a resummation of the initial-state radiation in this channel. In any case, past claims that colour-octet transitions are dominantly responsible for low-P_T quarkonium production are not supported by our results. This may impact the interpretation of quarkonium suppression in high-energy proton-nucleus and nucleus-nucleus collisions.

preprint2015arXiv

Intrinsic alignments of galaxies in the MassiveBlack-II simulation: analysis of two-point statistics

The intrinsic alignment of galaxies with the large-scale density field is an important astrophysical contaminant in upcoming weak lensing surveys. We present detailed measurements of the galaxy intrinsic alignments and associated ellipticity-direction (ED) and projected shape ($w_{g+}$) correlation functions for galaxies in the cosmological hydrodynamic MassiveBlack-II (MB-II) simulation. We carefully assess the effects on galaxy shapes, misalignment of the stellar component with the dark matter shape and two-point statistics of iterative weighted (by mass and luminosity) definitions of the (reduced and unreduced) inertia tensor. We find that iterative procedures must be adopted for a reliable measurement of the reduced tensor but that luminosity versus mass weighting has only negligible effects. Both ED and $w_{g+}$ correlations increase in amplitude with subhalo mass (in the range of $10^{10} - 6.0\times 10^{14}h^{-1}M_{\odot}$), with a weak redshift dependence (from $z=1$ to $z=0.06$) at fixed mass. At $z \sim 0.3$, we predict a $w_{g+}$ that is in reasonable agreement with SDSS LRG measurements and that decreases in amplitude by a factor of $\sim 5$--18 for galaxies in the LSST survey. We also compared the intrinsic alignments of centrals and satellites, with clear detection of satellite radial alignments within their host halos. Finally, we show that $w_{g+}$ (using subhalos as tracers of density) and $w_{δ+}$ (using dark matter density) predictions from the simulations agree with that of non-linear alignment models (NLA) at scales where the 2-halo term dominates in the correlations (and tabulate associated NLA fitting parameters). The 1-halo term induces a scale dependent bias at small scales which is not modeled in the NLA model.

preprint2015arXiv

Luminosity function of [OII] emission-line galaxies in the MassiveBlack-II simulation

We examine the luminosity function (LF) of [OII] emission-line galaxies in the high-resolution cosmological simulation MassiveBlack-II (MBII). From the spectral energy distribution of each galaxy, we select a sub-sample of star-forming galaxies at $0.06 \le z \le 3.0$ using the [OII] emission line luminosity L([OII]). We confirm that the specific star formation rate matches that in the GAMA survey. We show that the [OII] LF at z=1.0 from the MBII shows a good agreement with the LFs from several surveys below L([OII])=$10^{43.0}$ erg/s while the low redshifts ($z \le 0.3$) show an excess in the prediction of bright [OII] galaxies, but still displaying a good match with observations below L([OII])=$10^{41.6}$ erg/s. Based on the validity in reproducing the properties of [OII] galaxies at low redshift ($z \le 1$), we forecast the evolution of the [OII] LF at high redshift ($z \le 3$), which can be tested by upcoming surveys such as the HETDEX and DESI. The slopes of the LFs at bright and faint ends range from -3 to -2 showing minima at z=2. The slope of the bright end evolves approximately as 1/(z+1) at z=2 while the faint end evolves as ~3/(z+1) at $0.6 \le z \le 2$. In addition, a similar analysis is applied for the evolution of [OIII] LFs, which is to be explored in the forthcoming survey WFIRST-AFTA. Finally, we show that the auto-correlation function of [OII] and [OIII] emitting galaxies shows a rapid evolution from z=2 to 1.

preprint2015arXiv

Mock Quasar-Lyman-α Forest Data-sets for the SDSS-III Baryon Oscillation Spectroscopic Survey

We describe mock data-sets generated to simulate the high-redshift quasar sample in Data Release 11 (DR11) of the SDSS-III Baryon Oscillation Spectroscopic Survey (BOSS). The mock spectra contain Lyα forest correlations useful for studying the 3D correlation function including Baryon Acoustic Oscillations (BAO). They also include astrophysical effects such as quasar continuum diversity and high-density absorbers, instrumental effects such as noise and spectral resolution, as well as imperfections introduced by the SDSS pipeline treatment of the raw data. The Lyα forest BAO analysis of the BOSS collaboration, described in Delubac et al. 2014, has used these mock data-sets to develop and cross-check analysis procedures prior to performing the BAO analysis on real data, and for continued systematic cross checks. Tests presented here show that the simulations reproduce sufficiently well important characteristics of real spectra. These mock data-sets will be made available together with the data at the time of the Data Release 11.

preprint2015arXiv

Structural and Magnetic Phase Diagram of CrAs and its Relationship with Pressure-induced Superconductivity

Most unconventional superconductors, including cuprates and iron-based superconductors, are derived from chemical doping or application of pressure on their collinearly magnetic-ordered parent compounds[1-5]. The recently discovered pressure-induced superconductor CrAs, as a rare example of a non-collinear helimagnetic superconductor, has therefore generated great interest in understanding microscopic magnetic properties and their interplay with superconductivity [6-8]. Unlike cuprates and iron based superconductors where the magnetic moment direction barely changes upon doping, here we show that CrAs exhibits a spin reorientation from the ab plane to the ac plane, along with an abrupt drop of the magnetic propagation vector at a critical pressure (Pc~0.6 GPa). This magnetic phase transition coincides with the emergence of bulk superconductivity, indicating a direct connection between magnetism and superconductivity. With further increasing pressure, the magnetic order completely disappears near the optimal Tc regime (P~0.94 GPa). Moreover, the Cr magnetic moments between nearest neighbors tend to be aligned antiparallel with increasing pressure toward the optimal superconductivity regime. Our findings suggest that the non-collinear helimagnetic order is strongly coupled to structural and electronic degrees of freedom, and that antiferromagnetic correlations associated with the low magnetic vector phase are crucial for superconductivity.

preprint2015arXiv

The BlueTides Simulation: First Galaxies and Reionization

We introduce the BlueTides simulation and report initial results for the luminosity functions of the first galaxies and AGN, and their contribution to reionization. BlueTides was run on the BlueWaters cluster at NCSA from $z=99$ to $z=8.0$ and includes 2$\times$7040$^3$ particles in a $400$Mpc/h per side box, making it the largest hydrodynamic simulation ever performed at high redshift. BlueTides includes a pressure-entropy formulation of smoothed particle hydrodynamics, gas cooling, star formation (including molecular hydrogen), black hole growth and models for stellar and AGN feedback processes. The star formation rate density in the simulation is a good match to current observational data at $z\sim 8-10$. We find good agreement between observations and the predicted galaxy luminosity function in the currently observable range $-18\le M_{\mathrm UV} \le -22.5$ with some dust extinction required to match the abundance of brighter objects. BlueTides implements a patchy reionization model that produces a fluctuating UV background. BlueTides predicts number counts for galaxies fainter than current observational limits which are consistent with extrapolating the faint end slope of the luminosity function with a power law index $α\sim -1.8$ at $z\sim 8$ and redshift dependence of $α\sim (1+z)^{-0.4}$. The AGN population has a luminosity function well fit by a power law with a slope $α\sim -2.4$ that compares favourably with the deepest CANDELS-Goods fields. We investigate how these luminosity functions affect the progress of reionization, and find that a high Lyman-$α$ escape fraction ($f_\mathrm{esc} \sim 0.5$) is required if galaxies dominate the ionising photon budget during reionization. Smaller galaxy escape fractions imply a large contribution from faint AGN (down to $M_\mathrm{UV}=-12$) which results in a rapid reionization, disfavoured by current observations.

preprint2015arXiv

The formation of Milky Way-mass disk galaxies in the first 500 million years of a cold dark matter universe

Whether among the myriad tiny proto-galaxies there exists a population with similarities to present day galaxies is an open question. We show, using BlueTides, the first hydrodynamic simulation large enough to resolve the relevant scales, that the first massive galaxies to form are predicted to have extensive rotationally-supported disks. Although their morphology resembles in some ways Milky-way types seen at much lower redshifts, these high-redshift galaxies are smaller, denser, and richer in gas than their low redshift counterparts. From a kinematic analysis of a statistical sample of 216 galaxies at redshift $z=8-10$ we have found that disk galaxies make up 70\% of the population of galaxies with stellar mass $10^{10} M_\odot$ or greater. Cold Dark Matter cosmology therefore makes specific predictions for the population of large galaxies 500 million years after the Big Bang. We argue that wide-field satellite telescopes (e.g. WFIRST) will in the near future discover these first massive disk galaxies. The simplicity of their structure and formation history should make possible new tests of cosmology.

preprint2014arXiv

Galaxy Shapes and Intrinsic Alignments in The MassiveBlack-II Simulation

The intrinsic alignment of galaxy shapes with the large-scale density field is a contaminant to weak lensing measurements, as well as being an interesting signature of galaxy formation and evolution (albeit one that is difficult to predict theoretically). Here we investigate the shapes and relative orientations of the stars and dark matter of halos and subhalos (central and satellite) extracted from the MassiveBlack-II simulation, a state-of-the-art high resolution hydrodynamical cosmological simulation which includes stellar and AGN feedback in a volume of $(100{h^{-1}\mathrm{Mpc}})^3$. We consider redshift evolution from $z=1$ to $0.06$ and mass evolution within the range of subhalo masses, $10^{10} -6.0 \times 10^{14.0}{h^{-1}M_{\odot}}$. The shapes of the dark matter distributions are generally more round than the shapes defined by stellar matter. The projected root-mean-square (RMS) ellipticity per component for stellar matter is measured to be $e_\text{rms} = 0.28$ at $z=0.3$ for $M_{subhalo}> 10^{12.0}{h^{-1}M_{\odot}}$, which compares favourably with observational measurements. We find that the shapes of stellar and dark matter are more round for less massive subhalos and at lower redshifts. By directly measuring the relative orientation of the stellar matter and dark matter of subgroups, we find that, on average, the misalignment between the two components is larger for less massive subhalos. The mean misalignment angle varies from $\sim 30^{\circ}-10^{\circ}$ for $M \sim 10^{10} - 10^{14} {h^{-1}M_{\odot}}$ and shows a weak dependence on redshift. We also compare the misalignment angles in central and satellite subhalos at fixed subhalo mass, and find that centrals are more misaligned than satellites. We present fitting formulae for the shapes of dark and stellar matter in subhalos and also the probability distributions of misalignment angles.

preprint2014arXiv

Scaling relations between black holes and their host galaxies: comparing theoretical and observational measurements, and the impact of selection effects

We use the high-resolution simulation MassiveBlackII to examine scaling relations between black hole mass (MBH) and host galaxy properties (sigma, M*, and LV), finding good agreement with observational data, especially at the high-mass end. The simulations have less intrinsic scatter than observations, and the MBH-LV correlation has the largest scatter, suggesting it may be the least fundamental of the three relations. We find Gaussian scatter about all three relations, except among the highest mass galaxies, which host more massive black holes. Below z~2 the slopes for the full population remain roughly z-independent, and only steepen by 50% by z~4. The normalization of the sigma, LV relations evolve by 0.3, 0.43 dex, while the MBH correlation does not evolve to at least z~2. Testing for selection biases, we find samples selected by MBH or M* have steeper slopes than randomly selected samples. If unaccounted for, such a selection function would find faster evolution than inferred from a randomly selected sample, as objects at the high end of the relation tend to evolve more rapidly. We find a potential bias among high-LBH subsamples (tending to reside in higher mass galaxies), but these bright-AGN exhibit no intrinsic bias relative to fainter ones in equivalent-mass hosts, nor is there a significant difference between active- and inactive-samples. Finally we characterize the evolution of individual black holes along the scaling planes. Below the local relation, black holes grow faster than their host (72% of black holes 0.3 dex below the mean relation have a MBH-M* trajectory steeper than the local relation), while those above have shallower trajectories (only 14% are steeper than local). Black holes tend to grow faster than their hosts until surpassing the local relation, at which point their growth is suppressed while their hosts continue to grow, returning them to the mean relation.

preprint2014arXiv

The MassiveBlack-II Simulation: The Evolution of Halos and Galaxies to z~0

(Abridged for arXiv) We investigate the properties of halos, galaxies and blackholes to z=0 in the high resolution hydrodynamical simulation MassiveBlack-II (MBII) which evolves a LCDM cosmology in a comoving volume Vbox=100(Mpc/h)^3. MBII is the highest resolution simulation of this size which includes a self-consistent model for star formation, black hole accretion and associated feedback. We provide a simulation browser web application which enables interactive search and tagging of halos, subhalos and their properties and publicly release our galaxy catalogs. Our analysis of the halo mass function (MF) in MBII reveals that baryons have strong effects, with changes in the halo abundance of 20-35% below the knee of the MF (Mhalo < 10^13.2 Msun/h at z=0) when compared to fits based on dark matter only simulations. We provide a fitting function for the halo MF out to redshift z=11 and discuss how the onset of non-universality in the MF limits the accuracy of our fit. We study the halo occupation distribution and clustering of galaxies, in particular the evolution and scale dependence of stochasticity and bias finding reasonable agreement with observational data. The shape of the cosmic spectral energy distribution predicted by MBII is consistent with observations, but lower in amplitude. The Galaxy Stellar Mass Function (GSMF) is broadly consistent with observations at z>=2. At z<2, the population of passive low mass (for M*<10^9 Msun) galaxies in MBII makes the GSMF too steep compared to observations whereas at the high mass end (M*>10^11 Msun) galaxies hosting bright AGNs make significant contributions to the GSMF. The quasar bolometric luminosity function is also largely consistent with observations. We note however that more efficient AGN feedback (beyond simple thermal coupling used here) is likely necessary for the largest, rarest objects/clusters at low redshifts.

preprint2014arXiv

Where do galaxies end? Comparing measurement techniques of hydrodynamic-simulation galaxies' integrated properties

Using the suite of high-resolution zoom re-simulations of individual haloes by Martig et al., and the large-scale simulation \emph{MassiveBlack-II}, we examine the differences in measured galaxy properties from techniques with various aperture definitions of where galaxies end. We perform techniques popular in the literature and present a new technique of our own, where the aperture radius is based on the baryonic mass profiles of simulated (sub)haloes. For the average Milky-Way-mass system, we find the two most popular techniques in the literature return differences of order 30 per cent for stellar mass, a factor of 3 for gas mass, 40 per cent for star formation rate, and factors of several for gas accretion and ejection rates. Individual cases can show variations greater than this, with the severity dependent on the concentration of a given system. The average difference in integrated properties for a more general galaxy population are not as striking, but are still significant for stellar and gas mass, especially for optical-limit apertures. The large differences that can occur are problematic for comparing results from various publications. We stress the importance of both defining and justifying a technique choice and discourage using popular apertures that use an exact fraction of the virial radius, due to the unignorable variation in galaxy-to-(sub)halo size. Finally, we note that technique choice does not greatly affect simulated galaxies from lying within the scatter of observed scaling relations, but it can alter the derived best-fit slope for the Kennicutt-Schmidt relation.

preprint2013arXiv

Confronting predictions of the galaxy stellar mass function with observations at high-redshift

We investigate the evolution of the galaxy stellar mass function at high-redshift ($z\ge 5$) using a pair of large cosmological hydrodynamical simulations: {\em MassiveBlack} and {\em MassiveBlack-II}. By combining these simulations we can study the properties of galaxies with stellar masses greater than $10^{8}\,{\rm M_{\odot}}\,h^{-1}$ and (co-moving) number densities of $\log_{10}(ϕ\, [{\rm Mpc^{-3}\,dex^{-1}}\,h^{3}])>-8$. Observational determinations of the galaxy stellar mass function at very-high redshift typically assume a relation between the observed UV luminosity and stellar mass-to-light ratio which is applied to high-redshift samples in order to estimate stellar masses. This relation can also be measured from the simulations. We do this, finding two significant differences with the usual observational assumption: it evolves strongly with redshift and has a different shape. Using this relation to make a consistent comparison between galaxy stellar mass functions we find that at $z=6$ and above the simulation predictions are in good agreement with observed data over the whole mass range. Without using the correct UV luminosity and stellar mass-to-light ratio, the discrepancy would be up to two orders of magnitude for large galaxies $>10^{10}\,{\rm M_{\odot}}\,h^{-1}$. At $z=5$, however the stellar mass function for low mass $<10^{9}\,{\rm M_{\odot}}\,h^{-1}$ galaxies is overpredicted by factors of a few, consistent with the behaviour of the UV luminosity function, and perhaps a sign that feedback in the simulation is not efficient enough for these galaxies.

preprint2013arXiv

High redshift supermassive blackholes: accretion through cold flows

We use zoom-in techniques to re-simulate three high-redshift (z > 5.5) halos which host 10^9 solar mass blackholes from the ~ Gpc volume, MassiveBlack cosmological hydrodynamic simulation. We examine a number of factors potentially affecting supermassive blackhole growth at high redshift in cosmological simulations. These include numerical resolution, feedback prescriptions and formulation of smoothed particle hydrodynamics. We find that varying the size of the region over which feedback energy is deposited directly, either for fixed number of neighbours or fixed volume makes very little difference to the accretion history of blackholes. Changing mass resolution by factors of up to 64 also does not change the blackhole growth history significantly. We find that switching from the density-entropy formulation to the pressure-entropy formulation of smoothed particle hydrodynamics slightly increases the accretion rate onto blackholes. In general numerical details appear to have small effects on the main fueling mechanism for blackholes at these high redshifts. We examine the fashion by which this occurs, finding that the insensitivity to simulation technique seems to be a hallmark of the cold flow feeding picture of these high-z supermassive blackholes. We show that the gas that participates in critical accretion phases, in these massive objects at z > 6~7 is in all cases colder, denser, and forms more coherent streams than the average gas in the halo. This is also mostly the case when the blackhole accretion is feedback regulated (z < 6), however the distinction is less prominent. For our resimulated halos, cold flows appear to be a viable mechanism for forming the most massive blackholes in the early universe, occurring naturally in LambdaCDM models of structure formation. Not requiring fine tuning of numerical parameters, they seem to be physically inevitable in these objects.

preprint2013arXiv

Interpreting the observed UV continuum slopes of high-redshift galaxies

The observed UV continuum slope of star forming galaxies is strongly affected by the presence of dust. Its observation is then a potentially valuable diagnostic of dust attenuation, particularly at high-redshift where other diagnostics are currently inaccessible. Interpreting the observed UV continuum slope in the context of dust attenuation is often achieved assuming the empirically calibrated Meurer et al. (1999) relation. Implicit in this relation is the assumption of an intrinsic UV continuum slope ($β=-2.23$). However, results from numerical simulations suggest that the intrinsic UV continuum slopes of high-redshift star forming galaxies are bluer than this, and moreover vary with redshift. Using values of the intrinsic slope predicted by numerical models of galaxy formation combined with a Calzetti et al. (2000) reddening law we infer UV attenuations ($A_{1500}$) $0.35-0.5\,{\rm mag}$ ($A_{V}$: $0.14-0.2\,{\rm mag}$ assuming Calzetti et al. 2000) greater than simply assuming the Meurer relation. This has significant implications for the inferred amount of dust attenuation at very-high ($z\approx 7$) redshift given that current observational constraints on $β$, combined with the Meurer relation, suggest dust attenuation to be virtually zero in all but the most luminous systems.
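
As a rough illustration of the argument in this abstract, the sketch below assumes a linear attenuation law of the Meurer form, A = 1.99 (β_obs − β_int), whose zero point corresponds to the intrinsic slope β = −2.23 quoted above. The coefficient 1.99, the function names, and the example intrinsic slope of −2.43 are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only: a linear slope-to-attenuation law of the
# Meurer et al. (1999) form, A = SLOPE * (beta_obs - beta_int).
# SLOPE = 1.99 is an assumed coefficient; -2.23 is the intrinsic slope
# quoted in the abstract. Function names are hypothetical.

MEURER_SLOPE = 1.99
MEURER_INTRINSIC_BETA = -2.23


def uv_attenuation(beta_obs, beta_int=MEURER_INTRINSIC_BETA, slope=MEURER_SLOPE):
    """UV attenuation in magnitudes for an observed continuum slope.

    Negative values are clipped to zero, since a slope bluer than the
    assumed intrinsic slope cannot be produced by dust in this picture.
    """
    return max(0.0, slope * (beta_obs - beta_int))


def extra_attenuation(beta_int, slope=MEURER_SLOPE):
    """Additional attenuation inferred when the true intrinsic slope is
    beta_int rather than the Meurer zero point."""
    return slope * (MEURER_INTRINSIC_BETA - beta_int)


if __name__ == "__main__":
    # Hypothetical intrinsic slope, 0.2 bluer than the Meurer zero point.
    print(round(extra_attenuation(-2.43), 3))
    # A galaxy observed at the Meurer zero point: no dust under the
    # standard relation, but non-zero dust with the bluer intrinsic slope.
    print(uv_attenuation(-2.23), round(uv_attenuation(-2.23, beta_int=-2.43), 3))
```

Under these assumptions, an intrinsic slope bluer by 0.2 raises the inferred attenuation by about 0.4 mag, which falls inside the 0.35 to 0.5 mag range quoted in the abstract.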

preprint2013arXiv

Theoretical predictions for the effect of nebular emission on the broad band photometry of high-redshift galaxies

By combining optical and near-IR observations from the Hubble Space Telescope with NIR photometry from the Spitzer Space Telescope it is possible to measure the rest-frame UV-optical colours of galaxies at z=4-8. The UV-optical spectral energy distribution of star formation dominated galaxies is the result of several different factors. These include the joint distribution of stellar masses, ages, and metallicities, and the subsequent reprocessing by dust and gas in the ISM. Using a large cosmological hydrodynamical simulation we investigate the predicted spectral energy distributions of galaxies at high-redshift with a particular emphasis on assessing the potential contribution of nebular emission. We find that the average pure stellar UV-optical colour correlates with both luminosity and redshift such that galaxies at lower-redshift and higher-luminosity are typically redder. Assuming the escape fraction of ionising photons is close to zero, the effect of nebular emission is to redden the UV-optical 1500-V_w colour by, on average, 0.4 mag at z=8 declining to 0.25 mag at z=4. Young and low-metallicity stellar populations, which typically have bluer pure stellar UV-optical colours, produce larger ionising luminosities and are thus more strongly affected by the reddening effects of nebular emission. This causes the distribution of 1500-V_w colours to narrow and the trends with luminosity and redshift to weaken. The strong effect of nebular emission leaves observed-frame colours critically sensitive to the source redshift. For example, increasing the redshift by 0.1 can result in observed frame colours changing by up to ~0.6. These predictions reinforce the need to include nebular emission when modelling the spectral energy distributions of galaxies at high-redshift and also highlight the difficulty in interpreting the observed colours of individual galaxies without precise redshifts.

preprint2012arXiv

Growth and anisotropy of ionization fronts near high redshift quasars in the MassiveBlack simulation

We use radiative transfer to study the growth of ionized regions around the brightest, z=8 quasars in a large cosmological hydrodynamic simulation that includes black hole growth and feedback (the MassiveBlack simulation). We find that in the presence of the quasars the comoving HII bubble radii reach 10 Mpc/h after 20 My while with the stellar component alone the HII bubbles are smaller by at least an order of magnitude. Our calculations show that several features are not captured within an analytical growth model of Stromgren spheres. The X-ray photons from hard quasar spectra drive a smooth transition from fully neutral to partially neutral in the ionization front. However the transition from partially neutral to fully ionized is significantly more complex. We measure the distance to the edge of bubbles as a function of angle and use the standard deviation of these distances as a diagnostic of the isotropy of ionized regions. We find that the overlapping of nearby ionized regions from clustered halos not only increases the anisotropy, but also is the main mechanism which allows the outer radius to grow. We therefore predict that quasar ionized bubbles at this early stage in the reionization process should be both significantly larger and more irregularly shaped than bubbles around star forming galaxies. Before the star formation rate increases and the Universe fully reionizes, quasar bubbles will form the most striking and recognizable features in 21cm maps.

preprint2011arXiv

Cold flows and the first quasars

Observations of the most distant bright quasars imply that billion solar mass supermassive black holes (SMBH) have to be assembled within the first eight hundred million years. Under our standard galaxy formation scenario such fast growth implies large gas densities providing sustained accretion at critical or supercritical rates onto an initial black hole seed. It has been a long standing question whether and how such high black hole accretion rates can be achieved and sustained at the centers of early galaxies. Here we use our new cosmological hydrodynamic simulation (MassiveBlack) covering a volume (0.75 Gpc)^3 appropriate for studying the rare first quasars to show that steady high density cold gas flows responsible for assembling the first galaxies produce the high gas densities that lead to sustained critical accretion rates and hence rapid growth commensurate with the existence of ~10^9 solar mass black holes as early as z~7. We find that under these conditions quasar feedback is not effective at stopping the cold gas from penetrating the central regions and hence cannot quench the accretion until the host galaxy reaches M_halo > 10^{12} solar masses. This cold-flow driven scenario for the formation of quasars implies that they should be ubiquitous in galaxies in the early universe and that major (proto)galaxy mergers are not a requirement for efficient fuel supply and growth, particularly for the earliest SMBHs.

preprint2011arXiv

Terapixel imaging of cosmological simulations

The increasing size of cosmological simulations has led to the need for new visualization techniques. We focus on Smoothed Particle Hydrodynamical (SPH) simulations run with the GADGET code and describe methods for visually accessing the entire simulation at full resolution. The simulation snapshots are rastered and processed on supercomputers into images that are ready to be accessed through a web interface (GigaPan). This allows any scientist with a web-browser to interactively explore simulation datasets in both spatial and temporal dimensions, datasets which in their native format can be hundreds of terabytes in size or more. We present two examples, the first a static terapixel image of the MassiveBlack simulation, a P-GADGET SPH simulation with 65 billion particles, and the second an interactively zoomable animation of a different simulation with more than one thousand frames, each a gigapixel in size. Both are available for public access through the GigaPan web interface. We also make our imaging software publicly available.

preprint2011arXiv

The Formation of Galaxies Hosting z~6 Quasars

We investigate the formation and properties of galaxies hosting z~6 quasars in the gigaparsec-scale cosmological hydrodynamical simulation MassiveBlack, which includes a self-consistent model for star formation, black hole accretion and associated feedback. We show that MassiveBlack reproduces current estimates of the galaxy stellar mass function at z=5, 6. We find that quasar hosts in the simulation are compact gas-rich systems with high star formation rates of SFR ~ 100-1000 Msun/yr, consistent with observed properties of Sloan quasar hosts in the redshift range 5.5 < z < 6.5. We show that the star-forming gas in these galaxies predominantly originates from high-density cold streams which are able to penetrate the halo and grow the galaxy at the center. MassiveBlack predicts a deviation from the local Mbh-sigma and Mbh-Mstar relations, implying that black holes are relatively more massive for a given stellar host at these redshifts.

preprint2011arXiv

Transaction fees and optimal rebalancing in the growth-optimal portfolio

The growth-optimal portfolio optimization strategy pioneered by Kelly is based on constant portfolio rebalancing, which makes it sensitive to transaction fees. We examine the effect of fees on an example of a risky asset with a binary return distribution and show that the fees may give rise to an optimal period of portfolio rebalancing. The optimal period is found analytically in the case of lognormal returns. This result is subsequently generalized and numerically verified for broad return distributions and returns generated by a GARCH process. Finally, we study the case when investment is rebalanced only partially and show that this strategy can improve the investment long-term growth rate more than optimization of the rebalancing period.

67 published items

Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis

Automated Transpilation of Imperative to Functional Code using Neural-Guided Program Synthesis (Extended Version)

Crescent: Taming Memory Irregularities for Accelerating Deep Point Cloud Analytics

Determination of building flood risk maps from LiDAR mobile mapping data

FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis

Injecting Numerical Reasoning Skills into Knowledge Base Question Answering Models

Real-Time Gaze Tracking with Event-Driven Eye Segmentation

Storage capacity of networks with discrete synapses and sparsely encoded memories

The ASTRID Simulation: Galaxy Formation and Reionization

The ASTRID simulation: the evolution of Supermassive Black Holes

The BlueTides Mock Image Catalogue: Simulated observations of high-redshift galaxies and predictions for JWST imaging surveys

The DESI $N$-body Simulation Project -- II. Suppressing sample variance with fast simulations

The Impact of Dust on the Sizes of Galaxies in the Epoch of Reionization

A fast particle-mesh simulation of non-linear cosmological structure formation with massive neutrinos

Falx: Synthesis-Powered Visualization Authoring

Massive Black Hole Mergers with Orbital Information: Predictions from the ASTRID Simulation

Phases of learning dynamics in artificial neural networks: with or without mislabeled data

The DESI $N$-body Simulation Project I: Testing the Robustness of Simulations for the DESI Dark Time Survey

Cosmic variance of $z>7$ galaxies: Prediction from BlueTides

Exclusive quarkonium production or decay in soft gluon factorization

How neural networks find generalizable solutions: Self-tuned annealing in deep learning

Imaging Systematics and Clustering of DESI Main Targets

Large scale simulations of H and He reionization and heating driven by stars and more energetic sources

Mesorasi: Architecture Support for Point Cloud Analytics via Delayed-Aggregation

More accurate simulations with separate initial conditions for baryons and dark matter

Nebular Line Emission During the Epoch of Reionization

Phase separation in the advective Cahn-Hilliard equation

Real-Time Spatio-Temporal LiDAR Point Cloud Compression

Singular hyperbolic metrics and negative subharmonic functions

The complete study on polarization of $Υ(nS)$ hadroproduction at QCD next-to-leading order

The early growth of supermassive black holes in cosmological hydrodynamic simulations with constrained Gaussian realizations

High mass and halo resolution from fast low resolution simulations

Neutron Spin Resonance in the Heavily Hole-doped KFe$_{2}$As$_{2}$ Superconductor

QSO obscuration at high redshift ($z \gtrsim 7$): Predictions from the BlueTides simulation

SciPy 1.0--Fundamental Algorithms for Scientific Computing in Python

Component-based Synthesis of Table Consolidation and Transformation Tasks from Examples

Forecasts for the WFIRST High Latitude Survey using the BlueTides Simulation

Monsters in the Dark: Predictions for Luminous Galaxies in the Early Universe from the BlueTides Simulation

On Energy Efficiency of the Nearest-Neighbor Cooperative Communication in Heterogeneous Networks

Perturbation theory, effective field theory, and oscillations in the power spectrum

The DESI Experiment Part I: Science, Targeting, and Survey Design

The DESI Experiment Part II: Instrument Design

The Lyman-continuum photon production efficiency in the high-redshift Universe

The Photometric Properties of Galaxies in the Early Universe

Transition from sign-reversed to sign-preserved Cooper-pairing symmetry in sulfur-doped iron selenide superconductors

Type-Directed Code Reuse using Integer Linear Programming

An Updated Study for $Υ$ Production and Polarization at the Tevatron and LHC

Energy Dependence of Direct-Quarkonium Production in pp Collisions from Fixed-Target to LHC Energies: Complete One-Loop Analysis

Intrinsic alignments of galaxies in the MassiveBlack-II simulation: analysis of two-point statistics

Luminosity function of [OII] emission-line galaxies in the MassiveBlack-II simulation

Mock Quasar-Lyman-α Forest Data-sets for the SDSS-III Baryon Oscillation Spectroscopic Survey

Structural and Magnetic Phase Diagram of CrAs and its Relationship with Pressure-induced Superconductivity

The BlueTides Simulation: First Galaxies and Reionization

The formation of Milky Way-mass disk galaxies in the first 500 million years of a cold dark matter universe

Galaxy Shapes and Intrinsic Alignments in The MassiveBlack-II Simulation

Scaling relations between black holes and their host galaxies: comparing theoretical and observational measurements, and the impact of selection effects

The MassiveBlack-II Simulation: The Evolution of Halos and Galaxies to z~0

Where do galaxies end? Comparing measurement techniques of hydrodynamic-simulation galaxies' integrated properties

Confronting predictions of the galaxy stellar mass function with observations at high-redshift

High redshift supermassive blackholes: accretion through cold flows

Interpreting the observed UV continuum slopes of high-redshift galaxies

Theoretical predictions for the effect of nebular emission on the broad band photometry of high-redshift galaxies

Growth and anisotropy of ionization fronts near high redshift quasars in the MassiveBlack simulation

Cold flows and the first quasars

Terapixel imaging of cosmological simulations

The Formation of Galaxies Hosting z~6 Quasars

Transaction fees and optimal rebalancing in the growth-optimal portfolio