Source author record

Robert J. Brunner

Robert J. Brunner appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

astro-ph astro-ph.CO astro-ph.IM astro-ph.GA astro-ph.EP astro-ph.SR Computer Vision cs.CY Machine Learning physics.ed-ph stat.OT

Catalog footprint

What is connected

25 works

11 topics

4 close collaborators

preprint 2020 arXiv

Extended Isolation Forest

We present an extension to the model-free anomaly detection algorithm, Isolation Forest. This extension, named Extended Isolation Forest (EIF), resolves issues with assignment of anomaly score to given data points. We motivate the problem using heat maps for anomaly scores. These maps suffer from artifacts generated by the criteria for branching operation of the binary tree. We explain this problem in detail and demonstrate the mechanism by which it occurs visually. We then propose two different approaches for improving the situation. First we propose transforming the data randomly before creation of each tree, which results in averaging out the bias. Second, which is the preferred way, is to allow the slicing of the data to use hyperplanes with random slopes. This approach results in remedying the artifact seen in the anomaly score heat maps. We show that the robustness of the algorithm is much improved using this method by looking at the variance of scores of data points distributed along constant level sets. We report AUROC and AUPRC for our synthetic datasets, along with real-world benchmark datasets. We find no appreciable difference in the rate of convergence nor in computation time between the standard Isolation Forest and EIF.
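The preferred approach in the abstract above, slicing the data with hyperplanes of random slope, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the standard-normal draw for the hyperplane normal, the uniform draw of the intercept inside the bounding box, and the toy data are all assumptions made for illustration. A shorter mean isolation depth marks a point as more anomalous.

```python
import random


def split_by_random_hyperplane(points, rng):
    """Partition points with one hyperplane whose normal has a random direction."""
    dim = len(points[0])
    # Random slope (assumption): each normal component is a standard-normal draw.
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Random intercept (assumption): uniform inside the bounding box of the points.
    intercept = [
        rng.uniform(min(p[d] for p in points), max(p[d] for p in points))
        for d in range(dim)
    ]

    def on_left(p):
        return sum((p[d] - intercept[d]) * normal[d] for d in range(dim)) <= 0.0

    left = [p for p in points if on_left(p)]
    right = [p for p in points if not on_left(p)]
    return on_left, left, right


def isolation_depth(query, points, rng, max_depth=10):
    """Count the random splits needed before the query's branch holds at most one point."""
    depth = 0
    while len(points) > 1 and depth < max_depth:
        on_left, left, right = split_by_random_hyperplane(points, rng)
        points = left if on_left(query) else right
        depth += 1
    return depth


def mean_depth(query, points, rng, n_trees=200):
    """Average the isolation depth over many independently grown random branches."""
    return sum(isolation_depth(query, points, rng) for _ in range(n_trees)) / n_trees


# Toy data (made up): a 2-D Gaussian cloud, scored at its centre and far outside it.
rng = random.Random(0)
cloud = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(64)]
print(mean_depth((8.0, 8.0), cloud, rng), mean_depth((0.0, 0.0), cloud, rng))
```

The sketch follows only the path of the query point instead of storing whole trees, which keeps it short. A full implementation would build each tree once on a subsample and reuse it for every scored point.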

preprint 2016 arXiv

Machine Learning and Cosmological Simulations II: Hydrodynamical Simulations

We extend a machine learning (ML) framework presented previously to model galaxy formation and evolution in a hierarchical universe using N-body + hydrodynamical simulations. In this work, we show that ML is a promising technique to study galaxy formation in the backdrop of a hydrodynamical simulation. We use the Illustris Simulation to train and test various sophisticated machine learning algorithms. By using only essential dark matter halo physical properties and no merger history, our model predicts the gas mass, stellar mass, black hole mass, star formation rate, $g-r$ color, and stellar metallicity fairly robustly. Our results provide a unique and powerful phenomenological framework to explore the galaxy-halo connection that is built upon a solid hydrodynamical simulation. The promising reproduction of the listed galaxy properties demonstrably places ML as a promising and a significantly more computationally efficient tool to study small-scale structure formation. We find that ML mimics a full-blown hydrodynamical simulation surprisingly well in a computation time of mere minutes. The population of galaxies simulated by ML, while not numerically identical to Illustris, is statistically and physically robust and follows the same fundamental observational constraints. Machine learning offers an intriguing and promising technique to create quick mock galaxy catalogs in the future.

preprint 2016 arXiv

Star-galaxy Classification Using Deep Convolutional Neural Networks

Most existing star-galaxy classifiers use the reduced summary information from catalogs, requiring careful feature extraction and selection. The latest advances in machine learning that use deep convolutional neural networks allow a machine to automatically learn the features directly from data, minimizing the need for input from human experts. We present a star-galaxy classification framework that uses deep convolutional neural networks (ConvNets) directly on the reduced, calibrated pixel values. Using data from the Sloan Digital Sky Survey (SDSS) and the Canada-France-Hawaii Telescope Lensing Survey (CFHTLenS), we demonstrate that ConvNets are able to produce accurate and well-calibrated probabilistic classifications that are competitive with conventional machine learning techniques. Future advances in deep learning may bring more success with current and forthcoming photometric surveys, such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope (LSST), because deep neural networks require very little manual feature engineering.

preprint 2016 arXiv

Teaching Data Science

We describe an introductory data science course, entitled Introduction to Data Science, offered at the University of Illinois at Urbana-Champaign. The course introduced general programming concepts by using the Python programming language with an emphasis on data preparation, processing, and presentation. The course had no prerequisites, and students were not expected to have any programming experience. This introductory course was designed to cover a wide range of topics, from the nature of data, to storage, to visualization, to probability and statistical analysis, to cloud and high performance computing, without becoming overly focused on any one subject. We conclude this article with a discussion of lessons learned and our plans to develop new data science courses.

preprint 2015 arXiv

A Hybrid Ensemble Learning Approach to Star-Galaxy Classification

There exist a variety of star-galaxy classification techniques, each with their own strengths and weaknesses. In this paper, we present a novel meta-classification framework that combines and fully exploits different techniques to produce a more robust star-galaxy classification. To demonstrate this hybrid, ensemble approach, we combine a purely morphological classifier, a supervised machine learning method based on random forest, an unsupervised machine learning method based on self-organizing maps, and a hierarchical Bayesian template fitting method. Using data from the CFHTLenS survey, we consider different scenarios: when a high-quality training set is available with spectroscopic labels from DEEP2, SDSS, VIPERS, and VVDS, and when the demographics of sources in a low-quality training set do not match the demographics of objects in the test data set. We demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method in these scenarios. Thus, strategies that combine the predictions of different classifiers may prove to be optimal in currently ongoing and forthcoming photometric surveys, such as the Dark Energy Survey and the Large Synoptic Survey Telescope.

preprint 2015 arXiv

Creating updated, scientifically-calibrated mosaic images for the RC3 catalogue

The Third Reference Catalogue of Bright Galaxies (RC3) is a reasonably complete listing of 23,011 nearby, large, bright galaxies. By using the final imaging data release from the Sloan Digital Sky Survey, we generate scientifically-calibrated FITS mosaics by using the montage program for all SDSS imaging bands for all RC3 galaxies that lie within the survey footprint. We further combine the SDSS g, r, and i band FITS mosaics for these galaxies to create color-composite images by using the STIFF program. We generalized this software framework to make FITS mosaics and color-composite images for an arbitrary catalog and imaging data set. Due to positional inaccuracies inherent in the RC3 catalog, we employ a recursive algorithm in our mosaicking pipeline that first determines the correct location for each galaxy, and subsequently applies the mosaicking procedure. As an additional test of this new software pipeline and to obtain mosaic images of a larger sample of RC3 galaxies, we also applied this pipeline to photographic data taken by the Second Palomar Observatory Sky Survey with $B_J$, $R_F$, and $I_N$ plates. We publicly release all generated data, accessible via a web search form, and the software pipeline to enable others to make galaxy mosaics by using other catalogs or surveys.

preprint 2015 arXiv

Machine Learning and Cosmological Simulations I: Semi-Analytical Models

We present a new exploratory framework to model galaxy formation and evolution in a hierarchical universe by using machine learning (ML). Our motivations are two-fold: (1) presenting a new, promising technique to study galaxy formation, and (2) quantitatively analyzing the extent of the influence of dark matter halo properties on galaxies in the backdrop of semi-analytical models (SAMs). We use the influential Millennium Simulation and the corresponding Munich SAM to train and test various sophisticated machine learning algorithms (k-Nearest Neighbors, decision trees, random forests and extremely randomized trees). By using only essential dark matter halo physical properties for haloes of $M>10^{12} M_{\odot}$ and a partial merger tree, our model predicts the hot gas mass, cold gas mass, bulge mass, total stellar mass, black hole mass and cooling radius at z = 0 for each central galaxy in a dark matter halo for the Millennium run. Our results provide a unique and powerful phenomenological framework to explore the galaxy-halo connection that is built upon SAMs and demonstrably place ML as a promising and a computationally efficient tool to study small-scale structure formation.

preprint 2013 arXiv

Narrow absorption line variability in repeat quasar observations from the Sloan Digital Sky Survey

We present the results from a time domain study of absorption lines detected in quasar spectra with repeat observations from the Sloan Digital Sky Survey Data Release 7 (SDSS DR7). Beginning with over 4500 unique time separation baselines of various absorption line species identified in the SDSS DR7 quasar spectra, we create a catalogue of 2522 quasar absorption line systems with two to eight repeat observations, representing the largest collection of unbiased and homogeneous multi-epoch absorption systems ever published. To investigate these systems for time variability of narrow absorption lines, we refine this sample based on the reliability of the system detection, the proximity of pixels with bright sky contamination to individual absorption lines, and the quality of the continuum fit. Variability measurements of this sub-sample based on the absorption line equivalent widths yield a total of 33 systems with indications of significantly variable absorption strengths on time-scales ranging from one day to several years in the rest frame of the absorption system. Of these, at least 10 are from a class known as intervening absorption systems caused by foreground galaxies along the line of sight to the background quasar. This is the first evidence of possible absorption line variability detected in intervening systems, and their short time-scale variations suggest that small-scale structures (~10-100 au) are likely to exist in their host foreground galaxies.

preprint 2010 arXiv

Data Mining and Machine Learning in Astronomy

We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black-box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those where data mining techniques directly resulted in improved science, and important current and future directions, including probability density functions, parallel algorithms, petascale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm, and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box.

preprint 2010 arXiv

Evolution of the Clustering of Photometrically Selected SDSS Galaxies

We measure the angular auto-correlation functions (w) of SDSS galaxies selected to have photometric redshifts 0.1 < z < 0.4 and absolute r-band magnitudes Mr < -21.2. We split these galaxies into five overlapping redshift shells of width 0.1 and measure w in each subsample in order to investigate the evolution of SDSS galaxies. We find that the bias increases substantially with redshift - much more so than one would expect for a passively evolving sample. We use halo-model analysis to determine the best-fit halo-occupation-distribution (HOD) for each subsample, and the best-fit models allow us to interpret the change in bias physically. In order to properly interpret our best-fit HODs, we convert each halo mass to its z = 0 passively evolved bias (bo), enabling a direct comparison of the best-fit HODs at different redshifts. We find that the minimum halo bo required to host a galaxy decreases as the redshift decreases, suggesting that galaxies with Mr < -21.2 are forming in halos at the low-mass end of the HODs over our redshift range. We use the best-fit HODs to determine the change in occupation number divided by the change in mass of halos with constant bo and we find a sharp peak at bo ~ 0.9 - corresponding to an average halo mass of ~ 10^12Msol/h. We thus present the following scenario: the bias of galaxies with Mr < -21.2 decreases as the Universe evolves because these galaxies form in halos of mass ~ 10^12Msol/h (independent of redshift), and the bias of these halos naturally decreases as the Universe evolves.

preprint 2009 arXiv

A Cross-Correlation Analysis of Mg II Absorption Line Systems and Luminous Red Galaxies from the SDSS DR5

We analyze the cross-correlation of 2,705 unambiguously intervening Mg II (2796,2803A) quasar absorption line systems with 1,495,604 luminous red galaxies (LRGs) from the Fifth Data Release of the Sloan Digital Sky Survey within the redshift range 0.36<=z<=0.8. We confirm with high precision a previously reported weak anti-correlation of equivalent width and dark matter halo mass, measuring the average masses to be log M_h(M_[solar]h^-1)=11.29 [+0.36,-0.62] and log M_h(M_[solar]h^-1)=12.70 [+0.53,-1.16] for systems with W[2796A]>=1.4A and 0.8A<=W[2796A]<1.4A, respectively. Additionally, we investigate the significance of a number of potential sources of bias inherent in absorber-LRG cross-correlation measurements, including absorber velocity distributions and the weak lensing of background quasars, which we determine is capable of producing a 20-30% bias in angular cross-correlation measurements on scales less than 2'. We measure the Mg II - LRG cross-correlation for 719 absorption systems with v<60,000 km s^-1 in the quasar rest frame and find that these associated absorbers typically reside in dark matter haloes that are ~10-100 times more massive than those hosting unambiguously intervening Mg II absorbers. Furthermore, we find evidence for evolution of the redshift number density, dN/dz, with 2-sigma significance for the strongest (W>2.0A) absorbers in the DR5 sample. This width-dependent dN/dz evolution does not significantly affect the recovered equivalent width-halo mass anti-correlation and adds to existing evidence that the strongest Mg II absorption systems are correlated with an evolving population of field galaxies at z<0.8, while the non-evolving dN/dz of the weakest absorbers more closely resembles that of the LRG population.

preprint 2009 arXiv

Clustering of Low-Redshift (z <= 2.2) Quasars from the Sloan Digital Sky Survey

We present measurements of the quasar two-point correlation function, ξ_{Q}, over the redshift range z=0.3-2.2 based upon data from the SDSS. Using a homogeneous sample of 30,239 quasars with spectroscopic redshifts from the DR5 Quasar Catalogue, our study represents the largest sample used for this type of investigation to date. With this redshift range and an areal coverage of approx 4,000 deg^2, we sample over 25 h^-3 Gpc^3 (comoving) assuming the current LCDM cosmology. Over this redshift range, we find that the redshift-space correlation function, xi(s), is adequately fit by a single power-law, with s_{0}=5.95+/-0.45 h^-1 Mpc and γ_{s}=1.16+0.11-0.16 when fit over s=1-25 h^-1 Mpc. Using the projected correlation function we calculate the real-space correlation length, r_{0}=5.45+0.35-0.45 h^-1 Mpc and γ=1.90+0.04-0.03, over scales of rp=1-130 h^-1 Mpc. Dividing the sample into redshift slices, we find very little, if any, evidence for the evolution of quasar clustering, with the redshift-space correlation length staying roughly constant at s_{0} ~ 6-7 h^-1 Mpc at z<2.2 (and only increasing at redshifts greater than this). Comparing our clustering measurements to those reported for X-ray selected AGN at z=0.5-1, we find reasonable agreement in some cases but significantly lower correlation lengths in others. We find that the linear bias evolves from b~1.4 at z=0.5 to b~3 at z=2.2, with b(z=1.27)=2.06+/-0.03 for the full sample. We compare our data to analytical models and infer that quasars inhabit dark matter haloes of constant mass M ~2 x 10^12 h^-1 M_Sol from redshifts z~2.5 (the peak of quasar activity) to z~0. [ABRIDGED]
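As a small illustration of the single power-law fit quoted above, the sketch below evaluates the conventional form xi(s) = (s / s0)^(-gamma) with the reported redshift-space values s0 = 5.95 h^-1 Mpc and gamma_s = 1.16. The abstract does not write out the functional form, so this conventional parameterisation is an assumption here, and the function name is hypothetical.

```python
def xi_power_law(s, s0, gamma):
    """Power-law correlation function xi(s) = (s / s0) ** (-gamma)."""
    return (s / s0) ** (-gamma)


S0 = 5.95       # redshift-space correlation length in h^-1 Mpc, from the abstract
GAMMA_S = 1.16  # redshift-space power-law slope, from the abstract

# Evaluate across the fitted range s = 1-25 h^-1 Mpc.
for s in (1.0, 5.95, 25.0):
    print(s, xi_power_law(s, S0, GAMMA_S))
```

By construction xi equals 1 at s = s0, which is what makes s0 a convenient measure of clustering strength.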

preprint 2009 arXiv

Eight-Dimensional Mid-Infrared/Optical Bayesian Quasar Selection

We explore the multidimensional, multiwavelength selection of quasars from mid-IR (MIR) plus optical data, specifically from Spitzer-IRAC and the Sloan Digital Sky Survey (SDSS). We apply modern statistical techniques to combined Spitzer MIR and SDSS optical data, allowing up to 8-D color selection of quasars. Using a Bayesian selection method, we catalog 5546 quasar candidates to an 8.0 um depth of 56 uJy over an area of ~24 sq. deg; ~70% of these candidates are not identified by applying the same Bayesian algorithm to 4-color SDSS optical data alone. Our selection recovers 97.7% of known type 1 quasars in this area and greatly improves the effectiveness of identifying 3.5<z<5 quasars. Even using only the two shortest wavelength IRAC bandpasses, it is possible to use our Bayesian techniques to select quasars with 97% completeness and as little as 10% contamination. This sample has a photometric redshift accuracy of 93.6% (Delta Z +/-0.3), remaining roughly constant when the two reddest MIR bands are excluded. While our methods are designed to find type 1 (unobscured) quasars, as many as 1200 of the objects are type 2 (obscured) quasar candidates. Coupling deep optical imaging data with deep mid-IR data could enable selection of quasars in significant numbers past the peak of the quasar luminosity function (QLF) to at least z~4. Such a sample would constrain the shape of the QLF and enable quasar clustering studies over the largest range of redshift and luminosity to date, yielding significant gains in our understanding of quasars and the evolution of galaxies.

preprint 2009 arXiv

Halo-model Analysis of the Clustering of Photometrically Selected Galaxies from SDSS

We measure the angular 2-point correlation functions of galaxies in a volume limited, photometrically selected galaxy sample from the fifth data release of the Sloan Digital Sky Survey. We split the sample both by luminosity and galaxy type and use a halo-model analysis to find halo-occupation distributions that can simultaneously model the clustering of all, early-, and late-type galaxies in a given sample. Our results for the full galaxy sample are generally consistent with previous results using the SDSS spectroscopic sample, taking the differences between the median redshifts of the photometric and spectroscopic samples into account. We find that our early- and late- type measurements cannot be fit by a model that allows early- and late-type galaxies to be well-mixed within halos. Instead, we introduce a new model that segregates early- and late-type galaxies into separate halos to the maximum allowed extent. We determine that, in all cases, it provides a good fit to our data and thus provides a new statistical description of the manner in which early- and late-type galaxies occupy halos.

preprint 2009 arXiv

LSST Science Book, Version 2.0

A survey that can cover the sky in optical bands over wide fields to faint magnitudes with a fast cadence will enable many of the exciting science opportunities of the next decade. The Large Synoptic Survey Telescope (LSST) will have an effective aperture of 6.7 meters and an imaging camera with field of view of 9.6 deg^2, and will be devoted to a ten-year imaging survey over 20,000 deg^2 south of +15 deg. Each pointing will be imaged 2000 times with fifteen second exposures in six broad bands from 0.35 to 1.1 microns, to a total point-source depth of r~27.5. The LSST Science Book describes the basic parameters of the LSST hardware, software, and observing plans. The book discusses educational and outreach opportunities, then goes on to describe a broad range of science that LSST will revolutionize: mapping the inner and outer Solar System, stellar populations in the Milky Way and nearby galaxies, the structure of the Milky Way disk and halo and other objects in the Local Volume, transient and variable objects both at low and high redshift, and the properties of normal and active galaxies at low and high redshift. It then turns to far-field cosmological topics, exploring properties of supernovae to z~1, strong and weak lensing, the large-scale distribution of galaxies and baryon oscillations, and how these different probes may be combined to constrain cosmological models and the physics of dark energy.

preprint 2008 arXiv

Efficient Photometric Selection of Quasars from the Sloan Digital Sky Survey: II. ~1,000,000 Quasars from Data Release Six

We present a catalog of 1,172,157 quasar candidates selected from the photometric imaging data of the Sloan Digital Sky Survey (SDSS). The objects are all point sources to a limiting magnitude of i=21.3 from 8417 sq. deg. of imaging from SDSS Data Release 6 (DR6). This sample extends our previous catalog by using the latest SDSS public release data and probing both UV-excess and high-redshift quasars. While the addition of high-redshift candidates reduces the overall efficiency (quasars:quasar candidates) of the catalog to ~80%, it is expected to contain no fewer than 850,000 bona fide quasars -- ~8 times the number of our previous sample, and ~10 times the size of the largest spectroscopic quasar catalog. Cross-matching between our photometric catalog and spectroscopic quasar catalogs from both the SDSS and 2dF Surveys, yields 88,879 spectroscopically confirmed quasars. For judicious selection of the most robust UV-excess sources (~500,000 objects in all), the efficiency is nearly 97% -- more than sufficient for detailed statistical analyses. The catalog's completeness to type 1 (broad-line) quasars is expected to be no worse than 70%, with most missing objects occurring at z<0.7 and 2.5<z<3.0. In addition to classification information, we provide photometric redshift estimates (typically good to Delta z +/- 0.3 [2 sigma]) and cross-matching with radio, X-ray, and proper motion catalogs. Finally, we consider the catalog's utility for determining the optical luminosity function of quasars and are able to confirm the flattening of the bright-end slope of the quasar luminosity function at z~4 as compared to z~2.

preprint 2008 arXiv

Normalization of the Matter Power Spectrum via Higher-Order Angular Correlations of Luminous Red Galaxies

We present a novel technique to measure $σ_8$, by measuring the dependence of the second-order bias of a density field on $σ_8$ using two separate techniques. Each technique employs area-averaged angular correlation functions ($\bar{ω}_N$), one relying on the shape of $\bar{ω}_2$, the other relying on the amplitude of $s_3$ ($s_3 = \bar{ω}_3/\bar{ω}_2^2$). We confirm the validity of the method by testing it on a mock catalog drawn from Millennium Simulation data and finding $σ_8^{measured}- σ_8^{true} = -0.002 \pm 0.062$. We create a catalog of photometrically selected LRGs from SDSS DR5 and separate it into three distinct data sets by photometric redshift, with median redshifts of 0.47, 0.53, and 0.61. Measurements of $c_2$ and $σ_8$ are made for each data set, assuming flat geometry and WMAP3 best-fit priors on $Ω_m$, $h$, and $Γ$. We find, with increasing redshift, $c_2 = 0.09 \pm 0.04$, $0.09 \pm 0.05$, and $0.09 \pm 0.03$ and $σ_8 = 0.78 \pm 0.08$, $0.80 \pm 0.09$, and $0.80 \pm 0.09$. We combine these three consistent $σ_8$ measurements to produce the result $σ_8 = 0.79 \pm 0.05$. Allowing the parameters $Ω_m$, $h$, and $Γ$ to vary within their WMAP3 1$σ$ error, we find that the best-fit $σ_8$ does not change by more than 8% and we are thus confident our measurement is accurate to within 10%. We anticipate that future surveys, such as Pan-STARRS, DES, and LSST, will be able to employ this method to measure $σ_8$ to great precision, and will serve as an important complementary check on the values determined via more established methods.
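The abstract does not say how the three $σ_8$ measurements were combined. One common choice, assumed here purely for illustration, is an inverse-variance weighted mean of independent Gaussian measurements. The sketch below applies it to the three quoted values; the function name is hypothetical.

```python
import math


def combine_inverse_variance(values, errors):
    """Inverse-variance weighted mean and its 1-sigma error (assumes independent Gaussian errors)."""
    weights = [1.0 / err ** 2 for err in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)


# The three sigma_8 values and errors quoted in the abstract, in order of increasing redshift.
mean, err = combine_inverse_variance([0.78, 0.80, 0.80], [0.08, 0.09, 0.09])
print(round(mean, 2), round(err, 2))  # prints 0.79 0.05
```

Under this assumption the result agrees with the quoted $σ_8 = 0.79 \pm 0.05$ to the printed precision, though the authors may have used a different procedure.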

preprint 2008 arXiv

Quasar Clustering from SDSS DR5: Dependences on Physical Properties

Using a homogeneous sample of 38,208 quasars with a sky coverage of $4000 {\rm deg^2}$ drawn from the SDSS Data Release Five quasar catalog, we study the dependence of quasar clustering on luminosity, virial black hole mass, quasar color, and radio loudness. At $z<2.5$, quasar clustering depends weakly on luminosity and virial black hole mass, with typical uncertainty levels $\sim 10\%$ for the measured correlation lengths. These weak dependences are consistent with models in which substantial scatter between quasar luminosity, virial black hole mass and the host dark matter halo mass has diluted any clustering difference, where halo mass is assumed to be the relevant quantity that best correlates with clustering strength. However, the most luminous and most massive quasars are more strongly clustered (at the $\sim 2σ$ level) than the remainder of the sample, which we attribute to the rapid increase of the bias factor at the high-mass end of host halos. We do not observe a strong dependence of clustering strength on quasar colors within our sample. On the other hand, radio-loud quasars are more strongly clustered than are radio-quiet quasars matched in redshift and optical luminosity (or virial black hole mass), consistent with local observations of radio galaxies and radio-loud type 2 AGN. Thus radio-loud quasars reside in more massive and denser environments in the biased halo clustering picture. Using the Sheth et al. (2001) formula for the linear halo bias, the estimated host halo mass for radio-loud quasars is $\sim 10^{13} h^{-1}M_\odot$, compared to $\sim 2\times 10^{12} h^{-1}M_\odot$ for radio-quiet quasar hosts at $z\sim 1.5$.

preprint 2007 arXiv

Robust Machine Learning Applied to Astronomical Datasets II: Quantifying Photometric Redshifts for Quasars Using Instance-Based Learning

We apply instance-based machine learning in the form of a k-nearest neighbor algorithm to the task of estimating photometric redshifts for 55,746 objects spectroscopically classified as quasars in the Fifth Data Release of the Sloan Digital Sky Survey. We compare the results obtained to those from an empirical color-redshift relation (CZR). In contrast to previously published results using CZRs, we find that the instance-based photometric redshifts are assigned with no regions of catastrophic failure. Remaining outliers are simply scattered about the ideal relation, in a similar manner to the pattern seen in the optical for normal galaxies at redshifts z < ~1. The instance-based algorithm is trained on a representative sample of the data and pseudo-blind-tested on the remaining unseen data. The variance between the photometric and spectroscopic redshifts is sigma^2 = 0.123 +/- 0.002 (compared to sigma^2 = 0.265 +/- 0.006 for the CZR), and 54.9 +/- 0.7%, 73.3 +/- 0.6%, and 80.7 +/- 0.3% of the objects are within delta z < 0.1, 0.2, and 0.3 respectively. We also match our sample to the Second Data Release of the Galaxy Evolution Explorer legacy data and the resulting 7,642 objects show a further improvement, giving a variance of sigma^2 = 0.054 +/- 0.005, and 70.8 +/- 1.2%, 85.8 +/- 1.0%, and 90.8 +/- 0.7% of objects within delta z < 0.1, 0.2, and 0.3. We show that the improvement is indeed due to the extra information provided by GALEX, by training on the same dataset using purely SDSS photometry, which has a variance of sigma^2 = 0.090 +/- 0.007. Each set of results represents a realistic standard for application to further datasets for which the spectra are representative.
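The instance-based estimate and the "fraction within delta z" statistic described above can be sketched as follows. This is a toy illustration, not the paper's pipeline. The function names, the unweighted mean over neighbours, the Euclidean distance in colour space, and the made-up numbers are all assumptions.

```python
import math


def knn_redshift(train_colors, train_z, query_colors, k=3):
    """Estimate a redshift as the mean spectroscopic redshift of the k nearest training objects."""
    order = sorted(
        range(len(train_colors)),
        key=lambda i: math.dist(train_colors[i], query_colors),
    )
    nearest = order[:k]
    return sum(train_z[i] for i in nearest) / len(nearest)


def fraction_within(z_phot, z_spec, tolerance):
    """Fraction of objects whose photometric redshift lies within tolerance of the spectroscopic one."""
    hits = sum(1 for zp, zs in zip(z_phot, z_spec) if abs(zp - zs) < tolerance)
    return hits / len(z_spec)


# Made-up training set: two colours per object and a spectroscopic redshift.
train_colors = [(0.1, 0.2), (0.2, 0.1), (0.9, 1.0), (1.0, 0.9), (2.0, 2.1)]
train_z = [0.4, 0.5, 1.4, 1.5, 2.6]

estimate = knn_redshift(train_colors, train_z, (0.95, 0.95), k=2)
print(estimate, fraction_within([estimate], [1.5], 0.1))
```

In the paper's setting the training set would be a representative subsample of the spectroscopic quasars and the statistics would be computed on the held-out remainder.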

preprint 2006 arXiv

Quasars Probing Quasars I: Optically Thick Absorbers Near Luminous Quasars

With close pairs of quasars at different redshifts, a background quasar sightline can be used to study a foreground quasar's environment in absorption. We search 149 moderate resolution background quasar spectra, from Gemini, Keck, the MMT, and the SDSS to survey Lyman Limit Systems (LLSs) and Damped Ly-alpha systems (DLAs) in the vicinity of 1.8 < z < 4.0 luminous foreground quasars. A sample of 27 new quasar-absorber pairs is uncovered with column densities, 17.2 < log (N_HI/cm^2) < 20.9, and transverse (proper) distances of 22 kpc/h < R < 1.7 Mpc/h, from the foreground quasars. If they emit isotropically, the implied ionizing photon fluxes are a factor of ~ 5-8000 times larger than the ambient extragalactic UV background over this range of distances. The observed probability of intercepting an absorber is very high for small separations: six out of eight projected sightlines with transverse separations R < 150 kpc/h have an absorber coincident with the foreground quasar, of which four have log N_HI > 10^19. The covering factor of log N_HI > 10^19 absorbers is thus ~ 50 % (4/8) on these small scales, whereas < 2% would have been expected at random. There are many cosmological applications of these new sightlines: they provide laboratories for studying fluorescent Ly-alpha recombination radiation from LLSs, constrain the environments, emission geometry, and radiative histories of quasars, and shed light on the physical nature of LLSs and DLAs.

preprint 2005 arXiv

Active Galactic Nuclei in the Sloan Digital Sky Survey: I. Sample Selection

We have compiled a large sample of low-redshift active galactic nuclei (AGN) identified via their emission line characteristics from the spectroscopic data of the Sloan Digital Sky Survey. Since emission lines are often contaminated by stellar absorption lines, we developed an objective and efficient method of subtracting the stellar continuum from every galaxy spectrum before making emission line measurements. The distribution of the measured H$α$ Full Width at Half Maxima values of emission line galaxies is strongly bimodal, with two populations separated at about 1,200 km s$^{-1}$. This feature provides a natural separation between narrow-line and broad-line AGN. The narrow-line AGN are identified using standard emission line ratio diagnostic diagrams. 1,317 broad-line and 3,074 narrow-line AGN are identified from about 100,000 galaxy spectra selected over 1151 square degrees. This sample is used in a companion paper to determine the emission-line luminosity function of AGN.

preprint 2005 arXiv

Active Galactic Nuclei in the Sloan Digital Sky Survey: II. Emission-Line Luminosity Function

The emission line luminosity function of active galactic nuclei (AGN) is measured from about 3000 AGN included in the main galaxy sample of the Sloan Digital Sky Survey within a redshift range of $0<z<0.15$. The H$α$ and [OIII]$λ5007$ luminosity functions for Seyferts cover a luminosity range of $10^{5-9} L_\odot$ in H$α$ and the shapes are well fit by broken power laws, without a turnover at fainter nuclear luminosities. Assuming a universal conversion from emission line strength to continuum luminosity, the inferred B band magnitude luminosity function is comparable both to the AGN luminosity function of previous studies and to the low redshift quasar luminosity function derived from the 2dF redshift survey. The inferred AGN number density is approximately 1/5 of all galaxies and about $6\times 10^{-3}$ of the total light of galaxies in the $r$-band comes from the nuclear activity. The numbers of Seyfert 1s and Seyfert 2s are comparable at low luminosity, while at high luminosity, Seyfert 1s outnumber Seyfert 2s by a factor of 2-4. In making the luminosity function measurements, we assumed that the nuclear luminosity is independent of the host galaxy luminosity, an assumption we test a posteriori, and show to be consistent with the data. Given the relationship between black hole mass and host galaxy bulge luminosity, the lack of correlation between nuclear and host luminosity suggests that the main variable that determines the AGN luminosity is the Eddington ratio, not the black hole mass. This appears to be different from luminous quasars, which are most likely to be shining near the Eddington limit.

preprint2005arXiv

Binary Quasars in the Sloan Digital Sky Survey: Evidence for Excess Clustering on Small Scales

We present a sample of 218 new quasar pairs with proper transverse separations R_prop < 1 Mpc/h over the redshift range 0.5 < z < 3.0, discovered from an extensive follow-up campaign to find companions around the Sloan Digital Sky Survey and 2dF Quasar Redshift Survey quasars. This sample includes 26 new binary quasars with separations R_prop < 50 kpc/h (theta < 10 arcseconds), more than doubling the number of such systems known. We define a statistical sample of binaries selected with homogeneous criteria and compute its selection function, taking into account sources of incompleteness. The first measurement of the quasar correlation function on scales 10 kpc/h < R_prop < 400 kpc/h is presented. For R_prop < 40 kpc/h, we detect an order of magnitude excess clustering over the expectation from the large scale R_prop > 3 Mpc/h quasar correlation function, extrapolated down as a power law to the separations probed by our binaries. The excess grows to ~30 at R_prop ~ 10 kpc/h, and provides compelling evidence that the quasar autocorrelation function gets progressively steeper on sub-Mpc scales. This small scale excess can likely be attributed to dissipative interaction events which trigger quasar activity in rich environments. Recent small scale measurements of galaxy clustering and quasar-galaxy clustering are reviewed and discussed in relation to our measurement of small scale quasar clustering.
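
The comparison described in this abstract, a small-scale measurement set against a power law extrapolated from large scales, can be sketched as follows. The abstract does not quote the power-law parameters, so the correlation length `r0`, the slope `gamma`, the "measured" value, and the function names are hypothetical placeholders rather than results from the paper.

```python
import math


def power_law_xi(r, r0, gamma):
    """Power-law correlation function xi(r) = (r / r0) ** -gamma."""
    if r <= 0 or r0 <= 0:
        raise ValueError("separations must be positive")
    return (r / r0) ** (-gamma)


def clustering_excess(xi_measured, r, r0, gamma):
    """Ratio of a measured xi to the large-scale power law extrapolated to r."""
    return xi_measured / power_law_xi(r, r0, gamma)


# Hypothetical inputs: separations in Mpc/h, illustrative slope and amplitude.
R0, GAMMA = 5.0, 2.0
r_small = 0.01  # 10 kpc/h expressed in Mpc/h
xi_expected = power_law_xi(r_small, R0, GAMMA)
excess = clustering_excess(3.0 * xi_expected, r_small, R0, GAMMA)
print(xi_expected, excess)
```

An excess near 1 would mean the small-scale clustering simply follows the extrapolated large-scale power law; values well above 1 correspond to the steepening reported in the abstract.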

preprint2003arXiv

Peculiar Broad Absorption Line Quasars found in DPOSS

With the recent release of large (i.e., > hundred million objects), well-calibrated photometric surveys, such as DPOSS, 2MASS, and SDSS, spectroscopic identification of important targets is no longer a simple issue. In order to enhance the returns from a spectroscopic survey, candidate sources are often preferentially selected to be of interest, such as brown dwarfs or high redshift quasars. This approach, while useful for targeted projects, risks missing new or unusual species. We have, as a result, taken the alternative path of spectroscopically identifying interesting sources with the sole criterion being that they are in low density areas of the g - r and r - i color-space defined by the DPOSS survey. In this paper, we present three peculiar broad absorption line quasars that were discovered during this spectroscopic survey, demonstrating the efficacy of this approach. PSS J0052+2405 is an Iron LoBAL quasar at a redshift z = 2.4512 with very broad absorption from many species. PSS J0141+3334 is a reddened LoBAL quasar at z = 3.005 with no obvious emission lines. PSS J1537+1227 is an Iron LoBAL at a redshift of z = 1.212 with strong narrow Mg II and Fe II emission. Follow-up high resolution spectroscopy of these three quasars promises to improve our understanding of BAL quasars. The sensitivity of particular parameter spaces, in this case a two-color space, to the redshift of these three sources is dramatic, raising questions about traditional techniques of defining quasar populations for statistical analysis.
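
Selecting sources that fall in low density areas of a two-color space can be illustrated with a simple grid count. This is only a sketch under stated assumptions: the abstract does not say how density was estimated, so the fixed-width binning, the function name `low_density_sources`, the `bin_width` and `max_count` parameters, and the color values are all hypothetical.

```python
import math
from collections import Counter


def low_density_sources(colors, bin_width=0.1, max_count=1):
    """Return indices of sources whose (g - r, r - i) grid cell is sparsely populated."""

    def cell(point):
        g_r, r_i = point
        return (math.floor(g_r / bin_width), math.floor(r_i / bin_width))

    # Count how many sources land in each cell of the color-color grid.
    occupancy = Counter(cell(point) for point in colors)
    return [i for i, point in enumerate(colors) if occupancy[cell(point)] <= max_count]


# Hypothetical (g - r, r - i) colors: a dense clump plus two isolated sources.
colors = [
    (0.52, 0.21), (0.55, 0.23), (0.58, 0.26), (0.51, 0.28),
    (1.85, -0.35), (-0.42, 1.10),
]
print(low_density_sources(colors))
```

A fixed grid is the simplest possible density estimate; a real survey would more likely need an adaptive or kernel-based estimate, since the result here depends on where the cell edges happen to fall.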

preprint2000arXiv

Digital Sky Surveys: Software Tools and Technologies

Large digital sky surveys, over a broad range of wavelengths, both from the ground and from space observatories, are becoming a major source of astronomical data. Some examples include the Sloan Digital Sky Survey (SDSS) and the Digital Palomar Observatory Sky Survey (DPOSS) in the visible, the Two-Micron All-Sky Survey (2MASS) in the near-infrared, and the NRAO VLA Sky Survey (NVSS) and the Faint Images of the Radio Sky at Twenty centimeters (FIRST) in the radio. Many other surveys are planned or expected, in addition to the previously named surveys. While most surveys are exclusively imaging, large-scale spectroscopic surveys also exist. In addition, a number of experiments with specific scientific goals, e.g., microlensing surveys for MACHOs and searches for near-Earth asteroids, are generating comparable data volumes. Typical sizes of resulting data sets (as of the late 1990s) are in the range of tens of Terabytes of digital information, with detections of many millions or even billions of sources, and several tens of parameters measured for each detected source. This vast amount of new information presents both a great scientific opportunity and a great technological challenge: how to process and calibrate the raw data; how to store, combine, and access them using modern computing hardware and networks; and how to visualize, explore, and analyze these great data sets quickly and efficiently.

25 published item(s)

Extended Isolation Forest

Machine Learning and Cosmological Simulations II: Hydrodynamical Simulations

Star-galaxy Classification Using Deep Convolutional Neural Networks

Teaching Data Science

A Hybrid Ensemble Learning Approach to Star-Galaxy Classification

Creating updated, scientifically-calibrated mosaic images for the RC3 catalogue

Machine Learning and Cosmological Simulations I: Semi-Analytical Models

Narrow absorption line variability in repeat quasar observations from the Sloan Digital Sky Survey

Data Mining and Machine Learning in Astronomy

Evolution of the Clustering of Photometrically Selected SDSS Galaxies

A Cross-Correlation Analysis of Mg II Absorption Line Systems and Luminous Red Galaxies from the SDSS DR5

Clustering of Low-Redshift (z <= 2.2) Quasars from the Sloan Digital Sky Survey

Eight-Dimensional Mid-Infrared/Optical Bayesian Quasar Selection

Halo-model Analysis of the Clustering of Photometrically Selected Galaxies from SDSS

LSST Science Book, Version 2.0

Efficient Photometric Selection of Quasars from the Sloan Digital Sky Survey: II. ~1,000,000 Quasars from Data Release Six

Normalization of the Matter Power Spectrum via Higher-Order Angular Correlations of Luminous Red Galaxies

Quasar Clustering from SDSS DR5: Dependences on Physical Properties

Robust Machine Learning Applied to Astronomical Datasets II: Quantifying Photometric Redshifts for Quasars Using Instance-Based Learning

Quasars Probing Quasars I: Optically Thick Absorbers Near Luminous Quasars

Active Galactic Nuclei in the Sloan Digital Sky Survey: I. Sample Selection

Active Galactic Nuclei in the Sloan Digital Sky Survey: II. Emission-Line Luminosity Function

Binary Quasars in the Sloan Digital Sky Survey: Evidence for Excess Clustering on Small Scales

Peculiar Broad Absorption Line Quasars found in DPOSS

Digital Sky Surveys: Software Tools and Technologies