Source author record

G. Jogesh Babu

G. Jogesh Babu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

astro-ph.IM, Applications, astro-ph.HE, cs.CY, hep-ex, Machine Learning, math.ST, Methodology, Statistics Theory

Catalog footprint

What is connected

7 works

9 topics

4 close collaborators

preprint, 2022, arXiv

Incorporating Measurement Error in Astronomical Object Classification

Most general-purpose classification methods, such as support-vector machine (SVM) and random forest (RF), fail to account for an unusual characteristic of astronomical data: known measurement error uncertainties. In astronomical data, this information is often given in the data but discarded because popular machine learning classifiers cannot incorporate it. We propose a simulation-based approach that incorporates heteroscedastic measurement error into existing classification methods to better quantify uncertainty in classification. The proposed method first simulates perturbed realizations of the data from a Bayesian posterior predictive distribution of a Gaussian measurement error model. Then, a chosen classifier is fit to each simulation. The variation across the simulations naturally reflects the uncertainty propagated from the measurement errors in both labeled and unlabeled data sets. We demonstrate the use of this approach via two numerical studies. The first is a thorough simulation study applying the proposed procedure to SVM and RF, which are well-known hard and soft classifiers, respectively. The second study is a realistic classification problem of identifying high-$z$ $(2.9 \leq z \leq 5.1)$ quasar candidates from photometric data. The data are from merged catalogs of the Sloan Digital Sky Survey, the $Spitzer$ IRAC Equatorial Survey, and the $Spitzer$-HETDEX Exploratory Large-Area Survey. The proposed approach reveals that out of 11,847 high-$z$ quasar candidates identified by a random forest without incorporating measurement error, 3,146 are potential misclassifications with measurement error. Additionally, out of $1.85$ million objects not identified as high-$z$ quasars without measurement error, 936 can be considered new candidates with measurement error.
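The simulate-then-refit idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it perturbs each measurement directly with Gaussian noise at its reported standard deviation instead of drawing from a Bayesian posterior predictive distribution, and it swaps in a simple nearest-centroid classifier for SVM or RF. All function names, class labels and data values are hypothetical.

```python
import random
from collections import Counter

def nearest_centroid_fit(X, y):
    """Return one centroid per class (a simple stand-in for SVM or RF)."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def nearest_centroid_predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist2(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(centroids, key=dist2)

def perturb(X, sigma, rng):
    """One perturbed realization: every value gets noise with its own known sigma."""
    return [[rng.gauss(v, s) for v, s in zip(row, srow)]
            for row, srow in zip(X, sigma)]

def class_vote_fractions(X_train, s_train, y_train, X_new, s_new,
                         n_sims=200, seed=0):
    """Refit on each perturbed training set and tally predictions per new object."""
    rng = random.Random(seed)
    votes = [Counter() for _ in X_new]
    for _ in range(n_sims):
        model = nearest_centroid_fit(perturb(X_train, s_train, rng), y_train)
        for tally, row in zip(votes, perturb(X_new, s_new, rng)):
            tally[nearest_centroid_predict(model, row)] += 1
    return [{c: k / n_sims for c, k in tally.items()} for tally in votes]

# Hypothetical two-feature data with per-measurement (heteroscedastic) errors.
X_train = [[0.0, 0.1], [0.2, -0.1], [2.0, 2.1], [1.9, 1.8]]
s_train = [[0.1, 0.1]] * 4
y_train = ["star", "star", "quasar", "quasar"]
X_new = [[0.1, 0.0], [1.0, 1.0]]
s_new = [[0.05, 0.05], [0.5, 0.5]]

fractions = class_vote_fractions(X_train, s_train, y_train, X_new, s_new)
for obj, frac in zip(X_new, fractions):
    print(obj, frac)
```

The spread of votes for each new object plays the role of the variation across simulations described above: an object near the class boundary with large errors splits its votes, while a well-measured object far from the boundary does not.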

preprint, 2021, arXiv

A Statistician Teaches Deep Learning

Deep learning (DL) has gained much attention and become increasingly popular in modern data science. Computer scientists led the way in developing deep learning techniques, so the ideas and perspectives can seem alien to statisticians. Nonetheless, it is important that statisticians become involved -- many of our students need this expertise for their careers. In this paper, developed as part of a program on DL held at the Statistical and Applied Mathematical Sciences Institute, we address this culture gap and provide tips on how to teach deep learning to statistics graduate students. After some background, we list ways in which DL and statistical perspectives differ, provide a recommended syllabus that evolved from teaching two iterations of a DL graduate course, offer examples of suggested homework assignments, give an annotated list of teaching resources, and discuss DL in the context of two research areas.

preprint, 2020, arXiv

Some Optimizations on Detecting Gravitational Wave Using Convolutional Neural Network

This work investigates the problem of detecting gravitational wave (GW) events based on simulated damped sinusoid signals contaminated with white Gaussian noise. It is treated as a classification problem with one class for the interesting events. The proposed scheme consists of the following two successive steps: decomposing the data using a wavelet packet, representing the GW signal and noise using the derived decomposition coefficients; and determining the existence of any GW event using a convolutional neural network (CNN) with a logistic regression output layer. This work is distinguished by its comprehensive investigation of CNN structure, detection window width, data resolution, wavelet packet decomposition and detection window overlap scheme. Extensive simulation experiments show excellent performance in reliably detecting signals across a range of GW model parameters and signal-to-noise ratios. While we use a simple waveform model in this study, we expect the method to be particularly valuable when the potential GW shapes are too complex to be characterized with a template bank.
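The data preparation described in this abstract (a damped sinusoid in white Gaussian noise, overlapping detection windows, wavelet packet coefficients as features) might be sketched as follows. This is an illustration under assumptions, not the paper's pipeline: the abstract does not name the wavelet, so an orthonormal Haar filter is used here, the packet nodes are left in natural (not frequency) order, the CNN stage is omitted, and every parameter value and function name is hypothetical.

```python
import math
import random

def damped_sinusoid(n, dt, amplitude, freq_hz, tau, t0):
    """A * exp(-(t - t0) / tau) * sin(2 * pi * f * (t - t0)) for t >= t0, else 0."""
    samples = []
    for k in range(n):
        t = k * dt - t0
        if t >= 0:
            samples.append(amplitude * math.exp(-t / tau)
                           * math.sin(2 * math.pi * freq_hz * t))
        else:
            samples.append(0.0)
    return samples

def add_white_noise(signal, sigma, rng):
    """Add independent zero-mean Gaussian noise to every sample."""
    return [s + rng.gauss(0.0, sigma) for s in signal]

def sliding_windows(x, width, overlap):
    """Windows of `width` samples; consecutive windows share `overlap` samples."""
    if not 0 <= overlap < width:
        raise ValueError("overlap must satisfy 0 <= overlap < width")
    step = width - overlap
    return [x[i:i + width] for i in range(0, len(x) - width + 1, step)]

def haar_packet(x, levels):
    """Full Haar wavelet packet tree: split every node into sums and differences."""
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            pairs = list(zip(node[0::2], node[1::2]))
            next_nodes.append([(a + b) / math.sqrt(2) for a, b in pairs])
            next_nodes.append([(a - b) / math.sqrt(2) for a, b in pairs])
        nodes = next_nodes
    return nodes

rng = random.Random(42)
clean = damped_sinusoid(n=1024, dt=1 / 1024, amplitude=1.0,
                        freq_hz=60.0, tau=0.05, t0=0.25)
noisy = add_white_noise(clean, sigma=0.5, rng=rng)
windows = sliding_windows(noisy, width=256, overlap=128)
features = [haar_packet(w, levels=3) for w in windows]
print(len(windows), len(features[0]), len(features[0][0]))
```

Each window yields a small grid of packet coefficients (nodes by positions) that a classifier could take as input. Because the Haar step is orthonormal, the coefficients preserve the energy of the window, which makes the transform easy to sanity-check.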

preprint, 2013, arXiv

VOStat: A Statistical Web Service for Astronomers

VOStat is a Web service providing interactive statistical analysis of astronomical tabular datasets. It is integrated into the suite of analysis and visualization tools associated with the international Virtual Observatory (VO) through the SAMP communication system. A user supplies VOStat with a dataset extracted from the VO, or otherwise acquired, and chooses among $\sim 60$ statistical functions. These include data transformations, plots and summaries, density estimation, one- and two-sample hypothesis tests, global and local regressions, multivariate analysis and clustering, spatial analysis, directional statistics, survival analysis (for censored data like upper limits), and time series analysis. The statistical operations are performed using the public domain R statistical software environment, including a small fraction of its $>4000$ CRAN add-on packages. The purpose of VOStat is to facilitate a wider range of statistical analyses than are commonly used in astronomy, and to promote use of more advanced methodology in R and CRAN.

preprint, 2012, arXiv

Statistical Methods for Astronomy

This review outlines concepts of mathematical statistics, elements of probability theory, hypothesis tests and point estimation for use in the analysis of modern astronomical data. Least squares, maximum likelihood, and Bayesian approaches to statistical inference are treated. Resampling methods, particularly the bootstrap, provide valuable procedures when distribution functions of statistics are not known. Several approaches to model selection and goodness of fit are considered. Applied statistics relevant to astronomical research are briefly discussed: nonparametric methods for use when little is known about the behavior of the astronomical populations or processes; data smoothing with kernel density estimation and nonparametric regression; unsupervised clustering and supervised classification procedures for multivariate problems; survival analysis for astronomical datasets with nondetections; time- and frequency-domain time series analysis for light curves; and spatial statistics to interpret the spatial distributions of points in low dimensions. Two types of resources are presented: about 40 recommended texts and monographs in various fields of statistics, and the public domain R software system for statistical analysis. Together with its $\sim 3500$ (and growing) add-on CRAN packages, R implements a vast range of statistical procedures in a coherent high-level language with advanced graphics.
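As a small illustration of the bootstrap mentioned in this abstract, the sketch below estimates the standard error of a sample median, a statistic whose sampling distribution is awkward to derive analytically. The review itself points to R; Python is used here only for illustration, and the data are simulated, hypothetical values.

```python
import random
import statistics

def bootstrap_se(data, stat, n_boot=2000, seed=0):
    """Nonparametric bootstrap: resample with replacement, recompute the statistic."""
    rng = random.Random(seed)
    n = len(data)
    replicates = [stat(rng.choices(data, k=n)) for _ in range(n_boot)]
    return statistics.stdev(replicates), replicates

# Hypothetical skewed sample, e.g. fluxes with a long upper tail.
rng = random.Random(7)
fluxes = [rng.lognormvariate(0.0, 1.0) for _ in range(150)]

se_median, _ = bootstrap_se(fluxes, statistics.median)
print("median:", statistics.median(fluxes))
print("bootstrap standard error of the median:", se_median)
```

The same function works for any statistic that takes a list and returns a number. A quick sanity check is to pass `statistics.mean` and compare the result with the textbook value, the sample standard deviation divided by the square root of the sample size.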

preprint, 2012, arXiv

The Astrophysical Multimessenger Observatory Network (AMON)

We summarize the science opportunity, design elements, current and projected partner observatories, and anticipated science returns of the Astrophysical Multimessenger Observatory Network (AMON). AMON will link multiple current and future high-energy, multimessenger, and follow-up observatories together into a single network, enabling near real-time coincidence searches for multimessenger astrophysical transients and their electromagnetic counterparts. Candidate and high-confidence multimessenger transient events will be identified, characterized, and distributed as AMON alerts within the network and to interested external observers, leading to follow-up observations across the electromagnetic spectrum. In this way, AMON aims to evoke the discovery of multimessenger transients from within observatory subthreshold data streams and facilitate the exploitation of these transients for purposes of astronomy and fundamental physics. As a central hub of global multimessenger science, AMON will also enable cross-collaboration analyses of archival datasets in search of rare or exotic astrophysical phenomena.

preprint, 2011, arXiv

Limit theorems for functions of marginal quantiles

Multivariate distributions are explored using the joint distributions of marginal sample quantiles. Limit theory for the mean of a function of order statistics is presented. The results include a multivariate central limit theorem and a strong law of large numbers. A result similar to Bahadur's representation of quantiles is established for the mean of a function of the marginal quantiles. In particular, it is shown that \[\sqrt{n}\Biggl(\frac{1}{n}\sum_{i=1}^n\phi\bigl(X_{n:i}^{(1)},\ldots,X_{n:i}^{(d)}\bigr)-\bar{\gamma}\Biggr)=\frac{1}{\sqrt{n}}\sum_{i=1}^nZ_{n,i}+\mathrm{o}_P(1)\] as $n\rightarrow\infty$, where $\bar{\gamma}$ is a constant and $Z_{n,i}$ are i.i.d. random variables for each $n$. This leads to the central limit theorem. Weak convergence to a Gaussian process using equicontinuity of functions is indicated. The results are established under very general conditions. These conditions are shown to be satisfied in many commonly occurring situations.
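A quick numerical illustration of the statistic in this abstract: sort each coordinate separately, pair the order statistics by rank, and average a function of them. The abstract does not state the centering constant, so the sketch assumes it equals the integral over $u$ in $(0,1)$ of the function evaluated at the marginal quantiles $F_j^{-1}(u)$. For two independent Uniform(0,1) coordinates and the product function, that assumed value is $1/3$. The function name and the example are hypothetical.

```python
import random

def marginal_quantile_mean(sample, phi):
    """(1/n) * sum_i phi(X_{n:i}^{(1)}, ..., X_{n:i}^{(d)}).

    Each coordinate is sorted on its own, so the i-th terms are the
    i-th order statistics of the d marginals, not an observed row.
    """
    columns = [sorted(col) for col in zip(*sample)]
    n = len(sample)
    return sum(phi(*values) for values in zip(*columns)) / n

rng = random.Random(1)
n = 20000
sample = [(rng.random(), rng.random()) for _ in range(n)]

def product(x, y):
    return x * y

estimate = marginal_quantile_mean(sample, product)
plain_mean = sum(product(x, y) for x, y in sample) / n

print("mean over marginal quantiles:", estimate)    # compare with the assumed 1/3
print("ordinary sample mean of x*y: ", plain_mean)  # E[XY] = 1/4 under independence
```

The contrast between the two printed values shows why the statistic depends only on the marginal distributions: pairing by rank discards the dependence structure between coordinates, so the limit differs from the ordinary expectation of the function.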
