Source author record

Roman Garnett

Roman Garnett appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Machine Learning, Artificial Intelligence, astro-ph.GA, Human-Computer Interaction, astro-ph.CO, astro-ph.IM, Data Structures and Algorithms, math.NA, Numerical Analysis, physics.data-an

Catalog footprint

What is connected

15 works

10 topics

4 close collaborators


Preprint, 2023, arXiv

Local Bayesian optimization via maximizing probability of descent

Local optimization presents a promising approach to expensive, high-dimensional black-box optimization by sidestepping the need to globally explore the search space. For objective functions whose gradient cannot be evaluated directly, Bayesian optimization offers one solution -- we construct a probabilistic model of the objective, design a policy to learn about the gradient at the current location, and use the resulting information to navigate the objective landscape. Previous work has realized this scheme by minimizing the variance in the estimate of the gradient, then moving in the direction of the expected gradient. In this paper, we re-examine and refine this approach. We demonstrate that, surprisingly, the expected value of the gradient is not always the direction maximizing the probability of descent, and in fact, these directions may be nearly orthogonal. This observation then inspires an elegant optimization scheme seeking to maximize the probability of descent while moving in the direction of most-probable descent. Experiments on both synthetic and real-world objectives show that our method outperforms previous realizations of this optimization scheme and is competitive against other, significantly more complicated baselines.
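The gap between the expected gradient and the most-probable descent direction can be illustrated with a small calculation. The sketch below is not taken from the paper's code; it assumes the belief about the gradient at the current location is a two-dimensional Gaussian with hypothetical mean `mu` and covariance `sigma`. Under that assumption, the directional derivative along a unit vector v is Gaussian with mean v·mu and variance vᵀ·sigma·v, so the probability of descent is Φ(−v·mu / sqrt(vᵀ·sigma·v)), which is maximized by v proportional to −sigma⁻¹·mu. All function names and numbers are illustrative.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def unit(v):
    """Rescale a 2-vector to unit length."""
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

def descent_probability(v, mu, sigma):
    """P(v . g < 0) when the gradient g ~ N(mu, sigma), in two dimensions."""
    mean = v[0] * mu[0] + v[1] * mu[1]
    var = (v[0] * (sigma[0][0] * v[0] + sigma[0][1] * v[1])
           + v[1] * (sigma[1][0] * v[0] + sigma[1][1] * v[1]))
    return normal_cdf(-mean / math.sqrt(var))

def most_probable_descent(mu, sigma):
    """Unit vector proportional to -sigma^{-1} mu (closed-form 2x2 inverse)."""
    det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
    x = (sigma[1][1] * mu[0] - sigma[0][1] * mu[1]) / det
    y = (sigma[0][0] * mu[1] - sigma[1][0] * mu[0]) / det
    return unit((-x, -y))

# Hypothetical belief: a large but very uncertain first gradient component,
# and a small but almost certain second component.
mu = (1.0, 0.01)
sigma = ((100.0, 0.0), (0.0, 1e-4))

neg_mean = unit((-mu[0], -mu[1]))        # negative expected gradient
mpd = most_probable_descent(mu, sigma)   # most-probable descent direction

cosine = neg_mean[0] * mpd[0] + neg_mean[1] * mpd[1]
p_neg_mean = descent_probability(neg_mean, mu, sigma)
p_mpd = descent_probability(mpd, mu, sigma)

print("cosine between the two directions:", cosine)
print("P(descent) along negative expected gradient:", p_neg_mean)
print("P(descent) along most-probable descent direction:", p_mpd)
```

With these illustrative numbers the two directions are almost orthogonal, and the probability of descent rises from roughly 0.54 along the negative expected gradient to roughly 0.84 along the most-probable descent direction.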

Preprint, 2022, arXiv

A Unified Comparison of User Modeling Techniques for Predicting Data Interaction and Detecting Exploration Bias

The visual analytics community has proposed several user modeling algorithms to capture and analyze users' interaction behavior in order to assist users in data exploration and insight generation. For example, some can detect exploration biases while others can predict data points that the user will interact with before that interaction occurs. Researchers believe this collection of algorithms can help create more intelligent visual analytics tools. However, the community lacks a rigorous evaluation and comparison of these existing techniques. As a result, there is limited guidance on which method to use and when. Our paper seeks to fill this gap by comparing and ranking eight user modeling algorithms based on their performance on a diverse set of four user study datasets. We analyze exploration bias detection, data interaction prediction, and algorithmic complexity, among other measures. Based on our findings, we highlight open challenges and new directions for analyzing user interactions and visualization provenance.

Preprint, 2022, arXiv

Guided Data Discovery in Interactive Visualizations via Active Search

Recent advances in visual analytics have enabled us to learn from user interactions and uncover analytic goals. These innovations set the foundation for actively guiding users during data exploration. Providing such guidance will become more critical as datasets grow in size and complexity, precluding exhaustive investigation. Meanwhile, the machine learning community also struggles with datasets growing in size and complexity, precluding exhaustive labeling. Active learning is a broad family of algorithms developed for actively guiding models during training. We will consider the intersection of these analogous research thrusts. First, we discuss the nuances of matching the choice of an active learning algorithm to the task at hand. This is critical for performance, a fact we demonstrate in a simulation study. We then present results of a user study for the particular task of data discovery guided by an active learning algorithm specifically designed for this task.

Preprint, 2020, arXiv

Automated Measurement of Quasar Redshift with a Gaussian Process

We develop an automated technique to measure quasar redshifts in the Baryon Oscillation Spectroscopic Survey (BOSS) of the Sloan Digital Sky Survey (SDSS). Our technique is an extension of an earlier Gaussian process method for detecting damped Lyman-alpha absorbers (DLAs) in quasar spectra with known redshifts. We apply this technique to a subsample of SDSS DR12 with BAL quasars removed and redshift larger than 2.15. We show that we are broadly competitive with existing quasar redshift estimators, disagreeing with the PCA redshift by more than 0.5 in only 0.38% of spectra. Our method produces a probability density function for the quasar redshift, allowing quasar redshift uncertainty to be propagated to downstream users. We apply this method to detecting DLAs, accounting in a Bayesian fashion for redshift uncertainty. Compared to our earlier method with a known quasar redshift, we have a moderate decrease in our ability to detect DLAs, predominantly in the noisiest spectra. The area under curve drops from 0.96 to 0.91. Our code is publicly available.

Preprint, 2020, arXiv

BINOCULARS for Efficient, Nonmyopic Sequential Experimental Design

Finite-horizon sequential experimental design (SED) arises naturally in many contexts, including hyperparameter tuning in machine learning among more traditional settings. Computing the optimal policy for such problems requires solving Bellman equations, which are generally intractable. Most existing work resorts to severely myopic approximations by limiting the decision horizon to only a single time-step, which can underweight exploration in favor of exploitation. We present BINOCULARS: Batch-Informed NOnmyopic Choices, Using Long-horizons for Adaptive, Rapid SED, a general framework for deriving efficient, nonmyopic approximations to the optimal experimental policy. Our key idea is simple and surprisingly effective: we first compute a one-step optimal batch of experiments, then select a single point from this batch to evaluate. We realize BINOCULARS for Bayesian optimization and Bayesian quadrature -- two notable SED problems with radically different objectives -- and demonstrate that BINOCULARS significantly outperforms myopic alternatives in real-world scenarios.

Preprint, 2020, arXiv

Detecting Multiple DLAs per Spectrum in SDSS DR12 with Gaussian Processes

We present a revised version of our automated technique using Gaussian processes (GPs) to detect Damped Lyman-$α$ absorbers (DLAs) along quasar (QSO) sightlines. The main improvement is to allow our Gaussian process pipeline to detect multiple DLAs along a single sightline. Our DLA detections are regularised by an improved model for the absorption from the Lyman-$α$ forest which improves performance at high redshift. We also introduce a model for unresolved sub-DLAs which reduces mis-classifications of absorbers without detectable damping wings. We compare our results to those of two different large-scale DLA catalogues and provide a catalogue of the processed results of our Gaussian process pipeline using 158 825 Lyman-$α$ spectra from SDSS data release 12. We present updated estimates for the statistical properties of DLAs, including the column density distribution function (CDDF), line density ($dN/dX$), and neutral hydrogen density ($Ω_{\textrm{DLA}}$).

Preprint, 2020, arXiv

Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees

Bayesian optimization is a sequential decision making framework for optimizing expensive-to-evaluate black-box functions. Computing a full lookahead policy amounts to solving a highly intractable stochastic dynamic program. Myopic approaches, such as expected improvement, are often adopted in practice, but they ignore the long-term impact of the immediate decision. Existing nonmyopic approaches are mostly heuristic and/or computationally expensive. In this paper, we provide the first efficient implementation of general multi-step lookahead Bayesian optimization, formulated as a sequence of nested optimization problems within a multi-step scenario tree. Instead of solving these problems in a nested way, we equivalently optimize all decision variables in the full tree jointly, in a "one-shot" fashion. Combining this with an efficient method for implementing multi-step Gaussian process "fantasization," we demonstrate that multi-step expected improvement is computationally tractable and exhibits performance superior to existing methods on a wide range of benchmarks.

Preprint, 2016, arXiv

Active Search for Sparse Signals with Region Sensing

Autonomous systems can be used to search for sparse signals in a large space; e.g., aerial robots can be deployed to localize threats, detect gas leaks, or respond to distress calls. Intuitively, search algorithms may increase efficiency by collecting aggregate measurements summarizing large contiguous regions. However, most existing search methods either ignore the possibility of such region observations (e.g., Bayesian optimization and multi-armed bandits) or make strong assumptions about the sensing mechanism that allow each measurement to arbitrarily encode all signals in the entire environment (e.g., compressive sensing). We propose an algorithm that actively collects data to search for sparse signals using only noisy measurements of the average values on rectangular regions (including single points), based on the greedy maximization of information gain. We analyze our algorithm in 1d and show that it requires $\tilde{O}(\frac{n}{μ^2}+k^2)$ measurements to recover all of $k$ signal locations with small Bayes error, where $μ$ and $n$ are the signal strength and the size of the search space, respectively. We also show that active designs can be fundamentally more efficient than passive designs with region sensing, contrasting with the results of Arias-Castro, Candes, and Davenport (2013). We demonstrate the empirical performance of our algorithm on a search problem using satellite image data and in high dimensions.

Preprint, 2015, arXiv

Anomaly Detection and Removal Using Non-Stationary Gaussian Processes

This paper proposes a novel Gaussian process approach to fault removal in time-series data. Fault removal does not delete the faulty signal data but, instead, massages the fault from the data. We assume that only one fault occurs at any one time and model the signal by two separate non-parametric Gaussian process models for both the physical phenomenon and the fault. In order to facilitate fault removal we introduce the Markov Region Link kernel for handling non-stationary Gaussian processes. This kernel is piece-wise stationary but guarantees that functions generated by it and their derivatives (when required) are everywhere continuous. We apply this kernel to the removal of drift and bias errors in faulty sensor data and also to the recovery of EOG artifact corrupted EEG signals.

Preprint, 2015, arXiv

Differentially Private Bayesian Optimization

Bayesian optimization is a powerful tool for fine-tuning the hyper-parameters of a wide variety of machine learning models. The success of machine learning has led practitioners in diverse real-world settings to learn classifiers for practical problems. As machine learning becomes commonplace, Bayesian optimization becomes an attractive method for practitioners to automate the process of classifier hyper-parameter tuning. A key observation is that the data used for tuning models in these settings is often sensitive. Certain data such as genetic predisposition, personal email statistics, and car accident history, if not properly private, may be at risk of being inferred from Bayesian optimization outputs. To address this, we introduce methods for releasing the best hyper-parameters and classifier accuracy privately. Leveraging the strong theoretical guarantees of differential privacy and known Bayesian optimization convergence bounds, we prove that under a GP assumption these private quantities are also near-optimal. Finally, even if this assumption is not satisfied, we can use different smoothness guarantees to protect privacy.

Preprint, 2014, arXiv

Propagation Kernels

We introduce propagation kernels, a general graph-kernel framework for efficiently measuring the similarity of structured data. Propagation kernels are based on monitoring how information spreads through a set of given graphs. They leverage early-stage distributions from propagation schemes such as random walks to capture structural information encoded in node labels, attributes, and edge information. This has two benefits. First, off-the-shelf propagation schemes can be used to naturally construct kernels for many graph types, including labeled, partially labeled, unlabeled, directed, and attributed graphs. Second, by leveraging existing efficient and informative propagation schemes, propagation kernels can be considerably faster than state-of-the-art approaches without sacrificing predictive performance. We will also show that if the graphs at hand have a regular structure, for instance when modeling image or video data, one can exploit this regularity to scale the kernel computation to large databases of graphs with thousands of nodes. We support our contributions by exhaustive experiments on a number of real-world graphs from a variety of application domains.

Preprint, 2014, arXiv

Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature

We propose a novel sampling framework for inference in probabilistic models: an active learning approach that converges more quickly (in wall-clock time) than Markov chain Monte Carlo (MCMC) benchmarks. The central challenge in probabilistic inference is numerical integration, to average over ensembles of models or unknown (hyper-)parameters (for example to compute the marginal likelihood or a partition function). MCMC has provided approaches to numerical integration that deliver state-of-the-art inference, but can suffer from sample inefficiency and poor convergence diagnostics. Bayesian quadrature techniques offer a model-based solution to such problems, but their uptake has been hindered by prohibitive computation costs. We introduce a warped model for probabilistic integrands (likelihoods) that are known to be non-negative, permitting a cheap active learning scheme to optimally select sample locations. Our algorithm is demonstrated to offer faster convergence (in seconds) relative to simple Monte Carlo and annealed importance sampling on both synthetic and real-world examples.

Preprint, 2013, arXiv

Active Learning of Linear Embeddings for Gaussian Processes

We propose an active learning method for discovering low-dimensional structure in high-dimensional Gaussian process (GP) tasks. Such problems are increasingly frequent and important, but have hitherto presented severe practical difficulties. We further introduce a novel technique for approximately marginalizing GP hyperparameters, yielding marginal predictions robust to hyperparameter mis-specification. Our method offers an efficient means of performing GP regression, quadrature, or Bayesian optimization in high-dimensional spaces.

Preprint, 2012, arXiv

Bayesian Optimal Active Search and Surveying

We consider two active binary-classification problems with atypical objectives. In the first, active search, our goal is to actively uncover as many members of a given class as possible. In the second, active surveying, our goal is to actively query points to ultimately predict the proportion of a given class. Numerous real-world problems can be framed in these terms, and in either case typical model-based concerns such as generalization error are only of secondary importance. We approach these problems via Bayesian decision theory; after choosing natural utility functions, we derive the optimal policies. We provide three contributions. In addition to introducing the active surveying problem, we extend previous work on active search in two ways. First, we prove a novel theoretical result, that less-myopic approximations to the optimal policy can outperform more-myopic approximations by any arbitrary degree. We then derive bounds that for certain models allow us to reduce (in practice dramatically) the exponential search space required by a naive implementation of the optimal policy, enabling further lookahead while still ensuring that optimal decisions are always made.

Preprint, 2012, arXiv

Submodularity in Batch Active Learning and Survey Problems on Gaussian Random Fields

Many real-world datasets can be represented in the form of a graph whose edge weights designate similarities between instances. A discrete Gaussian random field (GRF) model is a finite-dimensional Gaussian process (GP) whose prior covariance is the inverse of a graph Laplacian. Minimizing the trace of the predictive covariance $\Sigma$ (V-optimality) on GRFs has proven successful in batch active learning classification problems with budget constraints. However, its worst-case bound has been missing. We show that V-optimality on GRFs as a function of the batch query set is submodular and hence its greedy selection algorithm guarantees a $(1-1/e)$ approximation ratio. Moreover, GRF models have the absence-of-suppressor (AofS) condition. For active survey problems, we propose a similar survey criterion which minimizes $\mathbf{1}^\top \Sigma \mathbf{1}$. In practice, the V-optimality criterion performs better than GPs with mutual information gain criteria and allows nonuniform costs for different nodes.
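The greedy selection rule referred to in this abstract can be sketched on a toy graph. The code below is an illustration under stated assumptions, not the authors' implementation: it uses a five-node path graph, adds a small hypothetical ridge `delta` to the Laplacian so that it can be inverted, and treats each queried node as observed without noise, so that conditioning on a node is a rank-one update of the covariance. The helper names (`path_graph_precision`, `invert`, `greedy_v_optimal`) are made up for this example.

```python
def path_graph_precision(n, delta):
    """Laplacian of an n-node path graph plus delta * I (assumed regularizer)."""
    prec = [[0.0] * n for _ in range(n)]
    for i in range(n):
        prec[i][i] = delta
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                prec[i][i] += 1.0
                prec[i][j] = -1.0
    return prec

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting, for small dense matrices."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def trace(cov):
    return sum(cov[i][i] for i in range(len(cov)))

def greedy_v_optimal(cov, budget):
    """Greedily pick nodes that most reduce the trace of the covariance."""
    cov = [row[:] for row in cov]
    n = len(cov)
    chosen, traces = [], [trace(cov)]
    for _ in range(budget):
        def reduction(i):
            # Trace reduction from a noise-free observation of node i.
            return sum(cov[j][i] ** 2 for j in range(n)) / cov[i][i]
        best = max((i for i in range(n) if i not in chosen), key=reduction)
        column = [cov[j][best] for j in range(n)]
        pivot = cov[best][best]
        for r in range(n):
            for c in range(n):
                cov[r][c] -= column[r] * column[c] / pivot
        chosen.append(best)
        traces.append(trace(cov))
    return chosen, traces

prior_cov = invert(path_graph_precision(5, delta=0.1))
chosen, traces = greedy_v_optimal(prior_cov, budget=2)
print("queried nodes:", chosen)
print("trace after each query:", traces)
```

On this toy graph the first query is the middle node, and the trace shrinks after every query. The second pick is a tie between symmetric nodes, so which one is reported depends on floating-point rounding.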
