Source author record

David Tolpin

David Tolpin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Machine Learning Programming Languages

Catalog footprint

What is connected

20works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

How To Train Your Program: a Probabilistic Programming Pattern for Bayesian Learning From Data

We present a Bayesian approach to machine learning with probabilistic programs. In our approach, training on available data is implemented as inference on a hierarchical model. The posterior distribution of model parameters is then used to \textit{stochastically condition} a complementary model, such that inference on new data yields the same posterior distribution of latent parameters corresponding to the new data as inference on a hierachical model on the combination of both previously available and new data, at a lower computation cost. We frame the approach as a design pattern of probabilistic programming referred to herein as `stump and fungus', and evaluate realization of the pattern on case studies.

preprint2022arXiv

Monitoring Time Series With Missing Values: a Deep Probabilistic Approach

Systems are commonly monitored for health and security through collection and streaming of multivariate time series. Advances in time series forecasting due to adoption of multilayer recurrent neural network architectures make it possible to forecast in high-dimensional time series, and identify and classify novelties early, based on subtle changes in the trends. However, mainstream approaches to multi-variate time series predictions do not handle well cases when the ongoing forecast must include uncertainty, nor they are robust to missing data. We introduce a new architecture for time series monitoring based on combination of state-of-the-art methods of forecasting in high-dimensional time series with full probabilistic handling of uncertainty. We demonstrate advantage of the architecture for time series forecasting and novelty detection, in particular with partially missing data, and empirically evaluate and compare the architecture to state-of-the-art approaches on a real-world data set.

preprint2021arXiv

Probabilistic Programs with Stochastic Conditioning

We tackle the problem of conditioning probabilistic programs on distributions of observable variables. Probabilistic programs are usually conditioned on samples from the joint data distribution, which we refer to as deterministic conditioning. However, in many real-life scenarios, the observations are given as marginal distributions, summary statistics, or samplers. Conventional probabilistic programming systems lack adequate means for modeling and inference in such scenarios. We propose a generalization of deterministic conditioning to stochastic conditioning, that is, conditioning on the marginal distribution of a variable taking a particular form. To this end, we first define the formal notion of stochastic conditioning and discuss its key properties. We then show how to perform inference in the presence of stochastic conditioning. We demonstrate potential usage of stochastic conditioning on several case studies which involve various kinds of stochastic conditioning and are difficult to solve otherwise. Although we present stochastic conditioning in the context of probabilistic programming, our formalization is general and applicable to other settings.

preprint2020arXiv

Stochastic Probabilistic Programs

We introduce the notion of a stochastic probabilistic program and present a reference implementation of a probabilistic programming facility supporting specification of stochastic probabilistic programs and inference in them. Stochastic probabilistic programs allow straightforward specification and efficient inference in models with nuisance parameters, noise, and nondeterminism. We give several examples of stochastic probabilistic programs, and compare the programs with corresponding deterministic probabilistic programs in terms of model specification and inference. We conclude with discussion of open research topics and related work.

preprint2020arXiv

Stochastically Differentiable Probabilistic Programs

Probabilistic programs with mixed support (both continuous and discrete latent random variables) commonly appear in many probabilistic programming systems (PPSs). However, the existence of the discrete random variables prohibits many basic gradient-based inference engines, which makes the inference procedure on such models particularly challenging. Existing PPSs either require the user to manually marginalize out the discrete variables or to perform a composing inference by running inference separately on discrete and continuous variables. The former is infeasible in most cases whereas the latter has some fundamental shortcomings. We present a novel approach to run inference efficiently and robustly in such programs using stochastic gradient Markov Chain Monte Carlo family of algorithms. We compare our stochastic gradient-based inference algorithm against conventional baselines in several important cases of probabilistic programs with mixed support, and demonstrate that it outperforms existing composing inference baselines and works almost as well as inference in marginalized versions of the programs, but with less programming effort and at a lower computation cost.

preprint2016arXiv

Black-Box Policy Search with Probabilistic Programs

In this work, we explore how probabilistic programs can be used to represent policies in sequential decision problems. In this formulation, a probabilistic program is a black-box stochastic simulator for both the problem domain and the agent. We relate classic policy gradient techniques to recently introduced black-box variational methods which generalize to probabilistic program inference. We present case studies in the Canadian traveler problem, Rock Sample, and a benchmark for optimal diagnosis inspired by Guess Who. Each study illustrates how programs can efficiently represent policies using moderate numbers of parameters.

preprint2016arXiv

Design and Implementation of Probabilistic Programming Language Anglican

Anglican is a probabilistic programming system designed to interoperate with Clojure and other JVM languages. We introduce the programming language Anglican, outline our design choices, and discuss in depth the implementation of the Anglican language and runtime, including macro-based compilation, extended CPS-based evaluation model, and functional representations for probabilistic paradigms, such as a distribution, a random process, and an inference algorithm. We show that a probabilistic functional language can be implemented efficiently and integrated tightly with a conventional functional language with only moderate computational overhead. We also demonstrate how advanced probabilistic modeling concepts are mapped naturally to the functional foundation.

preprint2015arXiv

Maximum a Posteriori Estimation by Search in Probabilistic Programs

We introduce an approximate search algorithm for fast maximum a posteriori probability estimation in probabilistic programs, which we call Bayesian ascent Monte Carlo (BaMC). Probabilistic programs represent probabilistic models with varying number of mutually dependent finite, countable, and continuous random variables. BaMC is an anytime MAP search algorithm applicable to any combination of random variables and dependencies. We compare BaMC to other MAP estimation algorithms and show that BaMC is faster and more robust on a range of probabilistic models.

preprint2015arXiv

Output-Sensitive Adaptive Metropolis-Hastings for Probabilistic Programs

We introduce an adaptive output-sensitive Metropolis-Hastings algorithm for probabilistic models expressed as programs, Adaptive Lightweight Metropolis-Hastings (AdLMH). The algorithm extends Lightweight Metropolis-Hastings (LMH) by adjusting the probabilities of proposing random variables for modification to improve convergence of the program output. We show that AdLMH converges to the correct equilibrium distribution and compare convergence of AdLMH to that of LMH on several test problems to highlight different aspects of the adaptation scheme. We observe consistent improvement in convergence on the test problems.

preprint2015arXiv

Path Finding under Uncertainty through Probabilistic Inference

We introduce a new approach to solving path-finding problems under uncertainty by representing them as probabilistic models and applying domain-independent inference algorithms to the models. This approach separates problem representation from the inference algorithm and provides a framework for efficient learning of path-finding policies. We evaluate the new approach on the Canadian Traveler Problem, which we formulate as a probabilistic model, and show how probabilistic inference allows high performance stochastic policies to be obtained for this problem.

preprint2014arXiv

Justifying and Improving Meta-Agent Conflict-Based Search

The Meta-Agent Conflict-Based Search~(MA-CBS) is a recently proposed algorithm for the multi-agent path finding problem. The algorithm is an extension of Conflict-Based Search~(CBS), which automatically merges conflicting agents into meta-agents if the number of conflicts exceeds a certain threshold. However, the decision to merge agents is made according to an empirically chosen fixed threshold on the number of conflicts. The best threshold depends both on the domain and on the number of agents, and the nature of the dependence is not clearly understood. We suggest a justification for the use of a fixed threshold on the number of conflicts based on the analysis of a model problem. Following the suggested justification, we introduce new decision policies for the MA-CBS algorithm, which considerably improve the algorithm's performance. The improved variants of the algorithm are evaluated on several sets of problems, chosen to underline different aspects of the algorithms.

preprint2014arXiv

Rational Deployment of Multiple Heuristics in IDA*

Recent advances in metareasoning for search has shown its usefulness in improving numerous search algorithms. This paper applies rational metareasoning to IDA* when several admissible heuristics are available. The obvious basic approach of taking the maximum of the heuristics is improved upon by lazy evaluation of the heuristics, resulting in a variant known as Lazy IDA*. We introduce a rational version of lazy IDA* that decides whether to compute the more expensive heuristics or to bypass it, based on a myopic expected regret estimate. Empirical evaluation in several domains supports the theoretical results, and shows that rational lazy IDA* is a state-of-the-art heuristic combination method.

preprint2014arXiv

Selecting Computations: Theory and Applications

Sequential decision problems are often approximately solvable by simulating possible future action sequences. Metalevel decision procedures have been developed for selecting which action sequences to simulate, based on estimating the expected improvement in decision quality that would result from any particular simulation; an example is the recent work on using bandit algorithms to control Monte Carlo tree search in the game of Go. In this paper we develop a theoretical basis for metalevel decisions in the statistical framework of Bayesian selection problems, arguing (as others have done) that this is more appropriate than the bandit framework. We derive a number of basic results applicable to Monte Carlo selection problems, including the first finite sampling bounds for optimal policies in certain cases; we also provide a simple counterexample to the intuitive conjecture that an optimal policy will necessarily reach a decision in all cases. We then derive heuristic approximations in both Bayesian and distribution-free settings and demonstrate their superiority to bandit-based heuristics in one-shot decision problems and in Go.

preprint2013arXiv

Towards Rational Deployment of Multiple Heuristics in A*

The obvious way to use several admissible heuristics in A* is to take their maximum. In this paper we aim to reduce the time spent on computing heuristics. We discuss Lazy A*, a variant of A* where heuristics are evaluated lazily: only when they are essential to a decision to be made in the A* search process. We present a new rational meta-reasoning based scheme, rational lazy A*, which decides whether to compute the more expensive heuristics at all, based on a myopic value of information estimate. Both methods are examined theoretically. Empirical evaluation on several domains supports the theoretical results, and shows that lazy A* and rational lazy A* are state-of-the-art heuristic combination methods.

preprint2012arXiv

Doing Better Than UCT: Rational Monte Carlo Sampling in Trees

UCT, a state-of-the art algorithm for Monte Carlo tree sampling (MCTS), is based on UCB, a sampling policy for the Multi-armed Bandit Problem (MAB) that minimizes the accumulated regret. However, MCTS differs from MAB in that only the final choice, rather than all arm pulls, brings a reward, that is, the simple regret, as opposite to the cumulative regret, must be minimized. This ongoing work aims at applying meta-reasoning techniques to MCTS, which is non-trivial. We begin by introducing policies for multi-armed bandits with lower simple regret than UCB, and an algorithm for MCTS which combines cumulative and simple regret minimization and outperforms UCT. We also develop a sampling scheme loosely based on a myopic version of perfect value of information. Finite-time and asymptotic analysis of the policies is provided, and the algorithms are compared empirically.

preprint2012arXiv

MCTS Based on Simple Regret

UCT, a state-of-the art algorithm for Monte Carlo tree search (MCTS) in games and Markov decision processes, is based on UCB, a sampling policy for the Multi-armed Bandit problem (MAB) that minimizes the cumulative regret. However, search differs from MAB in that in MCTS it is usually only the final "arm pull" (the actual move selection) that collects a reward, rather than all "arm pulls". Therefore, it makes more sense to minimize the simple regret, as opposed to the cumulative regret. We begin by introducing policies for multi-armed bandits with lower finite-time and asymptotic simple regret than UCB, using it to develop a two-stage scheme (SR+CR) for MCTS which outperforms UCT empirically. Optimizing the sampling process is itself a metareasoning problem, a solution of which can use value of information (VOI) techniques. Although the theory of VOI for search exists, applying it to MCTS is non-trivial, as typical myopic assumptions fail. Lacking a complete working VOI theory for MCTS, we nevertheless propose a sampling scheme that is "aware" of VOI, achieving an algorithm that in empirical evaluation outperforms both UCT and the other proposed algorithms.

preprint2012arXiv

Selecting Computations: Theory and Applications

Sequential decision problems are often approximately solvable by simulating possible future action sequences. {\em Metalevel} decision procedures have been developed for selecting {\em which} action sequences to simulate, based on estimating the expected improvement in decision quality that would result from any particular simulation; an example is the recent work on using bandit algorithms to control Monte Carlo tree search in the game of Go. In this paper we develop a theoretical basis for metalevel decisions in the statistical framework of Bayesian {\em selection problems}, arguing (as others have done) that this is more appropriate than the bandit framework. We derive a number of basic results applicable to Monte Carlo selection problems, including the first finite sampling bounds for optimal policies in certain cases; we also provide a simple counterexample to the intuitive conjecture that an optimal policy will necessarily reach a decision in all cases. We then derive heuristic approximations in both Bayesian and distribution-free settings and demonstrate their superiority to bandit-based heuristics in one-shot decision problems and in Go.

preprint2012arXiv

VOI-aware MCTS

UCT, a state-of-the art algorithm for Monte Carlo tree search (MCTS) in games and Markov decision processes, is based on UCB1, a sampling policy for the Multi-armed Bandit problem (MAB) that minimizes the cumulative regret. However, search differs from MAB in that in MCTS it is usually only the final "arm pull" (the actual move selection) that collects a reward, rather than all "arm pulls". In this paper, an MCTS sampling policy based on Value of Information (VOI) estimates of rollouts is suggested. Empirical evaluation of the policy and comparison to UCB1 and UCT is performed on random MAB instances as well as on Computer Go.

preprint2011arXiv

Rational Deployment of CSP Heuristics

Heuristics are crucial tools in decreasing search effort in varied fields of AI. In order to be effective, a heuristic must be efficient to compute, as well as provide useful information to the search algorithm. However, some well-known heuristics which do well in reducing backtracking are so heavy that the gain of deploying them in a search algorithm might be outweighed by their overhead. We propose a rational metareasoning approach to decide when to deploy heuristics, using CSP backtracking search as a case study. In particular, a value of information approach is taken to adaptive deployment of solution-count estimation heuristics for value ordering. Empirical results show that indeed the proposed mechanism successfully balances the tradeoff between decreasing backtracking and heuristic computational overhead, resulting in a significant overall search time reduction.

preprint2010arXiv

Rational Value of Information Estimation for Measurement Selection

Computing value of information (VOI) is a crucial task in various aspects of decision-making under uncertainty, such as in meta-reasoning for search; in selecting measurements to make, prior to choosing a course of action; and in managing the exploration vs. exploitation tradeoff. Since such applications typically require numerous VOI computations during a single run, it is essential that VOI be computed efficiently. We examine the issue of anytime estimation of VOI, as frequently it suffices to get a crude estimate of the VOI, thus saving considerable computational resources. As a case study, we examine VOI estimation in the measurement selection problem. Empirical evaluation of the proposed scheme in this domain shows that computational resources can indeed be significantly reduced, at little cost in expected rewards achieved in the overall decision problem.

David Tolpin

What is connected

Connect this record

See the researcher in context

Building this map preview

20 published item(s)

How To Train Your Program: a Probabilistic Programming Pattern for Bayesian Learning From Data

Monitoring Time Series With Missing Values: a Deep Probabilistic Approach

Probabilistic Programs with Stochastic Conditioning

Stochastic Probabilistic Programs

Stochastically Differentiable Probabilistic Programs

Black-Box Policy Search with Probabilistic Programs

Design and Implementation of Probabilistic Programming Language Anglican

Maximum a Posteriori Estimation by Search in Probabilistic Programs

Output-Sensitive Adaptive Metropolis-Hastings for Probabilistic Programs

Path Finding under Uncertainty through Probabilistic Inference

Justifying and Improving Meta-Agent Conflict-Based Search

Rational Deployment of Multiple Heuristics in IDA*

Selecting Computations: Theory and Applications

Towards Rational Deployment of Multiple Heuristics in A*

Doing Better Than UCT: Rational Monte Carlo Sampling in Trees

MCTS Based on Simple Regret

Selecting Computations: Theory and Applications

VOI-aware MCTS

Rational Deployment of CSP Heuristics

Rational Value of Information Estimation for Measurement Selection