Source author record

Ronen I. Brafman

Ronen I. Brafman appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Artificial Intelligence, Information Retrieval, Machine Learning, Robotics, Software Engineering

Catalog footprint

What is connected

13 works

5 topics

4 close collaborators


preprint 2022 arXiv

Towards Plug'n Play Task-Level Autonomy for Robotics Using POMDPs and Generative Models

To enable robots to achieve high-level objectives, engineers typically write scripts that apply existing specialized skills, such as navigation, object detection and manipulation, to achieve these goals. Writing good scripts is challenging since they must intelligently balance the inherent stochasticity of a physical robot's actions and sensors, and the limited information it has. In principle, AI planning can be used to address this challenge and generate good behavior policies automatically. But this requires passing three hurdles. First, the AI must understand each skill's impact on the world. Second, we must bridge the gap between the more abstract level at which we understand what a skill does and the low-level state variables used within its code. Third, much integration effort is required to tie together all components. We describe an approach for integrating robot skills into a working autonomous robot controller that schedules its skills to achieve a specified task and carries four key advantages. 1) Our Generative Skill Documentation Language (GSDL) makes code documentation simpler, more compact, and more expressive using ideas from probabilistic programming languages. 2) An expressive abstraction mapping (AM) bridges the gap between low-level robot code and the abstract AI planning model. 3) Any properly documented skill can be used by the controller without any additional programming effort, providing a Plug'n Play experience. 4) A POMDP solver schedules skill execution while properly balancing partial observability, stochastic behavior, and noisy sensing.

preprint 2020 arXiv

Learning and Solving Regular Decision Processes

Regular Decision Processes (RDPs) are a recently introduced model that extends MDPs with non-Markovian dynamics and rewards. The non-Markovian behavior is restricted to depend on regular properties of the history. These can be specified using regular expressions or formulas in linear dynamic logic over finite traces. Fully specified RDPs can be solved by compiling them into an appropriate MDP. Learning RDPs from data is a challenging problem that has yet to be addressed, and it is the focus of this paper. Our approach rests on a new representation for RDPs using Mealy machines that emit a distribution and an expected reward for each state-action pair. Building on this representation, we combine automata learning techniques with history clustering to learn such a Mealy machine and solve it by adapting MCTS to it. We empirically evaluate this approach, demonstrating its feasibility.
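As a rough illustration of a reward that depends on a regular property of the history, the sketch below tracks that property with a small finite automaton, so the reward becomes a function of the automaton state and the current observation. The observation names, the two automaton states and the helper functions are all hypothetical, and this is not the Mealy-machine learning procedure or the MCTS adaptation described in the abstract.

```python
# Hypothetical two-state automaton over observations: has "key" appeared yet?
def step(q, obs):
    return "has_key" if (q == "has_key" or obs == "key") else "no_key"

def reward(q, obs):
    # The reward depends on the history only through the automaton state q.
    return 1.0 if (obs == "door" and q == "has_key") else 0.0

def total_reward(trace):
    """Sum rewards along a trace while updating the automaton state."""
    q, total = "no_key", 0.0
    for obs in trace:
        total += reward(q, obs)
        q = step(q, obs)
    return total

# Same observations, different order: only the first trace earns the reward.
with_key_first = total_reward(["key", "hall", "door"])
door_first = total_reward(["hall", "door", "key"])
```

Pairing the automaton state with the observable state is what makes the process Markovian again in this toy setting.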

preprint 2015 arXiv

An MDP-based Recommender System

Typical recommender systems adopt a static view of the recommendation process and treat it as a prediction problem. We argue that it is more appropriate to view the problem of generating recommendations as a sequential decision problem and, consequently, that Markov decision processes (MDPs) provide a more appropriate model for recommender systems. MDPs introduce two benefits: they take into account the long-term effects of each recommendation, and they take into account the expected value of each recommendation. To succeed in practice, an MDP-based recommender system must employ a strong initial model, and the bulk of this paper is concerned with the generation of such a model. In particular, we suggest the use of an n-gram predictive model for generating the initial MDP. Our n-gram model induces a Markov-chain model of user behavior whose predictive accuracy is greater than that of existing predictive models. We describe our predictive model in detail and evaluate its performance on real data. In addition, we show how the model can be used in an MDP-based recommender system.
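To make the idea of an n-gram model inducing a Markov chain concrete, here is a minimal sketch that treats the last n-1 items a user selected as the state and estimates next-item probabilities by plain counting. The function name `fit_ngram_chain` and the toy histories are assumptions made for illustration; the paper's actual predictive model is not reproduced here, and this version applies no smoothing.

```python
from collections import Counter, defaultdict

def fit_ngram_chain(sequences, n=2):
    """Count transitions from the last n-1 items (the state) to the next item."""
    k = n - 1
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(k, len(seq)):
            state = tuple(seq[i - k:i])
            counts[state][seq[i]] += 1
    # Normalize counts into maximum-likelihood transition probabilities.
    return {
        state: {item: c / sum(nxt.values()) for item, c in nxt.items()}
        for state, nxt in counts.items()
    }

# Hypothetical item histories, made up for illustration.
histories = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["b", "c", "a"],
    ["a", "b", "c"],
]
chain = fit_ngram_chain(histories, n=3)
```

Here `chain[("a", "b")]` gives the estimated distribution over the next item after a user has selected "a" and then "b". Such a chain could serve as the transition skeleton of an initial MDP, with actions and rewards added separately.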

preprint 2014 arXiv

A Heuristic Search Approach to Planning with Continuous Resources in Stochastic Domains

We consider the problem of optimal planning in stochastic domains with resource constraints, where the resources are continuous and the choice of action at each step depends on resource availability. We introduce the HAO* algorithm, a generalization of the AO* algorithm that performs search in a hybrid state space that is modeled using both discrete and continuous state variables, where the continuous variables represent monotonic resources. Like other heuristic search algorithms, HAO* leverages knowledge of the start state and an admissible heuristic to focus computational effort on those parts of the state space that could be reached from the start state by following an optimal policy. We show that this approach is especially effective when resource constraints limit how much of the state space is reachable. Experimental results demonstrate its effectiveness in the domain that motivates our research: automated planning for planetary exploration rovers.

preprint 2014 arXiv

Generic Preferences over Subsets of Structured Objects

Various tasks in decision making and decision support systems require selecting a preferred subset of a given set of items. Here we focus on problems where the individual items are described using a set of characterizing attributes, and a generic preference specification is required, that is, a specification that can work with an arbitrary set of items. For example, preferences over the content of an online newspaper should have this form: At each viewing, the newspaper contains a subset of the set of articles currently available. Our preference specification over this subset should be provided offline, but we should be able to use it to select a subset of any currently available set of articles, e.g., based on their tags. We present a general approach for lifting formalisms for specifying preferences over objects with multiple attributes into ones that specify preferences over subsets of such objects. We also show how we can compute an optimal subset given such a specification in a relatively efficient manner. We provide an empirical evaluation of the approach as well as some worst-case complexity results.

preprint 2014 arXiv

Replanning in Domains with Partial Information and Sensing Actions

Replanning via determinization is a recent, popular approach for online planning in MDPs. In this paper we adapt this idea to classical, non-stochastic domains with partial information and sensing actions, presenting a new planner: SDR (Sample, Determinize, Replan). At each step we generate a solution plan to a classical planning problem induced by the original problem. We execute this plan as long as it is safe to do so. When this is no longer the case, we replan. The classical planning problem we generate is based on the translation-based approach for conformant planning introduced by Palacios and Geffner. The state of the classical planning problem generated in this approach captures the belief state of the agent in the original problem. Unfortunately, when this method is applied to planning problems with sensing, it yields a non-deterministic planning problem that is typically very large. Our main contribution is the introduction of state sampling techniques for overcoming these two problems. In addition, we introduce a novel, lazy, regression-based method for querying the agent's belief state during run-time. We provide a comprehensive experimental evaluation of the planner, showing that it scales better than the state-of-the-art CLG planner on existing benchmark problems, but also highlighting its weaknesses with new domains. We also discuss its theoretical guarantees.

preprint 2013 arXiv

Reasoning With Conditional Ceteris Paribus Preference Statem

In many domains it is desirable to assess the preferences of users in a qualitative rather than quantitative way. Such representations of qualitative preference orderings form an important component of automated decision tools. We propose a graphical representation of preferences that reflects conditional dependence and independence of preference statements under a ceteris paribus (all else being equal) interpretation. Such a representation is often compact and arguably natural. We describe several search algorithms for dominance testing based on this representation; these algorithms are quite effective, especially in specific network topologies, such as chain- and tree-structured networks, as well as polytrees.
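One common way to phrase dominance testing in such a network is as a search for a sequence of improving single-variable flips leading from the worse outcome to the better one. The sketch below does this by brute-force breadth-first search on a tiny network. The two-variable network, its preference tables and the function names are hypothetical, and the search shown is a plain baseline, not one of the algorithms the abstract refers to.

```python
from collections import deque

# Hypothetical CP-net over two binary variables; B's preference depends on A.
PARENTS = {"A": (), "B": ("A",)}
# Each table maps parent values to the variable's values, most preferred first.
CPT = {
    "A": {(): ["a1", "a2"]},
    "B": {("a1",): ["b1", "b2"], ("a2",): ["b2", "b1"]},
}

def improving_flips(outcome):
    """Yield outcomes that change one variable to a value preferred given its parents."""
    for var, parents in PARENTS.items():
        order = CPT[var][tuple(outcome[p] for p in parents)]
        rank = order.index(outcome[var])
        for better in order[:rank]:
            yield {**outcome, var: better}

def dominates(better, worse):
    """True if some improving flipping sequence leads from `worse` to `better`."""
    start, goal = tuple(sorted(worse.items())), tuple(sorted(better.items()))
    if start == goal:
        return False
    seen, queue = {start}, deque([start])
    while queue:
        current = queue.popleft()
        for nxt in improving_flips(dict(current)):
            key = tuple(sorted(nxt.items()))
            if key == goal:
                return True
            if key not in seen:
                seen.add(key)
                queue.append(key)
    return False

best = {"A": "a1", "B": "b1"}
two_flips_away = dominates(best, {"A": "a2", "B": "b2"})
reverse = dominates({"A": "a2", "B": "b2"}, best)
```

The search space grows exponentially with the number of variables, which is why restricted topologies such as chains, trees and polytrees matter for efficiency.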

preprint 2013 arXiv

Structured Reachability Analysis for Markov Decision Processes

Recent research in decision theoretic planning has focussed on making the solution of Markov decision processes (MDPs) more feasible. We develop a family of algorithms for structured reachability analysis of MDPs that are suitable when an initial state (or set of states) is known. Using compact, structured representations of MDPs (e.g., Bayesian networks), our methods, which vary in the tradeoff between complexity and accuracy, produce structured descriptions of (estimated) reachable states that can be used to eliminate variables or variable values from the problem description, reducing the size of the MDP and making it easier to solve. One contribution of our work is the extension of ideas from GRAPHPLAN to deal with the distributed nature of action representations typically embodied within Bayes nets and the problem of correlated action effects. We also demonstrate that our algorithm can be made more complete by using k-ary constraints instead of binary constraints. Another contribution is the illustration of how the compact representation of reachability constraints can be exploited by several existing (exact and approximate) abstraction algorithms for MDPs.

preprint 2013 arXiv

UCP-Networks: A Directed Graphical Representation of Conditional Utilities

We propose a new directed graphical representation of utility functions, called UCP-networks, that combines aspects of two existing graphical models: generalized additive models and CP-networks. The network decomposes a utility function into a number of additive factors, with the directionality of the arcs reflecting conditional dependence of preference statements - in the underlying (qualitative) preference ordering - under a ceteris paribus (all else being equal) interpretation. This representation is arguably natural in many settings. Furthermore, the strong CP-semantics ensures that computation of optimization and dominance queries is very efficient. We also demonstrate the value of this representation in decision making. Finally, we describe an interactive elicitation procedure that takes advantage of the linear nature of the constraints on "tradeoff weights" imposed by a UCP-network. This procedure allows the network to be refined until the regret of the decision with minimax regret (with respect to the incompletely specified utility function) falls below a specified threshold (e.g., the cost of further questioning).

preprint 2012 arXiv

Compact Value-Function Representations for Qualitative Preferences

We consider the challenge of preference elicitation in systems that help users discover the most desirable item(s) within a given database. Past work on preference elicitation focused on structured models that provide a factored representation of users' preferences. Such models require less information to construct and support efficient reasoning algorithms. This paper makes two substantial contributions to this area: (1) Strong representation theorems for factored value functions. (2) A methodology that utilizes our representation results to address the problem of optimal item selection.

preprint 2012 arXiv

Exploiting Uniform Assignments in First-Order MPE

The MPE (Most Probable Explanation) query plays an important role in probabilistic inference. MPE solution algorithms for probabilistic relational models essentially adapt existing belief assessment methods, replacing summation with maximization. But the rich structure and symmetries captured by relational models together with the properties of the maximization operator offer an opportunity for additional simplification with potentially significant computational ramifications. Specifically, these models often have groups of variables that define symmetric distributions over some population of formulas. The maximizing choice for different elements of this group is the same. If we can realize this ahead of time, we can significantly reduce the size of the model by eliminating a potentially significant portion of random variables. This paper defines the notion of uniformly assigned and partially uniformly assigned sets of variables, shows how one can recognize these sets efficiently, and how the model can be greatly simplified once we recognize them, with little computational effort. We demonstrate the effectiveness of these ideas empirically on a number of models.

preprint 2012 arXiv

Extended Lifted Inference with Joint Formulas

The First-Order Variable Elimination (FOVE) algorithm allows exact inference to be applied directly to probabilistic relational models, and has proven to be vastly superior to the application of standard inference methods on a grounded propositional model. Still, FOVE operators can be applied only under restricted conditions, often forcing one to resort to propositional inference. This paper aims to extend the applicability of FOVE by providing two new model conversion operators: the first and primary one is joint formula conversion, and the second is just-different counting conversion. These new operations allow efficient inference methods to be applied directly on relational models, where no existing efficient method could be applied hitherto. In addition, aided by these capabilities, we show how to adapt FOVE to provide exact solutions to Maximum Expected Utility (MEU) queries over relational models for decision under uncertainty. Experimental evaluations show our algorithms to provide significant speedup over the alternatives.

preprint 2012 arXiv

Introducing Variable Importance Tradeoffs into CP-Nets

The ability to make decisions and to assess potential courses of action is a cornerstone of many AI applications, and usually this requires explicit information about the decision-maker's preferences. In many applications, preference elicitation is a serious bottleneck. The user either does not have the time, the knowledge, or the expert support required to specify complex multi-attribute utility functions. In such cases, a method that is based on intuitive, yet expressive, preference statements is required. In this paper we suggest the use of TCP-nets, an enhancement of CP-nets, as a tool for representing and reasoning about qualitative preference statements. We present and motivate this framework, define its semantics, and show how it can be used to perform constrained optimization.

