Source author record

Marc Toussaint

Marc Toussaint appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Robotics, Machine Learning, Computer Vision, Artificial Intelligence, gr-qc, Computation, Computational Geometry, hep-th, Human-Computer Interaction, math.NA, math.OC, Numerical Analysis

Catalog footprint

What is connected

32 works

12 topics

4 close collaborators

preprint · 2026 · arXiv

Manifold Sampling via Entropy Maximization

Sampling from constrained distributions has a wide range of applications, including in Bayesian optimization and robotics. Prior work establishes convergence and feasibility guarantees for constrained sampling, but assumes that the feasible set is connected. However, in practice, the feasible set often decomposes into multiple disconnected components, which makes efficient sampling under constraints challenging. In this paper, we propose MAnifold Sampling via Entropy Maximization (MASEM) for sampling on a manifold with an unknown number of disconnected components, implicitly defined by smooth equality and inequality constraints. The presented method uses a resampling scheme to maximize the entropy of the empirical distribution based on k-nearest neighbor density estimation. We show that, in the mean field, MASEM decreases the KL-divergence between the empirical distribution and the maximum-entropy target exponentially in the number of resampling steps. We instantiate MASEM with multiple local samplers and demonstrate its versatility and efficiency on synthetic and robotics-based benchmarks. MASEM enables fast and scalable mixing across a range of constrained sampling problems, improving over alternatives by an order of magnitude in Sinkhorn distance with competitive runtime.
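
To make the k-nearest neighbor density estimate mentioned in this abstract concrete, here is a minimal sketch of the textbook estimator p(x_i) ≈ k / (n · V_d · r_k(x_i)^d). It is not the MASEM implementation: the function names `knn_density` and `resampling_weights` are hypothetical, and weighting samples inversely to their estimated density is only an assumed illustration of how resampling could push an empirical distribution toward higher entropy.

```python
import math
import numpy as np

def knn_density(points: np.ndarray, k: int = 5) -> np.ndarray:
    """Textbook k-NN density estimate at each sample point.

    Assumes the points are distinct, so every k-th neighbor distance is positive.
    """
    n, d = points.shape
    # Pairwise Euclidean distances between all samples.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the distance to the point itself,
    # so column k is the distance to the k-th nearest neighbor.
    r_k = np.sort(dist, axis=1)[:, k]
    # Volume of the d-dimensional unit ball.
    unit_ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return k / (n * unit_ball_volume * r_k ** d)

def resampling_weights(points: np.ndarray, k: int = 5) -> np.ndarray:
    """Hypothetical heuristic: weight samples inversely to estimated density."""
    w = 1.0 / knn_density(points, k)
    return w / w.sum()
```

With such weights, samples in sparsely covered regions are drawn more often than samples in crowded regions, which flattens the empirical distribution.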

preprint · 2026 · arXiv

Segment Anything with Robust Uncertainty-Accuracy Correlation

Despite strong zero-shot performance, SAM is unreliable under domain shift due to Mask-level Confidence Confusion (MCC), where a single IoU-based mask score fails to reflect pixel-wise reliability near boundaries. Motivated by the contrast between texture-biased shortcuts in neural networks and shape-centric processing in human vision, we model out-of-domain variation as appearance shifts and non-rigid deformations that jointly stress calibration. We propose Segment Anything with Robust Uncertainty-Accuracy Correlation (RUAC) for robust pixel-wise uncertainty estimation under appearance and deformation shifts. RUAC adds a lightweight uncertainty head, trains it with a collaborative style-deformation attack that jointly perturbs texture and geometry, and applies Uncertainty-Accuracy Alignment to ensure uncertainty consistently highlights erroneous pixels even under adversarial perturbations. Across 23 zero-shot domains, RUAC improves segmentation quality and yields more faithful uncertainty with stronger uncertainty-accuracy correlation. Project page: https://github.com/HongyouZhou/ruac.git.

preprint · 2022 · arXiv

db-A*: Discontinuity-bounded Search for Kinodynamic Mobile Robot Motion Planning

We consider time-optimal motion planning for dynamical systems that are translation-invariant, a property that holds for many mobile robots, such as differential-drives, cars, airplanes, and multirotors. Our key insight is that we can extend graph-search algorithms to the continuous case when used symbiotically with optimization. For the graph search, we introduce discontinuity-bounded A* (db-A*), a generalization of the A* algorithm that uses concepts and data structures from sampling-based planners. Db-A* reuses short trajectories, so-called motion primitives, as edges and allows a maximum user-specified discontinuity at the vertices. These trajectories are locally repaired with trajectory optimization, which also provides new improved motion primitives. Our novel kinodynamic motion planner, kMP-db-A*, has almost surely asymptotic optimal behavior and computes near-optimal solutions quickly. For our empirical validation, we provide the first benchmark that compares search-, sampling-, and optimization-based time-optimal motion planning on multiple dynamical systems in different settings. Compared to the baselines, kMP-db-A* consistently solves more problem instances, finds lower-cost initial solutions, and converges more quickly.

preprint · 2022 · arXiv

Deep Visual Constraints: Neural Implicit Models for Manipulation Planning from Visual Input

Manipulation planning is the problem of finding a sequence of robot configurations that involves interactions with objects in the scene, e.g., grasping and placing an object, or more general tool-use. To achieve such interactions, traditional approaches require hand-engineering of object representations and interaction constraints, which easily becomes tedious when complex objects/interactions are considered. Inspired by recent advances in 3D modeling, e.g. NeRF, we propose a method to represent objects as continuous functions upon which constraint features are defined and jointly trained. In particular, the proposed pixel-aligned representation is directly inferred from images with known camera geometry and naturally acts as a perception component in the whole manipulation pipeline, thereby enabling long-horizon planning only from visual input. Project page: https://sites.google.com/view/deep-visual-constraints

preprint · 2022 · arXiv

FC$^3$: Feasibility-Based Control Chain Coordination

Hierarchical coordination of controllers often uses symbolic state representations that fully abstract their underlying low-level controllers, treating them as "black boxes" to the symbolic action abstraction. This paper proposes a framework to realize robust behavior, which we call Feasibility-based Control Chain Coordination (FC$^3$). Our controllers expose the geometric features and constraints they operate on. Based on this, FC$^3$ can reason over the controllers' feasibility and their sequence feasibility. For a given task, FC$^3$ first automatically constructs a library of potential controller chains using a symbolic action tree, which is then used to coordinate controllers in a chain, evaluate task feasibility, and switch between controller chains if necessary. In several real-world experiments we demonstrate FC$^3$'s robustness and awareness of the task's feasibility through its own actions and gradual responses to different interferences.

preprint · 2022 · arXiv

Learning Multi-Object Dynamics with Compositional Neural Radiance Fields

We present a method to learn compositional multi-object dynamics models from image observations based on implicit object encoders, Neural Radiance Fields (NeRFs), and graph neural networks. NeRFs have become a popular choice for representing scenes due to their strong 3D prior. However, most NeRF approaches are trained on a single scene, representing the whole scene with a global model, making generalization to novel scenes, containing different numbers of objects, challenging. Instead, we present a compositional, object-centric auto-encoder framework that maps multiple views of the scene to a set of latent vectors representing each object separately. The latent vectors parameterize individual NeRFs from which the scene can be reconstructed. Based on those latent vectors, we train a graph neural network dynamics model in the latent space to achieve compositionality for dynamics prediction. A key feature of our approach is that the latent vectors are forced to encode 3D information through the NeRF decoder, which enables us to incorporate structural priors in learning the dynamics models, making long-term predictions more stable compared to several baselines. Simulated and real world experiments show that our method can model and learn the dynamics of compositional scenes including rigid and deformable objects. Video: https://dannydriess.github.io/compnerfdyn/

preprint · 2022 · arXiv

MotionBenchMaker: A Tool to Generate and Benchmark Motion Planning Datasets

Recently, there has been a wealth of development in motion planning for robotic manipulation: new motion planners are continuously proposed, each with its own unique strengths and weaknesses. However, evaluating new planners is challenging and researchers often create their own ad-hoc problems for benchmarking, which is time-consuming, prone to bias, and does not directly compare against other state-of-the-art planners. We present MotionBenchMaker, an open-source tool to generate benchmarking datasets for realistic robot manipulation problems. MotionBenchMaker is designed to be an extensible, easy-to-use tool that allows users to both generate datasets and benchmark them by comparing motion planning algorithms. Empirically, we show the benefit of using MotionBenchMaker as a tool to procedurally generate datasets which helps in the fair evaluation of planners. We also present a suite of 40 prefabricated datasets, with 5 different commonly used robots in 8 environments, to serve as a common ground to accelerate motion planning research.

preprint · 2022 · arXiv

Path-Tree Optimization in Discrete Partially Observable Environments using Rapidly-Exploring Belief-Space Graphs

Robots often need to solve path planning problems where essential and discrete aspects of the environment are partially observable. This introduces a multi-modality, where the robot must be able to observe and infer the state of its environment. To tackle this problem, we introduce the Path-Tree Optimization (PTO) algorithm which plans a path-tree in belief-space. A path-tree is a tree-like motion with branching points where the robot receives an observation leading to a belief-state update. The robot takes different branches depending on the observation received. The algorithm has three main steps. First, a rapidly-exploring random graph (RRG) on the state space is grown. Second, the RRG is expanded to a belief-space graph by querying the observation model. In a third step, dynamic programming is performed on the belief-space graph to extract a path-tree. The resulting path-tree combines exploration with exploitation, i.e., it balances the need for gaining knowledge about the environment with the need for reaching the goal. We demonstrate the algorithm's capabilities on navigation and mobile manipulation tasks, and show its advantage over a baseline using a task and motion planning approach (TAMP) both in terms of optimality and runtime.

preprint · 2022 · arXiv

Reinforcement Learning with Neural Radiance Fields

It is a long-standing problem to find effective representations for training reinforcement learning (RL) agents. This paper demonstrates that learning state representations with supervision from Neural Radiance Fields (NeRFs) can improve the performance of RL compared to other learned representations or even low-dimensional, hand-engineered state information. Specifically, we propose to train an encoder that maps multiple image observations to a latent space describing the objects in the scene. The decoder built from a latent-conditioned NeRF serves as the supervision signal to learn the latent space. An RL algorithm then operates on the learned latent space as its state representation. We call this NeRF-RL. Our experiments indicate that NeRF as supervision leads to a latent space better suited for the downstream RL tasks involving robotic object manipulations like hanging mugs on hooks, pushing objects, or opening doors. Video: https://dannydriess.github.io/nerf-rl

preprint · 2022 · arXiv

RHH-LGP: Receding Horizon And Heuristics-Based Logic-Geometric Programming For Task And Motion Planning

Sequential decision-making and motion planning for robotic manipulation induce combinatorial complexity. For long-horizon tasks, especially when the environment comprises many objects that can be interacted with, planning efficiency becomes even more important. To plan such long-horizon tasks, we present the RHH-LGP algorithm for combined task and motion planning (TAMP). First, we propose a TAMP approach (based on Logic-Geometric Programming) that effectively uses geometry-based heuristics for solving long-horizon manipulation tasks. The efficiency of this planner is then further improved by a receding horizon formulation, resulting in RHH-LGP. We demonstrate the robustness and effectiveness of our approach on a diverse range of long-horizon tasks that require reasoning about interactions with a large number of objects. Using our framework, we can solve tasks that require multiple robots, including a mobile robot and snake-like walking robots, to form novel heterogeneous kinematic structures autonomously. By combining geometry-based heuristics with iterative planning, our approach brings an order-of-magnitude reduction of planning time in all investigated problems.

preprint · 2022 · arXiv

ST-RRT*: Asymptotically-Optimal Bidirectional Motion Planning through Space-Time

We present a motion planner for planning through space-time with dynamic obstacles, velocity constraints, and unknown arrival time. Our algorithm, Space-Time RRT* (ST-RRT*), is a probabilistically complete, bidirectional motion planning algorithm, which is asymptotically optimal with respect to the shortest arrival time. We experimentally evaluate ST-RRT* in both abstract (2D disk, 8D disk in cluttered spaces, and on a narrow passage problem), and simulated robotic path planning problems (sequential planning of 8DoF mobile robots, and 7DoF robotic arms). The proposed planner outperforms RRT-Connect and RRT* on both initial solution time, and attained final solution cost. The code for ST-RRT* is available in the Open Motion Planning Library (OMPL).

preprint · 2021 · arXiv

Visualization of Nonlinear Programming for Robot Motion Planning

Nonlinear programming targets nonlinear optimization with constraints, which is a generic yet complex methodology involving humans for problem modeling and algorithms for problem solving. We address the particularly hard challenge of supporting domain experts in handling, understanding, and trouble-shooting high-dimensional optimization with a large number of constraints. Leveraging visual analytics, users are supported in exploring the computation process of nonlinear constraint optimization. Our system was designed for robot motion planning problems and developed in tight collaboration with domain experts in nonlinear programming and robotics. We report on the experiences from this design study, illustrate the usefulness for relevant example cases, and discuss the extension to visual analytics for nonlinear programming in general.

preprint · 2020 · arXiv

An Interior Point Method Solving Motion Planning Problems with Narrow Passages

Algorithmic solutions for the motion planning problem have been investigated for five decades. Since the development of A* in 1969 many approaches have been investigated, traditionally classified as either grid decomposition, potential fields or sampling-based. In this work, we focus on using numerical optimization, which is understudied for solving motion planning problems. This lack of interest in favor of sampling-based methods is largely due to the non-convexity introduced by narrow passages. We address this shortcoming by grounding the solution in differential geometry. We demonstrate through a series of experiments on 3-DoF and 6-DoF narrow passage problems how explicitly modeling the underlying Riemannian manifold leads to an efficient interior-point non-linear programming solution.

preprint · 2020 · arXiv

Anticipating Human Intention for Full-Body Motion Prediction in Object Grasping and Placing Tasks

Motion prediction in unstructured environments is a difficult problem and is essential for safe and efficient human-robot space sharing and collaboration. In this work, we focus on manipulation movements in environments such as homes, workplaces or restaurants, where the overall task and environment can be leveraged to produce accurate motion prediction. For these cases we propose an algorithmic framework that accounts explicitly for the environment geometry based on a model of affordances and a model of short-term human dynamics both trained on motion capture data. We propose dedicated function networks for graspability and placebility affordances and we make use of a dedicated RNN for short-term motion prediction. The prediction of grasp and placement probability densities are used by a constraint-based trajectory optimizer to produce a full-body motion prediction over the entire horizon. We show by comparing to ground truth data that we achieve similar performance for full-body motion predictions as using oracle grasp and place locations.

preprint · 2020 · arXiv

Deep Visual Reasoning: Learning to Predict Action Sequences for Task and Motion Planning from an Initial Scene Image

In this paper, we propose a deep convolutional recurrent neural network that predicts action sequences for task and motion planning (TAMP) from an initial scene image. Typical TAMP problems are formalized by combining reasoning on a symbolic, discrete level (e.g. first-order logic) with continuous motion planning such as nonlinear trajectory optimization. Due to the great combinatorial complexity of possible discrete action sequences, a large number of optimization/motion planning problems have to be solved to find a solution, which limits the scalability of these approaches. To circumvent this combinatorial complexity, we develop a neural network which, based on an initial image of the scene, directly predicts promising discrete action sequences such that ideally only one motion planning problem has to be solved to find a solution to the overall TAMP problem. A key aspect is that our method generalizes to scenes with many and varying numbers of objects, despite being trained on only two objects at a time. This is possible by encoding the objects of the scene in images as input to the neural network, instead of a fixed feature vector. Results show runtime improvements of several orders of magnitude. Video: https://youtu.be/i8yyEbbvoEk

preprint · 2020 · arXiv

Describing Physics For Physical Reasoning: Force-based Sequential Manipulation Planning

Physical reasoning is a core aspect of intelligence in animals and humans. A central question is what model should be used as a basis for reasoning. Existing work considered models ranging from intuitive physics and physical simulators to contact dynamics models used in robotic manipulation and locomotion. In this work we propose descriptions of physics which directly allow us to leverage optimization methods for physical reasoning and sequential manipulation planning. The proposed multi-physics formulation enables the solver to mix various levels of abstraction and simplifications for different objects and phases of the solution. As an essential ingredient, we propose a specific parameterization of wrench exchange between object surfaces in a path optimization framework, introducing the point-of-attack as decision variable. We demonstrate the approach on various robot manipulation planning problems, such as grasping a stick in order to push or lift another object to a target, shifting and grasping a book from a shelf, and throwing an object to bounce towards a target.

preprint · 2020 · arXiv

Natural Gradient Shared Control

We propose a formalism for shared control, which is the problem of defining a policy that blends user control and autonomous control. The challenge posed by the shared autonomy system is to maintain user control authority while allowing the robot to support the user. This can be done by enforcing constraints or acting optimally when the intent is clear. Our proposed solution relies on natural gradients emerging from the divergence constraint between the robot and the shared policy. We approximate the Fisher information by sampling a learned robot policy and computing the local gradient to augment the user control when necessary. A user study performed on a manipulation task demonstrates that our approach allows for more efficient task completion while keeping control authority against a number of baseline methods.

preprint · 2020 · arXiv

Prediction of Human Full-Body Movements with Motion Optimization and Recurrent Neural Networks

Human movement prediction is difficult as humans naturally exhibit complex behaviors that can change drastically from one environment to the next. In order to alleviate this issue, we propose a prediction framework that decouples short-term prediction, linked to internal body dynamics, and long-term prediction, linked to the environment and task constraints. In this work we investigate encoding short-term dynamics in a recurrent neural network, while we account for environmental constraints, such as obstacle avoidance, using gradient-based trajectory optimization. Experiments on real motion data demonstrate that our framework improves the prediction with respect to state-of-the-art motion prediction methods, as it accounts for previously unseen environmental structures. Moreover, we demonstrate with an example how this framework can be used to plan robot trajectories that are optimized to coordinate with a human partner.

preprint · 2020 · arXiv

Probabilistic Framework for Constrained Manipulations and Task and Motion Planning under Uncertainty

Logic-Geometric Programming (LGP) is a powerful motion and manipulation planning framework, which represents hierarchical structure using logic rules that describe discrete aspects of problems, e.g., touch, grasp, hit, or push, and solves the resulting smooth trajectory optimization. The expressive power of logic allows LGP to handle complex, large-scale sequential manipulation and tool-use planning problems. In this paper, we extend the LGP formulation to stochastic domains. Based on the control-inference duality, we interpret LGP in a stochastic domain as fitting a mixture of Gaussians to the posterior path distribution, where each logic profile defines a single Gaussian path distribution. The proposed framework enables a robot to prioritize various interaction modes and to acquire interesting behaviors such as contact exploitation for uncertainty reduction, eventually providing a composite control scheme that is reactive to disturbance. The supplementary video can be found at https://youtu.be/CEaJdVlSZyo

preprint · 2020 · arXiv

Qgraph-bounded Q-learning: Stabilizing Model-Free Off-Policy Deep Reinforcement Learning

In state-of-the-art model-free off-policy deep reinforcement learning, a replay memory is used to store past experience and derive all network updates. Even if both state and action spaces are continuous, the replay memory only holds a finite number of transitions. We represent these transitions in a data graph and link its structure to soft divergence. By selecting a subgraph with a favorable structure, we construct a simplified Markov Decision Process for which exact Q-values can be computed efficiently as more data comes in. The subgraph and its associated Q-values can be represented as a QGraph. We show that the Q-value for each transition in the simplified MDP is a lower bound of the Q-value for the same transition in the original continuous Q-learning problem. By using these lower bounds in temporal difference learning, our method QG-DDPG is less prone to soft divergence and exhibits increased sample efficiency while being more robust to hyperparameters. QGraphs also retain information from transitions that have already been overwritten in the replay memory, which can decrease the algorithm's sensitivity to the replay memory capacity.

preprint · 2020 · arXiv

Visualizing Local Minima in Multi-Robot Motion Planning using Multilevel Morse Theory

Multi-robot motion planning problems often have many local minima. It is essential to visualize those local minima such that we can better understand, debug and interact with multi-robot systems. Towards this goal, we present the multi-robot motion explorer, an algorithm which extends previous results on multilevel Morse theory by introducing a component-based framework, where we reduce multi-robot configuration spaces by reducing each robot's component space using fiber bundles. Our algorithm exploits this component structure to search for and visualize local minima. A user of the algorithm can specify a multilevel abstraction and an optimization algorithm. We use this information to incrementally build a local minima tree for a given problem. We demonstrate this algorithm on several multi-robot systems of up to 20 degrees of freedom.

preprint · 2019 · arXiv

An Optimal Algorithm to Solve the Combined Task Allocation and Path Finding Problem

We consider multi-agent transport task problems where, e.g. in a factory setting, items have to be delivered from a given start to a goal pose while the delivering robots need to avoid collisions with each other on the floor. We introduce a Task Conflict-Based Search (TCBS) Algorithm to solve the combined delivery task allocation and multi-agent path planning problem optimally. The problem is known to be NP-hard and the optimal solver cannot scale. However, we introduce it as a baseline to evaluate the sub-optimality of other approaches. We show experimental results that compare our solver with different sub-optimal ones in terms of regret.

preprint · 2016 · arXiv

On the Fundamental Importance of Gauss-Newton in Motion Optimization

Hessian information speeds convergence substantially in motion optimization. The better the Hessian approximation the better the convergence. But how good is a given approximation theoretically? How much are we losing? This paper addresses that question and proves that for a particularly popular and empirically strong approximation known as the Gauss-Newton approximation, we actually lose very little: for a large class of highly expressive objective terms, the true Hessian actually limits to the Gauss-Newton Hessian quickly as the trajectory's time discretization becomes small. This result both motivates its use and offers insight into computationally efficient design. For instance, traditional representations of kinetic energy exploit the generalized inertia matrix whose derivatives are usually difficult to compute. We introduce here a novel reformulation of rigid body kinetic energy designed explicitly for fast and accurate curvature calculation. Our theorem proves that the Gauss-Newton Hessian under this formulation efficiently captures the kinetic energy curvature, but requires only as much computation as a single evaluation of the traditional representation. Additionally, we introduce a technique that exploits these ideas implicitly using Cholesky decompositions for some cases when similar objective term reformulations exist but may be difficult to find. Our experiments validate these findings and demonstrate their use on a real-world motion optimization system for high-dof motion generation.
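
For readers unfamiliar with the term, the standard Gauss-Newton approximation can be written for a generic least-squares objective as follows. The symbols $f$, $r$ and $J$ are notation assumed here for illustration and are not taken from the paper.

```latex
f(x) = \tfrac{1}{2}\, r(x)^\top r(x), \qquad
\nabla f(x) = J(x)^\top r(x), \qquad
\nabla^2 f(x) = \underbrace{J(x)^\top J(x)}_{\text{Gauss-Newton Hessian}}
              + \sum_i r_i(x)\, \nabla^2 r_i(x),
\qquad J(x) = \frac{\partial r}{\partial x}(x).
```

The Gauss-Newton approximation keeps only the $J^\top J$ term, so it needs first derivatives of the residuals but no second derivatives; the abstract's claim concerns how quickly the dropped sum becomes negligible.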

preprint · 2015 · arXiv

The Advantage of Cross Entropy over Entropy in Iterative Information Gathering

Gathering the most information by picking the least amount of data is a common task in experimental design or when exploring an unknown environment in reinforcement learning and robotics. A widely used measure for quantifying the information contained in some distribution of interest is its entropy. Greedily minimizing the expected entropy is therefore a standard method for choosing samples in order to gain strong beliefs about the underlying random variables. We show that this approach is prone to temporally getting stuck in local optima corresponding to wrongly biased beliefs. We suggest instead maximizing the expected cross entropy between old and new belief, which aims at challenging refutable beliefs and thereby avoids these local optima. We show that both criteria are closely related and that their difference can be traced back to the asymmetry of the Kullback-Leibler divergence. In illustrative examples as well as simulated and real-world experiments we demonstrate the advantage of cross entropy over simple entropy for practical applications.
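
The link between entropy, cross entropy and the Kullback-Leibler divergence that the abstract alludes to rests on a standard identity, stated here for two discrete distributions $p$ and $q$. Which belief plays the role of $p$ and which of $q$ in the paper's criterion is not specified in this record, so the assignment below is only generic notation.

```latex
H(p, q) = -\sum_x p(x) \log q(x) = H(p) + D_{\mathrm{KL}}(p \,\|\, q),
\qquad
D_{\mathrm{KL}}(p \,\|\, q) = \sum_x p(x) \log \frac{p(x)}{q(x)} \;\neq\; D_{\mathrm{KL}}(q \,\|\, p) \ \text{in general}.
```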

preprint · 2014 · arXiv

A Novel Augmented Lagrangian Approach for Inequalities and Convergent Any-Time Non-Central Updates

Motivated by robotic trajectory optimization problems we consider the Augmented Lagrangian approach to constrained optimization. We first propose an alternative augmentation of the Lagrangian to handle the inequality case (not based on slack variables) and a corresponding "central" update of the dual parameters. We prove certain properties of this update: roughly, in the case of LPs and when the "constraint activity" does not change between iterations, the KKT conditions hold after just one iteration. This gives essential insight on when the method is efficient in practice. We then present our main contribution: consistent any-time (non-central) updates of the dual parameters (i.e., updating the dual parameters when we are not currently at an extremum of the Lagrangian). Similar to the primal-dual Newton method, this leads to an algorithm that updates the primal and dual solutions in parallel, not distinguishing between an outer loop to adapt the dual parameters and an inner loop to minimize the Lagrangian. We again prove certain properties of this anytime update: roughly, in the case of LPs and when constraint activities would not change, the dual solution converges after one iteration. Again, this gives essential insight into the caveats of the method: if constraint activities change, the method may destabilize. We propose simple smoothing, step-size adaptation and regularization mechanisms to counteract this effect and guarantee monotone convergence. Finally, we evaluate the proposed method on random LPs as well as on standard robot trajectory optimization problems, confirming our motivation and intuition that our approach performs well if the problem structure implies moderate stability of constraint activity.

preprint · 2014 · arXiv

Newton methods for k-order Markov Constrained Motion Problems

This is a documentation of a framework for robot motion optimization that aims to draw on classical constrained optimization methods. With one exception the underlying algorithms are classical ones: Gauss-Newton (with adaptive step size and damping), Augmented Lagrangian, log-barrier, etc. The exception is a novel any-time version of the Augmented Lagrangian. The contribution of this framework is to frame motion optimization problems in a way that makes the application of these methods efficient, especially by defining a very general class of robot motion problems while at the same time introducing abstractions that directly reflect the API of the source code.

preprint · 2014 · arXiv

Planning with Noisy Probabilistic Relational Rules

Noisy probabilistic relational rules are a promising world model representation for several reasons. They are compact and generalize over world instantiations. They are usually interpretable and they can be learned effectively from the action experiences in complex worlds. We investigate reasoning with such rules in grounded relational domains. Our algorithms exploit the compactness of rules for efficient and flexible decision-theoretic planning. As a first approach, we combine these rules with the Upper Confidence Bounds applied to Trees (UCT) algorithm based on look-ahead trees. Our second approach converts these rules into a structured dynamic Bayesian network representation and predicts the effects of action sequences using approximate inference and beliefs over world states. We evaluate the effectiveness of our approaches for planning in a simulated complex 3D robot manipulation scenario with an articulated manipulator and realistic physics and in domains of the probabilistic planning competition. Empirical results show that our methods can solve problems where existing methods fail.

preprint2012arXiv

Hierarchical POMDP Controller Optimization by Likelihood Maximization

Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.

preprint2012arXiv

Path Integral Control by Reproducing Kernel Hilbert Space Embedding

We present an embedding of stochastic optimal control problems, of the so-called path integral form, into reproducing kernel Hilbert spaces. Using consistent, sample-based estimates of the embedding leads to a model-free, non-parametric approach for calculating an approximate solution to the control problem. This formulation admits a decomposition of the problem into an invariant and a task-dependent component. Consequently, we make much more efficient use of the sample data compared to previous sample-based approaches in this domain, e.g., by allowing sample re-use across tasks. Numerical examples on test problems, which illustrate the sample efficiency, are provided.

preprint2010arXiv

Approximate Inference and Stochastic Optimal Control

We propose a novel reformulation of the stochastic optimal control problem as an approximate inference problem, demonstrating that such an interpretation leads to new practical methods for the original problem. In particular, we characterise a novel class of iterative solutions to the stochastic optimal control problem based on a natural relaxation of the exact dual formulation. These theoretical insights are applied to the Reinforcement Learning problem, where they lead to new model-free, off-policy methods for discrete and continuous problems.

preprint2000arXiv

A gauge theoretical view of the charge concept in Einstein gravity

We will discuss some analogies between internal gauge theories and gravity in order to better understand the charge concept in gravity. A dimensional analysis of gauge theories in general and a strict definition of elementary, monopole, and topological charges are applied to electromagnetism and to teleparallelism, a gauge theoretical formulation of Einstein gravity. As a result we inevitably find that the gravitational coupling constant has dimension $\hbar/l^2$, the mass parameter of a particle dimension $\hbar/l$, and the Schwarzschild mass parameter dimension l (where l means length). These dimensions confirm the meaning of mass as elementary and as monopole charge of the translation group, respectively. In detail, we find that the Schwarzschild mass parameter is a quasi-electric monopole charge of the time translation whereas the NUT parameter is a quasi-magnetic monopole charge of the time translation as well as a topological charge. The Kerr parameter and the electric and magnetic charges are interpreted similarly. We conclude that each elementary charge of a Casimir operator of the gauge group is the source of a (quasi-electric) monopole charge of the respective Killing vector.

preprint1999arXiv

A numeric solution for metric-affine gravity and Einstein's gravitational theory with Proca matter

A special case of metric-affine gauge theory of gravity (MAG) is equivalent to general relativity with Proca matter as source. We study in detail a corresponding numeric solution of the Reissner-Nordström type. It is static, spherically symmetric, and of electric type. In particular, this solution has no horizon, so it has a naked singularity as its origin.

32 published item(s)

Manifold Sampling via Entropy Maximization

Segment Anything with Robust Uncertainty-Accuracy Correlation

db-A*: Discontinuity-bounded Search for Kinodynamic Mobile Robot Motion Planning

Deep Visual Constraints: Neural Implicit Models for Manipulation Planning from Visual Input

FC$^3$: Feasibility-Based Control Chain Coordination

Learning Multi-Object Dynamics with Compositional Neural Radiance Fields

MotionBenchMaker: A Tool to Generate and Benchmark Motion Planning Datasets

Path-Tree Optimization in Discrete Partially Observable Environments using Rapidly-Exploring Belief-Space Graphs

Reinforcement Learning with Neural Radiance Fields

RHH-LGP: Receding Horizon And Heuristics-Based Logic-Geometric Programming For Task And Motion Planning

ST-RRT*: Asymptotically-Optimal Bidirectional Motion Planning through Space-Time

Visualization of Nonlinear Programming for Robot Motion Planning

An Interior Point Method Solving Motion Planning Problems with Narrow Passages

Anticipating Human Intention for Full-Body Motion Prediction in Object Grasping and Placing Tasks

Deep Visual Reasoning: Learning to Predict Action Sequences for Task and Motion Planning from an Initial Scene Image

Describing Physics For Physical Reasoning: Force-based Sequential Manipulation Planning

Natural Gradient Shared Control

Prediction of Human Full-Body Movements with Motion Optimization and Recurrent Neural Networks

Probabilistic Framework for Constrained Manipulations and Task and Motion Planning under Uncertainty

Qgraph-bounded Q-learning: Stabilizing Model-Free Off-Policy Deep Reinforcement Learning

Visualizing Local Minima in Multi-Robot Motion Planning using Multilevel Morse Theory

An Optimal Algorithm to Solve the Combined Task Allocation and Path Finding Problem

On the Fundamental Importance of Gauss-Newton in Motion Optimization

The Advantage of Cross Entropy over Entropy in Iterative Information Gathering

A Novel Augmented Lagrangian Approach for Inequalities and Convergent Any-Time Non-Central Updates

Newton methods for k-order Markov Constrained Motion Problems

Planning with Noisy Probabilistic Relational Rules

Hierarchical POMDP Controller Optimization by Likelihood Maximization

Path Integral Control by Reproducing Kernel Hilbert Space Embedding

Approximate Inference and Stochastic Optimal Control

A gauge theoretical view of the charge concept in Einstein gravity

A numeric solution for metric-affine gravity and Einstein's gravitational theory with Proca matter