Source author record

Robert Platt

Robert Platt appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Robotics, Machine Learning, Artificial Intelligence, Computer Vision

Catalog footprint

What is connected

14 works

4 topics

4 close collaborators

preprint, 2022, arXiv

$\mathrm{SO}(2)$-Equivariant Reinforcement Learning

Equivariant neural networks enforce symmetry within the structure of their convolutional layers, resulting in a substantial improvement in sample efficiency when learning an equivariant or invariant function. Such models are applicable to robotic manipulation learning which can often be formulated as a rotationally symmetric problem. This paper studies equivariant model architectures in the context of $Q$-learning and actor-critic reinforcement learning. We identify equivariant and invariant characteristics of the optimal $Q$-function and the optimal policy and propose equivariant DQN and SAC algorithms that leverage this structure. We present experiments that demonstrate that our equivariant versions of DQN and SAC can be significantly more sample efficient than competing algorithms on an important class of robotic manipulation problems.
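The invariance the abstract refers to can be illustrated on a toy problem. The sketch below is not the paper's architecture, which builds the symmetry into the network layers. It assumes the discrete subgroup of 90-degree rotations (C4) in place of full SO(2), a hypothetical hand-written `toy_q`, and plain group averaging, and it shows that a function averaged over joint rotations of state and action satisfies Q(g·s, g·a) = Q(s, a).

```python
import math

def rotate(v, k):
    """Rotate a 2D vector by k * 90 degrees (an element of C4, a discrete subgroup of SO(2))."""
    x, y = v
    for _ in range(k % 4):
        x, y = -y, x
    return (x, y)

def toy_q(state, action):
    """Hypothetical, deliberately non-symmetric Q-function used only for illustration."""
    return 2.0 * state[0] * action[0] + state[1] * action[1] + state[0]

def symmetrized_q(state, action):
    """Average toy_q over all four rotations applied jointly to state and action."""
    return sum(toy_q(rotate(state, k), rotate(action, k)) for k in range(4)) / 4.0

s, a = (1, 2), (3, -1)
# The raw function changes under a joint rotation; the averaged one does not.
assert toy_q(rotate(s, 1), rotate(a, 1)) != toy_q(s, a)
for k in range(4):
    assert math.isclose(symmetrized_q(rotate(s, k), rotate(a, k)), symmetrized_q(s, a))
```

Group averaging is used here only because it is the shortest way to obtain an invariant function; it says nothing about sample efficiency, which is the paper's actual claim.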

preprint, 2022, arXiv

Binding Actions to Objects in World Models

We study the problem of binding actions to objects in object-factored world models using action-attention mechanisms. We propose two attention mechanisms for binding actions to objects, soft attention and hard attention, which we evaluate in the context of structured world models for five environments. Our experiments show that hard attention helps contrastively-trained structured world models to learn to separate individual objects in an object-based grid-world environment. Further, we show that soft attention increases performance of factored world models trained on a robotic manipulation task. The learned action attention weights can be used to interpret the factored world model as the attention focuses on the manipulated object in the environment.
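The difference between the two mechanisms can be sketched in a few lines. This is an illustrative assumption about their general form, not the paper's implementation: per-object scores are taken as given (in a real model they would be produced by a learned network), soft attention is a softmax over those scores, and hard attention is a one-hot choice of the highest-scoring object. The function names are hypothetical.

```python
import math

def soft_attention(scores):
    """Softmax over per-object scores: every object receives a fraction of the action."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hard_attention(scores):
    """One-hot weights: the action is bound to the single highest-scoring object."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return [1.0 if i == best else 0.0 for i in range(len(scores))]

def bind_action(action, weights):
    """Give each object slot a copy of the action vector scaled by its attention weight."""
    return [[w * a for a in action] for w in weights]

scores = [0.1, 2.0, -1.0]   # hypothetical relevance of the action to three objects
action = [1.0, -0.5]        # hypothetical two-dimensional action
per_object_soft = bind_action(action, soft_attention(scores))
per_object_hard = bind_action(action, hard_attention(scores))
```

With hard attention only one slot sees a non-zero action, which is what makes the weights easy to read as "the manipulated object".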

preprint, 2022, arXiv

Efficient and Accurate Candidate Generation for Grasp Pose Detection in SE(3)

Grasp detection of novel objects in unstructured environments is a key capability in robotic manipulation. For 2D grasp detection problems where grasps are assumed to lie in the plane, it is common to design a fully convolutional neural network that predicts grasps over an entire image in one step. However, this is not possible for grasp pose detection where grasp poses are assumed to exist in SE(3). In this case, it is common to approach the problem in two steps: grasp candidate generation and candidate classification. Since grasp candidate classification is typically expensive, the problem becomes one of efficiently identifying high quality candidate grasps. This paper proposes a new grasp candidate generation method that significantly outperforms major 3D grasp detection baselines. Supplementary material is available at https://atenpas.github.io/psn/.

preprint, 2022, arXiv

Factored World Models for Zero-Shot Generalization in Robotic Manipulation

World models for environments with many objects face a combinatorial explosion of states: as the number of objects increases, the number of possible arrangements grows exponentially. In this paper, we learn to generalize over robotic pick-and-place tasks using object-factored world models, which combat the combinatorial explosion by ensuring that predictions are equivariant to permutations of objects. Previous object-factored models were limited either by their inability to model actions, or by their inability to plan for complex manipulation tasks. We build on recent contrastive methods for training object-factored world models, which we extend to model continuous robot actions and to accurately predict the physics of robotic pick-and-place. To do so, we use a residual stack of graph neural networks that receive action information at multiple levels in both their node and edge neural networks. Crucially, our learned model can make predictions about tasks not represented in the training data. That is, we demonstrate successful zero-shot generalization to novel tasks, with only a minor decrease in model performance. Moreover, we show that an ensemble of our models can be used to plan for tasks involving up to 12 pick and place actions using heuristic search. We also demonstrate transfer to a physical robot.

preprint, 2022, arXiv

GASCN: Graph Attention Shape Completion Network

Shape completion, the problem of inferring the complete geometry of an object given a partial point cloud, is an important problem in robotics and computer vision. This paper proposes the Graph Attention Shape Completion Network (GASCN), a novel neural network model that solves this problem. This model combines a graph-based model for encoding local point cloud information with an MLP-based architecture for encoding global information. For each completed point, our model infers the normal and extent of the local surface patch which is used to produce dense yet precise shape completions. We report experiments that demonstrate that GASCN outperforms standard shape completion methods on a standard benchmark drawn from the ShapeNet dataset.

preprint, 2022, arXiv

Hierarchical Reinforcement Learning under Mixed Observability

The framework of mixed observable Markov decision processes (MOMDP) models many robotic domains in which some state variables are fully observable while others are not. In this work, we identify a significant subclass of MOMDPs defined by how actions influence the fully observable components of the state and how those, in turn, influence the partially observable components and the rewards. This unique property allows for a two-level hierarchical approach we call HIerarchical Reinforcement Learning under Mixed Observability (HILMO), which restricts partial observability to the top level while the bottom level remains fully observable, enabling higher learning efficiency. The top level produces desired goals to be reached by the bottom level until the task is solved. We further develop theoretical guarantees to show that our approach can achieve optimal and quasi-optimal behavior under mild assumptions. Empirical results on long-horizon continuous control tasks demonstrate the efficacy and efficiency of our approach in terms of improved success rate, sample efficiency, and wall-clock training time. We also deploy policies learned in simulation on a real robot.

preprint, 2022, arXiv

Tactile Pose Estimation and Policy Learning for Unknown Object Manipulation

Object pose estimation methods allow finding locations of objects in unstructured environments. This is a highly desired skill for autonomous robot manipulation as robots need to estimate the precise poses of the objects in order to manipulate them. In this paper, we investigate the problems of tactile pose estimation and manipulation for category-level objects. Our proposed method uses a Bayes filter with a learned tactile observation model and a deterministic motion model. Later, we train policies using deep reinforcement learning where the agents use the belief estimation from the Bayes filter. Our models are trained in simulation and transferred to the real world. We analyze the reliability and the performance of our framework through a series of simulated and real-world experiments and compare our method to the baseline work. Our results show that the learned tactile observation model can localize the pose of novel objects at 2-mm and 1-degree resolution for position and orientation, respectively. Furthermore, we experiment on a bottle opening task where the gripper needs to reach the desired grasp state.
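The filter structure described in the abstract, a deterministic motion model followed by an observation update, can be sketched as a histogram Bayes filter. The sketch assumes a one-dimensional cyclic grid of candidate poses and takes the observation likelihood as a plain list; in the paper that likelihood comes from a learned tactile model, which is not reproduced here. The function names `predict` and `update` are hypothetical.

```python
def predict(belief, shift):
    """Deterministic motion model: move all probability mass by `shift` cells (cyclic grid assumed)."""
    n = len(belief)
    return [belief[(i - shift) % n] for i in range(n)]

def update(belief, likelihood):
    """Bayes update: multiply the prior by the observation likelihood and renormalize."""
    posterior = [b * l for b, l in zip(belief, likelihood)]
    total = sum(posterior)
    if total == 0.0:
        raise ValueError("observation has zero likelihood under every pose")
    return [p / total for p in posterior]

# Start from a uniform belief over five candidate poses.
belief = [0.2] * 5
# Stand-in for the learned tactile observation model: the touch favors pose index 2.
belief = update(belief, [0.1, 0.1, 0.6, 0.1, 0.1])
# The gripper then moves the object by exactly one cell, so the belief shifts with it.
belief = predict(belief, 1)
```

Because the motion model is deterministic, the prediction step only relocates probability mass and never spreads it; all uncertainty reduction comes from the observation updates.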

preprint, 2022, arXiv

Visual Foresight With a Local Dynamics Model

Model-free policy learning has been shown to be capable of learning manipulation policies which can solve long-time horizon tasks using single-step manipulation primitives. However, training these policies is a time-consuming process requiring large amounts of data. We propose the Local Dynamics Model (LDM) which efficiently learns the state-transition function for these manipulation primitives. By combining the LDM with model-free policy learning, we can learn policies which can solve complex manipulation tasks using one-step lookahead planning. We show that the LDM is more sample-efficient than other model architectures and outperforms them. When combined with planning, we can outperform other model-based and model-free policies on several challenging manipulation tasks in simulation.

preprint, 2021, arXiv

Learning Discrete State Abstractions With Deep Variational Inference

Abstraction is crucial for effective sequential decision making in domains with large state spaces. In this work, we propose an information bottleneck method for learning approximate bisimulations, a type of state abstraction. We use a deep neural encoder to map states onto continuous embeddings. We map these embeddings onto a discrete representation using an action-conditioned hidden Markov model, which is trained end-to-end with the neural network. Our method is suited for environments with high-dimensional states and learns from a stream of experience collected by an agent acting in a Markov decision process. Through this learned discrete abstract model, we can efficiently plan for unseen goals in a multi-goal Reinforcement Learning setting. We test our method in simplified robotic manipulation domains with image states. We also compare it against previous model-based approaches to finding bisimulations in discrete grid-world-like environments. Source code is available at https://github.com/ondrejba/discrete_abstractions.

preprint, 2021, arXiv

Robotic Pick-and-Place With Uncertain Object Instance Segmentation and Shape Completion

We consider robotic pick-and-place of partially visible, novel objects, where goal placements are non-trivial, e.g., tightly packed into a bin. One approach is to (a) use object instance segmentation and shape completion to model the objects and (b) use a regrasp planner to decide grasps and places displacing the models to their goals. However, it is critical for the planner to account for uncertainty in the perceived models, as object geometries in unobserved areas are just guesses. We account for perceptual uncertainty by incorporating it into the regrasp planner's cost function. We compare seven different costs. One of these, which uses neural networks to estimate probability of grasp and place stability, consistently outperforms uncertainty-unaware costs and evaluates faster than Monte Carlo sampling. On a real robot, the proposed cost results in successfully packing objects tightly into a bin 7.8% more often versus the commonly used minimum-number-of-grasps cost.

preprint, 2020, arXiv

Learning Manipulation Skills Via Hierarchical Spatial Attention

Learning generalizable skills in robotic manipulation has long been challenging due to real-world sized observation and action spaces. One method for addressing this problem is attention focus -- the robot learns where to attend its sensors and irrelevant details are ignored. However, these methods have largely not caught on due to the difficulty of learning a good attention policy and the added partial observability induced by a narrowed window of focus. This article addresses the first issue by constraining gazes to a spatial hierarchy. For the second issue, we identify a case where the partial observability induced by attention does not prevent Q-learning from finding an optimal policy. We conclude with real-robot experiments on challenging pick-place tasks demonstrating the applicability of the approach.

preprint, 2020, arXiv

Learning visual servo policies via planner cloning

Learning control policies for visual servoing in novel environments is an important problem. However, standard model-free policy learning methods are slow. This paper explores planner cloning: using behavior cloning to learn policies that mimic the behavior of a full-state motion planner in simulation. We propose Penalized Q Cloning (PQC), a new behavior cloning algorithm. We show that it outperforms several baselines and ablations on some challenging problems involving visual servoing in novel environments while avoiding obstacles. Finally, we demonstrate that these policies can be transferred effectively onto a real robotic platform, achieving approximately an 87% success rate both in simulation and on a real robot.

preprint, 2015, arXiv

Using Geometry to Detect Grasps in 3D Point Clouds

This paper proposes a new approach to detecting grasp points on novel objects presented in clutter. The input to our algorithm is a point cloud and the geometric parameters of the robot hand. The output is a set of hand configurations that are expected to be good grasps. Our key idea is to use knowledge of the geometry of a good grasp to improve detection. First, we use a geometrically necessary condition to sample a large set of high quality grasp hypotheses. We were surprised to find that using simple geometric conditions for detection can result in a relatively high grasp success rate. Second, we use the notion of an antipodal grasp (a standard characterization of a good two-fingered grasp) to help us classify these grasp hypotheses. In particular, we generate a large automatically labeled training set that gives us high classification accuracy. Overall, our method achieves an average grasp success rate of 88% when grasping novel objects presented in isolation and an average success rate of 73% when grasping novel objects presented in dense clutter. This system is available as a ROS package at http://wiki.ros.org/agile_grasp.
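The antipodal condition mentioned in the abstract has a compact geometric form for two point contacts with friction: the line joining the contacts must lie inside both friction cones. The sketch below is a textbook-style check under that assumption, not the paper's detection or labeling pipeline. The function name `is_antipodal`, the use of outward surface normals, and the friction coefficient parameter are all illustrative choices.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _unit(v):
    norm = math.sqrt(_dot(v, v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / norm for x in v)

def is_antipodal(p1, n1, p2, n2, friction_coeff):
    """Return True if the segment p1 -> p2 lies within the friction cones at both contacts.

    n1 and n2 are outward surface normals at p1 and p2. A finger at p1 pushes
    along -n1, so the grasp axis must be within atan(friction_coeff) of -n1,
    and the reversed axis within the same angle of -n2.
    """
    axis = _unit(_sub(p2, p1))
    cos_limit = 1.0 / math.sqrt(1.0 + friction_coeff ** 2)  # cos(atan(mu))
    return _dot(axis, _unit(n1)) <= -cos_limit and _dot(axis, _unit(n2)) >= cos_limit

# Opposite faces of a unit box, contacts directly across from each other.
print(is_antipodal((0, 0, 0), (-1, 0, 0), (1, 0, 0), (1, 0, 0), 0.5))
# Same faces, but the second contact is offset so the axis leaves the cones.
print(is_antipodal((0, 0, 0), (-1, 0, 0), (1, 1, 0), (1, 0, 0), 0.5))
```

On a real point cloud the normals would be estimated from noisy data, so a practical check would need tolerance for normal error that this sketch omits.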

preprint, 2013, arXiv

Localizing Grasp Affordances in 3-D Points Clouds Using Taubin Quadric Fitting

Perception-for-grasping is a challenging problem in robotics. Inexpensive range sensors such as the Microsoft Kinect provide sensing capabilities that have given new life to the effort of developing robust and accurate perception methods for robot grasping. This paper proposes a new approach to localizing enveloping grasp affordances in 3-D point clouds efficiently. The approach is based on modeling enveloping grasp affordances as cylindrical shells that correspond to the geometry of the robot hand. A fast and accurate fitting method for quadratic surfaces is the core of our approach. An evaluation on a set of cluttered environments shows high precision and recall statistics. Our results also show that the approach compares favorably with some alternatives, and that it is efficient enough to be employed for robot grasping in real-time.
