Source author record

Franziska Meier

Franziska Meier appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Robotics Machine Learning Artificial Intelligence Computer Vision Systems and Control

Catalog footprint

What is connected

11works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Differentiable and Learnable Robot Models

Building differentiable simulations of physical processes has recently received an increasing amount of attention. Specifically, some efforts develop differentiable robotic physics engines motivated by the computational benefits of merging rigid body simulations with modern differentiable machine learning libraries. Here, we present a library that focuses on the ability to combine data driven methods with analytical rigid body computations. More concretely, our library \emph{Differentiable Robot Models} implements both \emph{differentiable} and \emph{learnable} models of the kinematics and dynamics of robots in Pytorch. The source-code is available at \url{https://github.com/facebookresearch/differentiable-robot-model}

preprint2022arXiv

Habitat 2.0: Training Home Assistants to Rearrange their Habitat

We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in interactive 3D environments and complex physics-enabled scenarios. We make comprehensive contributions to all levels of the embodied AI stack - data, simulation, and benchmark tasks. Specifically, we present: (i) ReplicaCAD: an artist-authored, annotated, reconfigurable 3D dataset of apartments (matching real spaces) with articulated objects (e.g. cabinets and drawers that can open/close); (ii) H2.0: a high-performance physics-enabled 3D simulator with speeds exceeding 25,000 simulation steps per second (850x real-time) on an 8-GPU node, representing 100x speed-ups over prior work; and, (iii) Home Assistant Benchmark (HAB): a suite of common tasks for assistive robots (tidy the house, prepare groceries, set the table) that test a range of mobile manipulation capabilities. These large-scale engineering contributions allow us to systematically compare deep reinforcement learning (RL) at scale and classical sense-plan-act (SPA) pipelines in long-horizon structured tasks, with an emphasis on generalization to new objects, receptacles, and layouts. We find that (1) flat RL policies struggle on HAB compared to hierarchical ones; (2) a hierarchy with independent skills suffers from 'hand-off problems', and (3) SPA pipelines are more brittle than RL policies.

preprint2022arXiv

Model Based Meta Learning of Critics for Policy Gradients

Being able to seamlessly generalize across different tasks is fundamental for robots to act in our world. However, learning representations that generalize quickly to new scenarios is still an open research problem in reinforcement learning. In this paper we present a framework to meta-learn the critic for gradient-based policy learning. Concretely, we propose a model-based bi-level optimization algorithm that updates the critics parameters such that the policy that is learned with the updated critic gets closer to solving the meta-training tasks. We illustrate that our algorithm leads to learned critics that resemble the ground truth Q function for a given task. Finally, after meta-training, the learned critic can be used to learn new policies for new unseen task and environment settings via model-free policy gradient optimization, without requiring a model. We present results that show the generalization capabilities of our learned critic to new tasks and dynamics when used to learn a new policy in a new scenario.

preprint2022arXiv

Residual Learning from Demonstration: Adapting DMPs for Contact-rich Manipulation

Manipulation skills involving contact and friction are inherent to many robotics tasks. Using the class of motor primitives for peg-in-hole like insertions, we study how robots can learn such skills. Dynamic Movement Primitives (DMP) are a popular way of extracting such policies through behaviour cloning (BC) but can struggle in the context of insertion. Policy adaptation strategies such as residual learning can help improve the overall performance of policies in the context of contact-rich manipulation. However, it is not clear how to best do this with DMPs. As a result, we consider several possible ways for adapting a DMP formulation and propose ``residual Learning from Demonstration`` (rLfD), a framework that combines DMPs with Reinforcement Learning (RL) to learn a residual correction policy. Our evaluations suggest that applying residual learning directly in task space and operating on the full pose of the robot can significantly improve the overall performance of DMPs. We show that rLfD offers a gentle to the joints solution that improves the task success and generalisation of DMPs \rb{and enables transfer to different geometries and frictions through few-shot task adaptation}. The proposed framework is evaluated on a set of tasks. A simulated robot and a physical robot have to successfully insert pegs, gears and plugs into their respective sockets. Other material and videos accompanying this paper are provided at https://sites.google.com/view/rlfd/.

preprint2021arXiv

Meta-Learning via Learned Loss

Typically, loss functions, regularization mechanisms and other important aspects of training parametric models are chosen heuristically from a limited set of options. In this paper, we take the first step towards automating this process, with the view of producing models which train faster and more robustly. Concretely, we present a meta-learning method for learning parametric loss functions that can generalize across different tasks and model architectures. We develop a pipeline for meta-training such loss functions, targeted at maximizing the performance of the model trained under them. The loss landscape produced by our learned losses significantly improves upon the original task-specific losses in both supervised and reinforcement learning tasks. Furthermore, we show that our meta-learning framework is flexible enough to incorporate additional information at meta-train time. This information shapes the learned loss function such that the environment does not need to provide this information during meta-test time. We make our code available at https://sites.google.com/view/mlthree.

preprint2020arXiv

Adversarial Continual Learning

Continual learning aims to learn new tasks without forgetting previously learned ones. We hypothesize that representations learned to solve each task in a sequence have a shared structure while containing some task-specific properties. We show that shared features are significantly less prone to forgetting and propose a novel hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features required to solve a sequence of tasks. Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills. We demonstrate our hybrid approach is effective in avoiding forgetting and show it is superior to both architecture-based and memory-based approaches on class incrementally learning of a single dataset as well as a sequence of multiple datasets in image classification. Our code is available at \url{https://github.com/facebookresearch/Adversarial-Continual-Learning}.

preprint2018arXiv

Using Simulation to Improve Sample-Efficiency of Bayesian Optimization for Bipedal Robots

Learning for control can acquire controllers for novel robotic tasks, paving the path for autonomous agents. Such controllers can be expert-designed policies, which typically require tuning of parameters for each task scenario. In this context, Bayesian optimization (BO) has emerged as a promising approach for automatically tuning controllers. However, when performing BO on hardware for high-dimensional policies, sample-efficiency can be an issue. Here, we develop an approach that utilizes simulation to map the original parameter space into a domain-informed space. During BO, similarity between controllers is now calculated in this transformed space. Experiments on the ATRIAS robot hardware and another bipedal robot simulation show that our approach succeeds at sample-efficiently learning controllers for multiple robots. Another question arises: What if the simulation significantly differs from hardware? To answer this, we create increasingly approximate simulators and study the effect of increasing simulation-hardware mismatch on the performance of Bayesian optimization. We also compare our approach to other approaches from literature, and find it to be more reliable, especially in cases of high mismatch. Our experiments show that our approach succeeds across different controller types, bipedal robot models and simulator fidelity levels, making it applicable to a wide range of bipedal locomotion problems.

preprint2016arXiv

A Probabilistic Representation for Dynamic Movement Primitives

Dynamic Movement Primitives have successfully been used to realize imitation learning, trial-and-error learning, reinforce- ment learning, movement recognition and segmentation and control. Because of this they have become a popular represen- tation for motor primitives. In this work, we showcase how DMPs can be reformulated as a probabilistic linear dynamical system with control inputs. Through this probabilistic repre- sentation of DMPs, algorithms such as Kalman filtering and smoothing are directly applicable to perform inference on pro- prioceptive sensor measurements during execution. We show that inference in this probabilistic model automatically leads to a feedback term to online modulate the execution of a DMP. Furthermore, we show how inference allows us to measure the likelihood that we are successfully executing a given motion primitive. In this context, we show initial results of using the probabilistic model to detect execution failures on a simulated movement primitive dataset.

preprint2016arXiv

DOOMED: Direct Online Optimization of Modeling Errors in Dynamics

It has long been hoped that model-based control will improve tracking performance while maintaining or increasing compliance. This hope hinges on having or being able to estimate an accurate inverse dynamics model. As a result, substantial effort has gone into modeling and estimating dynamics (error) models. Most recent research has focused on learning the true inverse dynamics using data points mapping observed accelerations to the torques used to generate them. Unfortunately, if the initial tracking error is bad, such learning processes may train substantially off-distribution to predict well on actual observed acceleration rather then the desired accelerations. This work takes a different approach. We define a class of gradient-based online learning algorithms we term Direct Online Optimization for Modeling Errors in Dynamics (DOOMED) that directly minimize an objective measuring the divergence between actual and desired accelerations. Our objective is defined in terms of the true system's unknown dynamics and is therefore impossible to evaluate. However, we show that its gradient is measurable online from system data. We develop a novel adaptive control approach based on running online learning to directly correct (inverse) dynamics errors in real time using the data stream from the robot to accurately achieve desired accelerations during execution.

preprint2016arXiv

Robust Gaussian Filtering using a Pseudo Measurement

Many sensors, such as range, sonar, radar, GPS and visual devices, produce measurements which are contaminated by outliers. This problem can be addressed by using fat-tailed sensor models, which account for the possibility of outliers. Unfortunately, all estimation algorithms belonging to the family of Gaussian filters (such as the widely-used extended Kalman filter and unscented Kalman filter) are inherently incompatible with such fat-tailed sensor models. The contribution of this paper is to show that any Gaussian filter can be made compatible with fat-tailed sensor models by applying one simple change: Instead of filtering with the physical measurement, we propose to filter with a pseudo measurement obtained by applying a feature function to the physical measurement. We derive such a feature function which is optimal under some conditions. Simulation results show that the proposed method can effectively handle measurement outliers and allows for robust filtering in both linear and nonlinear systems.

preprint2014arXiv

Local Gaussian Regression

Locally weighted regression was created as a nonparametric learning method that is computationally efficient, can learn from very large amounts of data and add data incrementally. An interesting feature of locally weighted regression is that it can work with spatially varying length scales, a beneficial property, for instance, in control problems. However, it does not provide a generative model for function values and requires training and test data to be generated identically, independently. Gaussian (process) regression, on the other hand, provides a fully generative model without significant formal requirements on the distribution of training data, but has much higher computational cost and usually works with one global scale per input dimension. Using a localising function basis and approximate inference techniques, we take Gaussian (process) regression to increasingly localised properties and toward the same computational complexity class as locally weighted regression.

Franziska Meier

What is connected

Connect this record

See the researcher in context

Building this map preview

11 published item(s)

Differentiable and Learnable Robot Models

Habitat 2.0: Training Home Assistants to Rearrange their Habitat

Model Based Meta Learning of Critics for Policy Gradients

Residual Learning from Demonstration: Adapting DMPs for Contact-rich Manipulation

Meta-Learning via Learned Loss

Adversarial Continual Learning

Using Simulation to Improve Sample-Efficiency of Bayesian Optimization for Bipedal Robots

A Probabilistic Representation for Dynamic Movement Primitives

DOOMED: Direct Online Optimization of Modeling Errors in Dynamics

Robust Gaussian Filtering using a Pseudo Measurement

Local Gaussian Regression