Source author record

Stéphane Doncieux

Stéphane Doncieux appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Robotics Machine Learning Artificial Intelligence Neural and Evolutionary Computing

Catalog footprint

What is connected

5works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Automatic Acquisition of a Repertoire of Diverse Grasping Trajectories through Behavior Shaping and Novelty Search

Grasping a particular object may require a dedicated grasping movement that may also be specific to the robot end-effector. No generic and autonomous method does exist to generate these movements without making hypotheses on the robot or on the object. Learning methods could help to autonomously discover relevant grasping movements, but they face an important issue: grasping movements are so rare that a learning method based on exploration has little chance to ever observe an interesting movement, thus creating a bootstrap issue. We introduce an approach to generate diverse grasping movements in order to solve this problem. The movements are generated in simulation, for particular object positions. We test it on several simulated robots: Baxter, Pepper and a Kuka Iiwa arm. Although we show that generated movements actually work on a real Baxter robot, the aim is to use this method to create a large dataset to bootstrap deep learning methods.

preprint2022arXiv

Exploratory State Representation Learning

Not having access to compact and meaningful representations is known to significantly increase the complexity of reinforcement learning (RL). For this reason, it can be useful to perform state representation learning (SRL) before tackling RL tasks. However, obtaining a good state representation can only be done if a large diversity of transitions is observed, which can require a difficult exploration, especially if the environment is initially reward-free. To solve the problems of exploration and SRL in parallel, we propose a new approach called XSRL (eXploratory State Representation Learning). On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations. On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a $k$-step learning progress bonus to form the maximization objective of a discovery policy. This results in a policy that seeks complex transitions from which the trained models can effectively learn. Our experimental results show that the approach leads to efficient exploration in challenging environments with image observations, and to state representations that significantly accelerate learning in RL tasks.

preprint2020arXiv

Behavioural Repertoire via Generative Adversarial Policy Networks

Learning algorithms are enabling robots to solve increasingly challenging real-world tasks. These approaches often rely on demonstrations and reproduce the behavior shown. Unexpected changes in the environment may require using different behaviors to achieve the same effect, for instance to reach and grasp an object in changing clutter. An emerging paradigm addressing this robustness issue is to learn a diverse set of successful behaviors for a given task, from which a robot can select the most suitable policy when faced with a new environment. In this paper, we explore a novel realization of this vision by learning a generative model over policies. Rather than learning a single policy, or a small fixed repertoire, our generative model for policies compactly encodes an unbounded number of policies and allows novel controller variants to be sampled. Leveraging our generative policy network, a robot can sample novel behaviors until it finds one that works for a new environment. We demonstrate this idea with an application of robust ball-throwing in the presence of obstacles. We show that this approach achieves a greater diversity of behaviors than an existing evolutionary approach, while maintaining good efficacy of sampled behaviors, allowing a Baxter robot to hit targets more often when ball throwing in the presence of obstacles.

preprint2015arXiv

Multi-objective analysis of computational models

Computational models are of increasing complexity and their behavior may in particular emerge from the interaction of different parts. Studying such models becomes then more and more difficult and there is a need for methods and tools supporting this process. Multi-objective evolutionary algorithms generate a set of trade-off solutions instead of a single optimal solution. The availability of a set of solutions that have the specificity to be optimal relative to carefully chosen objectives allows to perform data mining in order to better understand model features and regularities. We review the corresponding work, propose a unifying framework, and highlight its potential use. Typical questions that such a methodology allows to address are the following: what are the most critical parameters of the model? What are the relations between the parameters and the objectives? What are the typical behaviors of the model? Two examples are provided to illustrate the capabilities of the methodology. The features of a flapping-wing robot are thus evaluated to find out its speed-energy relation, together with the criticality of its parameters. A neurocomputational model of the Basal Ganglia brain nuclei is then considered and its most salient features according to this methodology are presented and discussed.

preprint2013arXiv

Crossing the Reality Gap: a Short Introduction to the Transferability Approach

In robotics, gradient-free optimization algorithms (e.g. evolutionary algorithms) are often used only in simulation because they require the evaluation of many candidate solutions. Nevertheless, solutions obtained in simulation often do not work well on the real device. The transferability approach aims at crossing this gap between simulation and reality by \emph{making the optimization algorithm aware of the limits of the simulation}. In the present paper, we first describe the transferability function, that maps solution descriptors to a score representing how well a simulator matches the reality. We then show that this function can be learned using a regression algorithm and a few experiments with the real devices. Our results are supported by an extensive study of the reality gap for a simple quadruped robot whose control parameters are optimized. In particular, we mapped the whole search space in reality and in simulation to understand the differences between the fitness landscapes.