Source author record

Jacopo Panerati

Jacopo Panerati appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Robotics, Machine Learning, eess.SY, Multiagent Systems, Systems and Control, Artificial Intelligence, Neural and Evolutionary Computing

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators


Preprint, 2022, arXiv

safe-control-gym: a Unified Benchmark Suite for Safe Learning-based Control and Reinforcement Learning in Robotics

In recent years, both reinforcement learning and learning-based control -- as well as the study of their safety, which is crucial for deployment in real-world robots -- have gained significant traction. However, to adequately gauge the progress and applicability of new results, we need the tools to equitably compare the approaches proposed by the controls and reinforcement learning communities. Here, we propose a new open-source benchmark suite, called safe-control-gym, supporting both model-based and data-based control techniques. We provide implementations for three dynamic systems -- the cart-pole and the 1D and 2D quadrotors -- and two control tasks -- stabilization and trajectory tracking. We propose to extend OpenAI's Gym API -- the de facto standard in reinforcement learning research -- with the ability to (i) specify (and query) symbolic dynamics, (ii) specify (and query) constraints, and (iii) (repeatably) inject simulated disturbances in the control inputs, state measurements, and inertial properties. To demonstrate our proposal and in an attempt to bring research communities closer together, we show how to use safe-control-gym to quantitatively compare the control performance, data efficiency, and safety of multiple approaches from the fields of traditional control, learning-based control, and reinforcement learning.
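
The idea of repeatable disturbance injection combined with a queryable constraint can be illustrated with a minimal sketch. This is not safe-control-gym's API: the class `DisturbedPointMass`, its parameters, and the `rollout` helper are hypothetical names chosen for illustration, and the dynamics are a toy 1D point mass rather than any of the systems listed in the abstract. The sketch only assumes that disturbances are drawn from a seeded random generator so that two runs with the same seed see the same noise.

```python
import random


class DisturbedPointMass:
    """Toy 1D point mass with input noise and a position constraint (hypothetical)."""

    def __init__(self, seed, noise_std=0.1, x_limit=1.0, dt=0.05):
        self.rng = random.Random(seed)  # seeded so the disturbances are repeatable
        self.noise_std = noise_std
        self.x_limit = x_limit
        self.dt = dt
        self.x = 0.0  # position
        self.v = 0.0  # velocity

    def step(self, action):
        # Inject a Gaussian disturbance into the control input.
        disturbed = action + self.rng.gauss(0.0, self.noise_std)
        self.v += disturbed * self.dt
        self.x += self.v * self.dt
        # Report whether the constraint |x| <= x_limit is violated.
        violated = abs(self.x) > self.x_limit
        return (self.x, self.v), violated


def rollout(seed, steps=50):
    """Apply a constant unit input and record (state, violated) at every step."""
    env = DisturbedPointMass(seed)
    return [env.step(1.0) for _ in range(steps)]
```

Because the generator is seeded, `rollout(0)` returns the same trajectory every time it is called, which is the property that makes comparisons between controllers fair under identical disturbances.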

Preprint, 2021, arXiv

Learning-based Bias Correction for Time Difference of Arrival Ultra-wideband Localization of Resource-constrained Mobile Robots

Accurate indoor localization is a crucial enabling technology for many robotics applications, from warehouse management to monitoring tasks. Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization is a promising lightweight, low-cost solution that can scale to a large number of devices -- making it especially suited for resource-constrained multi-robot applications. However, the localization accuracy of standard, commercially available UWB radios is often insufficient due to significant measurement bias and outliers. In this letter, we address these issues by proposing a robust UWB TDOA localization framework comprising (i) learning-based bias correction and (ii) M-estimation-based robust filtering to handle outliers. The key properties of our approach are that (i) the learned biases generalize to different UWB anchor setups and (ii) the approach is computationally efficient enough to run on resource-constrained hardware. We demonstrate our approach on a Crazyflie nano-quadcopter. Experimental results show that the proposed localization framework, relying only on the onboard IMU and UWB, provides an average of 42.08 percent localization error reduction (in three different anchor setups) compared to the baseline approach without bias compensation. We also show autonomous trajectory tracking on a quadcopter using our UWB TDOA localization approach.
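
To give a feel for what M-estimation does with outliers, here is a minimal sketch that estimates a constant from noisy scalar measurements using Huber weights and iteratively reweighted least squares. It is an illustration of the general technique only, not the filter described in the abstract: the Huber loss, the function names `huber_weight` and `robust_mean`, and the threshold `delta` are assumptions made for this example.

```python
def huber_weight(residual, delta):
    """Huber weight: 1 for small residuals, delta/|r| for large ones."""
    magnitude = abs(residual)
    return 1.0 if magnitude <= delta else delta / magnitude


def robust_mean(measurements, delta=1.0, iterations=20):
    """Estimate a constant by iteratively reweighted least squares."""
    # Start from the (upper) median so the first residuals are sensible.
    estimate = sorted(measurements)[len(measurements) // 2]
    for _ in range(iterations):
        weights = [huber_weight(z - estimate, delta) for z in measurements]
        estimate = sum(w * z for w, z in zip(weights, measurements)) / sum(weights)
    return estimate


# Five readings near 1.0 and one gross outlier.
readings = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0]
plain = sum(readings) / len(readings)
robust = robust_mean(readings, delta=0.5)
```

The outlier drags the plain average far from the inliers, while the reweighted estimate stays close to them because the outlier's weight shrinks in proportion to its residual.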

Preprint, 2020, arXiv

An Adversarial Approach to Private Flocking in Mobile Robot Teams

Privacy is an important facet of defence against adversaries. In this letter, we introduce the problem of private flocking. We consider a team of mobile robots flocking in the presence of an adversary, who is able to observe all robots' trajectories, and who is interested in identifying the leader. We present a method that generates private flocking controllers that hide the identity of the leader robot. Our approach towards privacy leverages a data-driven adversarial co-optimization scheme. We design a mechanism that optimizes flocking control parameters, such that leader inference is hindered. As the flocking performance improves, we successively train an adversarial discriminator that tries to infer the identity of the leader robot. To evaluate the performance of our co-optimization scheme, we investigate different classes of reference trajectories. Although it is reasonable to assume that there is an inherent trade-off between flocking performance and privacy, our results demonstrate that we are able to achieve high flocking performance and simultaneously reduce the risk of revealing the leader.

Preprint, 2020, arXiv

Learning-based Bias Correction for Ultra-wideband Localization of Resource-constrained Mobile Robots

Accurate indoor localization is a crucial enabling technology for many robotics applications, from warehouse management to monitoring tasks. Ultra-wideband (UWB) ranging is a promising solution which is low-cost, lightweight, and computationally inexpensive compared to alternative state-of-the-art approaches such as simultaneous localization and mapping, making it especially suited for resource-constrained aerial robots. Many commercially-available ultra-wideband radios, however, provide inaccurate, biased range measurements. In this article, we propose a bias correction framework compatible with both two-way ranging and time difference of arrival ultra-wideband localization. Our method comprises two steps: (i) statistical outlier rejection and (ii) a learning-based bias correction. This approach is scalable and frugal enough to be deployed on-board a nano-quadcopter's microcontroller. Previous research mostly focused on two-way ranging bias correction and has been implemented neither in closed loop nor on resource-constrained robots. Experimental results show that, using our approach, the localization error is reduced by ~18.5% and 48% (for TWR and TDoA, respectively), and a quadcopter can accurately track trajectories with position information from UWB only.

Preprint, 2020, arXiv

Multi-Vehicle Mixed-Reality Reinforcement Learning for Autonomous Multi-Lane Driving

Autonomous driving promises to transform road transport. Multi-vehicle and multi-lane scenarios, however, present unique challenges due to constrained navigation and unpredictable vehicle interactions. Learning-based methods, such as deep reinforcement learning, are emerging as a promising approach to automatically design intelligent driving policies that can cope with these challenges. Yet, the process of safely learning multi-vehicle driving behaviours is hard: while collisions (and their near-avoidance) are essential to the learning process, directly executing immature policies on autonomous vehicles raises considerable safety concerns. In this article, we present a safe and efficient framework that enables the learning of driving policies for autonomous vehicles operating in a shared workspace, where the absence of collisions cannot be guaranteed. Key to our learning procedure is a sim2real approach that uses real-world online policy adaptation in a mixed-reality setup, where other vehicles and static obstacles exist in the virtual domain. This allows us to perform safe learning by simulating (and learning from) collisions between the learning agent(s) and other objects in virtual reality. Our results demonstrate that, after only a few runs in mixed-reality, collisions are significantly reduced.

Preprint, 2018, arXiv

Stop, Think, and Roll: Online Gain Optimization for Resilient Multi-robot Topologies

Efficient networking of many-robot systems is considered one of the grand challenges of robotics. In this article, we address the problem of achieving resilient, dynamic interconnection topologies in multi-robot systems. In scenarios in which the overall network topology is constantly changing, we aim at avoiding the onset of single points of failure, particularly situations in which the failure of a single robot causes the loss of connectivity for the overall network. We propose a method based on the combination of multiple control objectives and we introduce an online distributed optimization strategy that computes the optimal choice of control parameters for each robot. This ensures that the connectivity of the multi-robot system is not only preserved but also made more resilient to failures, as the network topology evolves. We provide simulation results, as well as experiments with real robots to validate theoretical findings and demonstrate the portability to robotic hardware.
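
The notion of a single point of failure can be made concrete with a small graph check. The sketch below finds articulation points, that is, robots whose removal would split the communication graph, using a standard depth-first search. It is a centralized illustration only and not the online distributed optimization the abstract describes; the function name `articulation_points` and the adjacency-dictionary input format are assumptions for this example, and the graph is assumed to be simple and undirected.

```python
def articulation_points(adjacency):
    """Return the nodes whose removal increases the number of connected components."""
    discovery, low, cut = {}, {}, set()
    counter = [0]

    def dfs(node, parent):
        discovery[node] = low[node] = counter[0]
        counter[0] += 1
        children = 0
        for neighbor in adjacency[node]:
            if neighbor == parent:
                continue
            if neighbor in discovery:
                # Back edge: the subtree can reach an earlier node.
                low[node] = min(low[node], discovery[neighbor])
            else:
                children += 1
                dfs(neighbor, node)
                low[node] = min(low[node], low[neighbor])
                # A non-root node is critical if a child cannot reach above it.
                if parent is not None and low[neighbor] >= discovery[node]:
                    cut.add(node)
        # The root is critical only if it has more than one DFS child.
        if parent is None and children > 1:
            cut.add(node)

    for start in adjacency:
        if start not in discovery:
            dfs(start, None)
    return cut


# Three robots in a line: the middle one is a single point of failure.
line = {0: [1], 1: [0, 2], 2: [1]}
# Four robots in a ring: losing any one robot keeps the rest connected.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

In the line topology the check returns the middle robot, while the ring has no articulation points, which matches the intuition that added redundancy in the topology removes single points of failure.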
