Source author record

Roland Siegwart

Roland Siegwart appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Robotics Computer Vision Machine Learning Systems and Control eess.SY Multiagent Systems

Catalog footprint

What is connected

70works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2023arXiv

maplab 2.0 -- A Modular and Multi-Modal Mapping Framework

Integration of multiple sensor modalities and deep learning into Simultaneous Localization And Mapping (SLAM) systems are areas of significant interest in current research. Multi-modality is a stepping stone towards achieving robustness in challenging environments and interoperability of heterogeneous multi-robot systems with varying sensor setups. With maplab 2.0, we provide a versatile open-source platform that facilitates developing, testing, and integrating new modules and features into a fully-fledged SLAM system. Through extensive experiments, we show that maplab 2.0's accuracy is comparable to the state-of-the-art on the HILTI 2021 benchmark. Additionally, we showcase the flexibility of our system with three use cases: i) large-scale (approx. 10 km) multi-robot multi-session (23 missions) mapping, ii) integration of non-visual landmarks, and iii) incorporating a semantic object-based loop closure module into the mapping framework. The code is available open-source at https://github.com/ethz-asl/maplab.

preprint2023arXiv

Obstacle avoidance using raycasting and Riemannian Motion Policies at kHz rates for MAVs

In this paper, we present a novel method for using Riemannian Motion Policies on volumetric maps, shown in the example of obstacle avoidance for Micro Aerial Vehicles (MAVs). While sampling or optimization-based planners are widely used for obstacle avoidance with volumetric maps, they are computationally expensive and often have inflexible monolithic architectures. Riemannian Motion Policies are a modular, parallelizable, and efficient navigation paradigm but are challenging to use with the widely used voxel-based environment representations. We propose using GPU raycasting and a large number of concurrent policies to provide direct obstacle avoidance using Riemannian Motion Policies in voxelized maps without the need for smoothing or pre-processing of the map. Additionally, we present how the same method can directly plan on LiDAR scans without the need for an intermediate map. We show how this reactive approach compares favorably to traditional planning methods and is able to plan using thousands of rays at kilohertz rates. We demonstrate the planner successfully on a real MAV for static and dynamic obstacles. The presented planner is made available as an open-source software package.

preprint2022arXiv

Autonomous Teamed Exploration of Subterranean Environments using Legged and Aerial Robots

This paper presents a novel strategy for autonomous teamed exploration of subterranean environments using legged and aerial robots. Tailored to the fact that subterranean settings, such as cave networks and underground mines, often involve complex, large-scale and multi-branched topologies, while wireless communication within them can be particularly challenging, this work is structured around the synergy of an onboard exploration path planner that allows for resilient long-term autonomy, and a multi-robot coordination framework. The onboard path planner is unified across legged and flying robots and enables navigation in environments with steep slopes, and diverse geometries. When a communication link is available, each robot of the team shares submaps to a centralized location where a multi-robot coordination framework identifies global frontiers of the exploration space to inform each system about where it should re-position to best continue its mission. The strategy is verified through a field deployment inside an underground mine in Switzerland using a legged and a flying robot collectively exploring for 45 min, as well as a longer simulation study with three systems.

preprint2022arXiv

Building an Aerial-Ground Robotics System for Precision Farming: An Adaptable Solution

The application of autonomous robots in agriculture is gaining increasing popularity thanks to the high impact it may have on food security, sustainability, resource use efficiency, reduction of chemical treatments, and the optimization of human effort and yield. With this vision, the Flourish research project aimed to develop an adaptable robotic solution for precision farming that combines the aerial survey capabilities of small autonomous unmanned aerial vehicles (UAVs) with targeted intervention performed by multi-purpose unmanned ground vehicles (UGVs). This paper presents an overview of the scientific and technological advances and outcomes obtained in the project. We introduce multi-spectral perception algorithms and aerial and ground-based systems developed for monitoring crop density, weed pressure, crop nitrogen nutrition status, and to accurately classify and locate weeds. We then introduce the navigation and mapping systems tailored to our robots in the agricultural environment, as well as the modules for collaborative mapping. We finally present the ground intervention hardware, software solutions, and interfaces we implemented and tested in different field conditions and with different crops. We describe a real use case in which a UAV collaborates with a UGV to monitor the field and to perform selective spraying without human intervention.

preprint2022arXiv

CERBERUS: Autonomous Legged and Aerial Robotic Exploration in the Tunnel and Urban Circuits of the DARPA Subterranean Challenge

Autonomous exploration of subterranean environments constitutes a major frontier for robotic systems as underground settings present key challenges that can render robot autonomy hard to achieve. This has motivated the DARPA Subterranean Challenge, where teams of robots search for objects of interest in various underground environments. In response, the CERBERUS system-of-systems is presented as a unified strategy towards subterranean exploration using legged and flying robots. As primary robots, ANYmal quadruped systems are deployed considering their endurance and potential to traverse challenging terrain. For aerial robots, both conventional and collision-tolerant multirotors are utilized to explore spaces too narrow or otherwise unreachable by ground systems. Anticipating degraded sensing conditions, a complementary multi-modal sensor fusion approach utilizing camera, LiDAR, and inertial data for resilient robot pose estimation is proposed. Individual robot pose estimates are refined by a centralized multi-robot map optimization approach to improve the reported location accuracy of detected objects of interest in the DARPA-defined coordinate frame. Furthermore, a unified exploration path planning policy is presented to facilitate the autonomous operation of both legged and aerial robots in complex underground networks. Finally, to enable communication between the robots and the base station, CERBERUS utilizes a ground rover with a high-gain antenna and an optical fiber connection to the base station, alongside breadcrumbing of wireless nodes by our legged robots. We report results from the CERBERUS system-of-systems deployment at the DARPA Subterranean Challenge Tunnel and Urban Circuits, along with the current limitations and the lessons learned for the benefit of the community.

preprint2022arXiv

Closed-Loop Next-Best-View Planning for Target-Driven Grasping

Picking a specific object from clutter is an essential component of many manipulation tasks. Partial observations often require the robot to collect additional views of the scene before attempting a grasp. This paper proposes a closed-loop next-best-view planner that drives exploration based on occluded object parts. By continuously predicting grasps from an up-to-date scene reconstruction, our policy can decide online to finalize a grasp execution or to adapt the robot's trajectory for further exploration. We show that our reactive approach decreases execution times without loss of grasp success rates compared to common camera placements and handles situations where the fixed baselines fail. Video and code are available at https://github.com/ethz-asl/active_grasp.

preprint2022arXiv

Collaborative Robot Mapping using Spectral Graph Analysis

In this paper, we deal with the problem of creating globally consistent pose graphs in a centralized multi-robot SLAM framework. For each robot to act autonomously, individual onboard pose estimates and maps are maintained, which are then communicated to a central server to build an optimized global map. However, inconsistencies between onboard and server estimates can occur due to onboard odometry drift or failure. Furthermore, robots do not benefit from the collaborative map if the server provides no feedback in a computationally tractable and bandwidth-efficient manner. Motivated by this challenge, this paper proposes a novel collaborative mapping framework to enable accurate global mapping among robots and server. In particular, structural differences between robot and server graphs are exploited at different spatial scales using graph spectral analysis to generate necessary constraints for the individual robot pose graphs. The proposed approach is thoroughly analyzed and validated using several real-world multi-robot field deployments where we show improvements of the onboard system up to 90%.

preprint2022arXiv

Descriptellation: Deep Learned Constellation Descriptors

Current descriptors for global localization often struggle under vast viewpoint or appearance changes. One possible improvement is the addition of topological information on semantic objects. However, handcrafted topological descriptors are hard to tune and not robust to environmental noise, drastic perspective changes, object occlusion or misdetections. To solve this problem, we formulate a learning-based approach by modelling semantically meaningful object constellations as graphs and using Deep Graph Convolution Networks to map a constellation to a descriptor. We demonstrate the effectiveness of our Deep Learned Constellation Descriptor (Descriptellation) on two real-world datasets. Although Descriptellation is trained on randomly generated simulation datasets, it shows good generalization abilities on real-world datasets. Descriptellation also outperforms state-of-the-art and handcrafted constellation descriptors for global localization, and is robust to different types of noise. The code is publicly available at https://github.com/ethz-asl/Descriptellation.

preprint2022arXiv

Embodied Active Domain Adaptation for Semantic Segmentation via Informative Path Planning

This work presents an embodied agent that can adapt its semantic segmentation network to new indoor environments in a fully autonomous way. Because semantic segmentation networks fail to generalize well to unseen environments, the agent collects images of the new environment which are then used for self-supervised domain adaptation. We formulate this as an informative path planning problem, and present a novel information gain that leverages uncertainty extracted from the semantic model to safely collect relevant data. As domain adaptation progresses, these uncertainties change over time and the rapid learning feedback of our system drives the agent to collect different data. Experiments show that our method adapts to new environments faster and with higher final performance compared to an exploration objective, and can successfully be deployed to real-world environments on physical robots.

preprint2022arXiv

Energy Tank-Based Policies for Robust Aerial Physical Interaction with Moving Objects

Although manipulation capabilities of aerial robots greatly improved in the last decade, only few works addressed the problem of aerial physical interaction with dynamic environments, proposing strongly model-based approaches. However, in real scenarios, modeling the environment with high accuracy is often impossible. In this work we aim at developing a control framework for OMAVs for reliable physical interaction tasks with articulated and movable objects in the presence of possibly unforeseen disturbances, and without relying on an accurate model of the environment. Inspired by previous applications of energy-based controllers for physical interaction, we propose a passivity-based impedance and wrench tracking controller in combination with a momentum-based wrench estimator. This is combined with an energy-tank framework to guarantee the stability of the system, while energy and power flow-based adaptation policies are deployed to enable safe interaction with any type of passive environment. The control framework provides formal guarantees of stability, which is validated in practice considering the challenging task of pushing a cart of unknown mass, moving on a surface of unknown friction, as well as subjected to unknown disturbances. For this scenario, we present, evaluate and discuss three different policies.

preprint2022arXiv

Fast and Compute-efficient Sampling-based Local Exploration Planning via Distribution Learning

Exploration is a fundamental problem in robotics. While sampling-based planners have shown high performance, they are oftentimes compute intensive and can exhibit high variance. To this end, we propose to directly learn the underlying distribution of informative views based on the spatial context in the robot's map. We further explore a variety of methods to also learn the information gain. We show in thorough experimental evaluation that our proposed system improves exploration performance by up to 28% over classical methods, and find that learning the gains in addition to the sampling distribution can provide favorable performance vs. compute trade-offs for compute-constrained systems. We demonstrate in simulation and on a low-cost mobile robot that our system generalizes well to varying environments.

preprint2022arXiv

Human-State-Aware Controller for a Tethered Aerial Robot Guiding a Human by Physical Interaction

With the rapid development of Aerial Physical Interaction, the possibility to have aerial robots physically interacting with humans is attracting a growing interest. In one of our previous works, we considered one of the first systems in which a human is physically connected to an aerial vehicle by a cable. There, we developed a compliant controller that allows the robot to pull the human toward a desired position using forces only as an indirect communication-channel. However, this controller is based on the robot-state only, which makes the system not adaptable to the human behavior, and in particular to their walking speed. This reduces the effectiveness and comfort of the guidance when the human is still far from the desired point. In this paper, we formally analyze the problem and propose a human-state-aware controller that includes a human`s velocity feedback. We theoretically prove and experimentally show that this method provides a more consistent guiding force which enhances the guiding experience.

preprint2022arXiv

Learning Variable Impedance Control for Aerial Sliding on Uneven Heterogeneous Surfaces by Proprioceptive and Tactile Sensing

The recent development of novel aerial vehicles capable of physically interacting with the environment leads to new applications such as contact-based inspection. These tasks require the robotic system to exchange forces with partially-known environments, which may contain uncertainties including unknown spatially-varying friction properties and discontinuous variations of the surface geometry. Finding a control strategy that is robust against these environmental uncertainties remains an open challenge. This paper presents a learning-based adaptive control strategy for aerial sliding tasks. In particular, the gains of a standard impedance controller are adjusted in real-time by a policy based on the current control signals, proprioceptive measurements, and tactile sensing. This policy is trained in simulation with simplified actuator dynamics in a student-teacher learning setup. The real-world performance of the proposed approach is verified using a tilt-arm omnidirectional flying vehicle. The proposed controller structure combines data-driven and model-based control methods, enabling our approach to successfully transfer directly and without adaptation from simulation to the real platform. Compared to fine-tuned state of the art interaction control methods we achieve reduced tracking error and improved disturbance rejection.

preprint2022arXiv

MPC with Learned Residual Dynamics with Application on Omnidirectional MAVs

The growing field of aerial manipulation often relies on fully actuated or omnidirectional micro aerial vehicles (OMAVs) which can apply arbitrary forces and torques while in contact with the environment. Control methods are usually based on model-free approaches, separating a high-level wrench controller from an actuator allocation. If necessary, disturbances are rejected by online disturbance observers. However, while being general, this approach often produces sub-optimal control commands and cannot incorporate constraints given by the platform design. We present two model-based approaches to control OMAVs for the task of trajectory tracking while rejecting disturbances. The first one optimizes wrench commands and compensates model errors by a model learned from experimental data. The second one optimizes low-level actuator commands, allowing to exploit an allocation nullspace and to consider constraints given by the actuator hardware. The efficacy and real-time feasibility of both approaches is shown and evaluated in real-world experiments.

preprint2022arXiv

NavDreams: Towards Camera-Only RL Navigation Among Humans

Autonomously navigating a robot in everyday crowded spaces requires solving complex perception and planning challenges. When using only monocular image sensor data as input, classical two-dimensional planning approaches cannot be used. While images present a significant challenge when it comes to perception and planning, they also allow capturing potentially important details, such as complex geometry, body movement, and other visual cues. In order to successfully solve the navigation task from only images, algorithms must be able to model the scene and its dynamics using only this channel of information. We investigate whether the world model concept, which has shown state-of-the-art results for modeling and learning policies in Atari games as well as promising results in 2D LiDAR-based crowd navigation, can also be applied to the camera-based navigation problem. To this end, we create simulated environments where a robot must navigate past static and moving humans without colliding in order to reach its goal. We find that state-of-the-art methods are able to achieve success in solving the navigation problem, and can generate dream-like predictions of future image-sequences which show consistent geometry and moving persons. We are also able to show that policy performance in our high-fidelity sim2real simulation scenario transfers to the real world by testing the policy on a real robot. We make our simulator, models and experiments available at https://github.com/danieldugas/NavDreams.

preprint2022arXiv

Panoptic Multi-TSDFs: a Flexible Representation for Online Multi-resolution Volumetric Mapping and Long-term Dynamic Scene Consistency

For robotic interaction in environments shared with other agents, access to volumetric and semantic maps of the scene is crucial. However, such environments are inevitably subject to long-term changes, which the map needs to account for. We thus propose panoptic multi-TSDFs as a novel representation for multi-resolution volumetric mapping in changing environments. By leveraging high-level information for 3D reconstruction, our proposed system allocates high resolution only where needed. Through reasoning on the object level, semantic consistency over time is achieved. This enables our method to maintain up-to-date reconstructions with high accuracy while improving coverage by incorporating previous data. We show in thorough experimental evaluation that our map can be efficiently constructed, maintained, and queried during online operation, and that the presented approach can operate robustly on real depth sensors using non-optimized panoptic segmentation as input.

preprint2022arXiv

Power-based Safety Layer for Aerial Vehicles in Physical Interaction using Lyapunov Exponents

As the performance of autonomous systems increases, safety concerns arise, especially when operating in non-structured environments. To deal with these concerns, this work presents a safety layer for mechanical systems that detects and responds to unstable dynamics caused by external disturbances. The safety layer is implemented independently and on top of already present nominal controllers, like pose or wrench tracking, and limits power flow when the system's response would lead to instability. This approach is based on the computation of the Largest Lyapunov Exponent (LLE) of the system's error dynamics, which represent a measure of the dynamics' divergence or convergence rate. By actively computing this metric, divergent and possibly dangerous system behaviors can be promptly detected. The LLE is then used in combination with Control Barrier Functions (CBFs) to impose power limit constraints on a jerk controlled system. The proposed architecture is experimentally validated on an Omnidirectional Micro Aerial Vehicle (OMAV) both in free flight and interaction tasks.

preprint2022arXiv

Present and Future of SLAM in Extreme Underground Environments

This paper reports on the state of the art in underground SLAM by discussing different SLAM strategies and results across six teams that participated in the three-year-long SubT competition. In particular, the paper has four main goals. First, we review the algorithms, architectures, and systems adopted by the teams; particular emphasis is put on lidar-centric SLAM solutions (the go-to approach for virtually all teams in the competition), heterogeneous multi-robot operation (including both aerial and ground robots), and real-world underground operation (from the presence of obscurants to the need to handle tight computational constraints). We do not shy away from discussing the dirty details behind the different SubT SLAM systems, which are often omitted from technical papers. Second, we discuss the maturity of the field by highlighting what is possible with the current SLAM systems and what we believe is within reach with some good systems engineering. Third, we outline what we believe are fundamental open problems, that are likely to require further research to break through. Finally, we provide a list of open-source SLAM implementations and datasets that have been produced during the SubT challenge and related efforts, and constitute a useful resource for researchers and practitioners.

preprint2022arXiv

Probabilistic network topology prediction for active planning:An adaptive algorithm and application

This paper tackles the problem of active planning to achieve cooperative localization for multi-robot systems (MRS) under measurement uncertainty in GNSS-limited scenarios. Specifically, we address the issue of accurately predicting the probability of a future connection between two robots equipped with range-based measurement devices. Due to the limited range of the equipped sensors, edges in the network connection topology will be created or destroyed as the robots move with respect to one another. Accurately predicting the future existence of an edge, given imperfect state estimation and noisy actuation, is therefore a challenging task. An adaptive power series expansion (or APSE) algorithm is developed based on current estimates and control candidates. Such an algorithm applies the power series expansion formula of the quadratic positive form in a normal distribution. Finite-term approximation is made to realize the computational tractability. Further analyses are presented to show that the truncation error in the finite-term approximation can be theoretically reduced to a desired threshold by adaptively choosing the summation degree of the power series. Several sufficient conditions are rigorously derived as the selection principles. Finally, extensive simulation results and comparisons, with respect to both single and multi-robot cases, validate that a formally computed and therefore more accurate probability of future topology can help improve the performance of active planning under uncertainty.

preprint2022arXiv

RoBoa: Construction and Evaluation of a Steerable Vine Robot for Search and Rescue Applications

RoBoa is a vine-like search and rescue robot that can explore narrow and cluttered environments such as destroyed buildings. The robot assists rescue teams in finding and communicating with trapped people. It employs the principle of vine robots for locomotion, everting the tip of its tube to move forward. Inside the tube, pneumatic actuators enable lateral movement. The head carries sensors and is mounted outside at the tip of the tube. At the back, a supply box contains the rolled up tube and provides pressurized air, power, computation, as well as an interface for the user to interact with the system. A decentralized control scheme was implemented that reduces the required number of cables and takes care of the low-level control of the pneumatic actuators. The design, characterization, and experimental evaluation of the system and its crucial components is shown. The complete prototype is fully functional and was evaluated in a realistic environment of a collapsed building where the remote-controlled robot was able to repeatedly locate a trapped person after a travel distance of about 10 m.

preprint2022arXiv

SC-Explorer: Incremental 3D Scene Completion for Safe and Efficient Exploration Mapping and Planning

Exploration of unknown environments is a fundamental problem in robotics and an essential component in numerous applications of autonomous systems. A major challenge in exploring unknown environments is that the robot has to plan with the limited information available at each time step. While most current approaches rely on heuristics and assumption to plan paths based on these partial observations, we instead propose a novel way to integrate deep learning into exploration by leveraging 3D scene completion for informed, safe, and interpretable exploration mapping and planning. Our approach, SC-Explorer, combines scene completion using a novel incremental fusion mechanism and a newly proposed hierarchical multi-layer mapping approach, to guarantee safety and efficiency of the robot. We further present an informative path planning method, leveraging the capabilities of our mapping approach and a novel scene-completion-aware information gain. While our method is generally applicable, we evaluate it in the use case of a Micro Aerial Vehicle (MAV). We thoroughly study each component in high-fidelity simulation experiments using only mobile hardware, and show that our method can speed up coverage of an environment by 73% compared to the baselines with only minimal reduction in map accuracy. Even if scene completions are not included in the final map, we show that they can be used to guide the robot to choose more informative paths, speeding up the measurement of the scene with the robot's sensors by 35%. We validate our system on a fully autonomous MAV, showing rapid and reliable scene coverage even in a complex cluttered environment. We make our methods available as open-source.

preprint2022arXiv

See Yourself in Others: Attending Multiple Tasks for Own Failure Detection

Autonomous robots deal with unexpected scenarios in real environments. Given input images, various visual perception tasks can be performed, e.g., semantic segmentation, depth estimation and normal estimation. These different tasks provide rich information for the whole robotic perception system. All tasks have their own characteristics while sharing some latent correlations. However, some of the task predictions may suffer from the unreliability dealing with complex scenes and anomalies. We propose an attention-based failure detection approach by exploiting the correlations among multiple tasks. The proposed framework infers task failures by evaluating the individual prediction, across multiple visual perception tasks for different regions in an image. The formulation of the evaluations is based on an attention network supervised by multi-task uncertainty estimation and their corresponding prediction errors. Our proposed framework generates more accurate estimations of the prediction error for the different task's predictions.

preprint2022arXiv

Semi-automatic 3D Object Keypoint Annotation and Detection for the Masses

Creating computer vision datasets requires careful planning and lots of time and effort. In robotics research, we often have to use standardized objects, such as the YCB object set, for tasks such as object tracking, pose estimation, grasping and manipulation, as there are datasets and pre-learned methods available for these objects. This limits the impact of our research since learning-based computer vision methods can only be used in scenarios that are supported by existing datasets. In this work, we present a full object keypoint tracking toolkit, encompassing the entire process from data collection, labeling, model learning and evaluation. We present a semi-automatic way of collecting and labeling datasets using a wrist mounted camera on a standard robotic arm. Using our toolkit and method, we are able to obtain a working 3D object keypoint detector and go through the whole process of data collection, annotation and learning in just a couple hours of active time.

preprint2022arXiv

SL Sensor: An Open-Source, ROS-Based, Real-Time Structured Light Sensor for High Accuracy Construction Robotic Applications

High accuracy 3D surface information is required for many construction robotics tasks such as automated cement polishing or robotic plaster spraying. However, consumer-grade depth cameras currently found in the market are not accurate enough for these tasks where millimeter (mm)-level accuracy is required. This paper presents SL Sensor, a structured light sensing solution capable of producing high fidelity point clouds at 5 Hz by leveraging on phase shifting profilometry (PSP) codification techniques. The SL Sensor was compared with to two commercial depth cameras - the Azure Kinect and RealSense L515. Experiments showed that the SL Sensor surpasses the two devices in both precision and accuracy for indoor surface reconstruction applications. Furthermore, to demonstrate SL Sensor's ability to be a structured light sensing research platform for robotic applications, a motion compensation strategy was developed that allows the SL Sensor to operate during linear motion when traditional PSP methods only work when the sensor is static. Field experiments show that the SL Sensor is able to produce highly detailed reconstructions of spray plastered surfaces. The robot operating system (ROS)-based software and a sample hardware build of the SL Sensor are made open-source with the objective to make structured light sensing more accessible to the construction robotics community. All documentation and code is available at https://github.com/ethz-asl/sl_sensor/ .

preprint2022arXiv

Team CERBERUS Wins the DARPA Subterranean Challenge: Technical Overview and Lessons Learned

This article presents the CERBERUS robotic system-of-systems, which won the DARPA Subterranean Challenge Final Event in 2021. The Subterranean Challenge was organized by DARPA with the vision to facilitate the novel technologies necessary to reliably explore diverse underground environments despite the grueling challenges they present for robotic autonomy. Due to their geometric complexity, degraded perceptual conditions combined with lack of GPS support, austere navigation conditions, and denied communications, subterranean settings render autonomous operations particularly demanding. In response to this challenge, we developed the CERBERUS system which exploits the synergy of legged and flying robots, coupled with robust control especially for overcoming perilous terrain, multi-modal and multi-robot perception for localization and mapping in conditions of sensor degradation, and resilient autonomy through unified exploration path planning and local motion planning that reflects robot-specific limitations. Based on its ability to explore diverse underground environments and its high-level command and control by a single human supervisor, CERBERUS demonstrated efficient exploration, reliable detection of objects of interest, and accurate mapping. In this article, we report results from both the preliminary runs and the final Prize Round of the DARPA Subterranean Challenge, and discuss highlights and challenges faced, alongside lessons learned for the benefit of the community.

preprint2022arXiv

Towards 6DoF Bilateral Teleoperation of an Omnidirectional Aerial Vehicle for Aerial Physical Interaction

Bilateral teleoperation offers an intriguing solution towards shared autonomy with aerial vehicles in contact-based inspection and manipulation tasks. Omnidirectional aerial robots allow for full pose operations, making them particularly attractive in such tasks. Naturally, the question arises whether standard bilateral teleoperation methodologies are suitable for use with these vehicles. In this work, a fully decoupled 6DoF bilateral teleoperation framework for aerial physical interaction is designed and tested for the first time. The method is based on the well established rate control, recentering and interaction force feedback policy. However, practical experiments evince the difficulty of performing decoupled motions in a single axis only. As such, this work shows that the trivial extension of standard methods is insufficient for omnidirectional teleoperation, due to the operators physical inability to properly decouple all input DoFs. This suggests that further studies on enhanced haptic feedback are necessary.

preprint2021arXiv

Active Interaction Force Control for Contact-Based Inspection with a Fully Actuated Aerial Vehicle

This paper presents and validates active interaction force control and planning for fully actuated and omnidirectional aerial manipulation platforms, with the goal of aerial contact inspection in unstructured environments. We present a variable axis-selective impedance control which integrates direct force control for intentional interaction, using feedback from an on-board force sensor. The control approach aims to reject disturbances in free flight, while handling unintentional interaction, and actively controlling desired interaction forces. A fully actuated and omnidirectional tilt-rotor aerial system is used to show capabilities of the control and planning methods. Experiments demonstrate disturbance rejection, push-and-slide interaction, and force controlled interaction in different flight orientations. The system is validated as a tool for non-destructive testing of concrete infrastructure, and statistical results of

preprint2021arXiv

Dynamic Object Aware LiDAR SLAM based on Automatic Generation of Training Data

Highly dynamic environments, with moving objects such as cars or humans, can pose a performance challenge for LiDAR SLAM systems that assume largely static scenes. To overcome this challenge and support the deployment of robots in real world scenarios, we propose a complete solution for a dynamic object aware LiDAR SLAM algorithm. This is achieved by leveraging a real-time capable neural network that can detect dynamic objects, thus allowing our system to deal with them explicitly. To efficiently generate the necessary training data which is key to our approach, we present a novel end-to-end occupancy grid based pipeline that can automatically label a wide variety of arbitrary dynamic objects. Our solution can thus generalize to different environments without the need for expensive manual labeling and at the same time avoids assumptions about the presence of a predefined set of known objects in the scene. Using this technique, we automatically label over 12000 LiDAR scans collected in an urban environment with a large amount of pedestrians and use this data to train a neural network, achieving an average segmentation IoU of 0.82. We show that explicitly dealing with dynamic objects can improve the LiDAR SLAM odometry performance by 39.6% while yielding maps which better represent the environments. A supplementary video as well as our test data are available online.

preprint2021arXiv

Mesh Manifold based Riemannian Motion Planning for Omnidirectional Micro Aerial Vehicles

This paper presents a novel on-line path planning method that enables aerial robots to interact with surfaces. We present a solution to the problem of finding trajectories that drive a robot towards a surface and move along it. Triangular meshes are used as a surface map representation that is free of fixed discretization and allows for very large workspaces. We propose to leverage planar parametrization methods to obtain a lower-dimensional topologically equivalent representation of the original surface. Furthermore, we interpret the original surface and its lower-dimensional representation as manifold approximations that allow the use of Riemannian Motion Policies (RMPs), resulting in an efficient, versatile, and elegant motion generation framework. We compare against several Rapidly-exploring Random Tree (RRT) planners, a customized CHOMP variant, and the discrete geodesic algorithm. Using extensive simulations on real-world data we show that the proposed planner can reliably plan high-quality near-optimal trajectories at minimal computational cost. The accompanying multimedia attachment demonstrates feasibility on a real OMAV. The obtained paths show less than 10% deviation from the theoretical optimum while facilitating reactive re-planning at kHz refresh rates, enabling flying robots to perform motion planning for interaction with complex surfaces.

preprint2021arXiv

PHASER: a Robust and Correspondence-free Global Pointcloud Registration

We propose PHASER, a correspondence-free global registration of sensor-centric pointclouds that is robust to noise, sparsity, and partial overlaps. Our method can seamlessly handle multimodal information and does not rely on keypoint nor descriptor preprocessing modules. By exploiting properties of Fourier analysis, PHASER operates directly on the sensor's signal, fusing the spectra of multiple channels and computing the 6-DoF transformation based on correlation. Our registration pipeline starts by finding the most likely rotation followed by computing the most likely translation. Both estimates are distributed according to a probability distribution that takes the underlying manifold into account, i.e., a Bingham and Gaussian distribution, respectively. This further allows our approach to consider the periodic-nature of rotations and naturally represent its uncertainty. We extensively compare PHASER against several well-known registration algorithms on both simulated datasets, and real-world data acquired using different sensor configurations. Our results show that PHASER can globally align pointclouds in less than 100ms with an average accuracy of 2cm and 0.5deg, is resilient against noise, and can handle partial overlap.

preprint2021arXiv

Pixel-wise Anomaly Detection in Complex Driving Scenes

The inability of state-of-the-art semantic segmentation methods to detect anomaly instances hinders them from being deployed in safety-critical and complex applications, such as autonomous driving. Recent approaches have focused on either leveraging segmentation uncertainty to identify anomalous areas or re-synthesizing the image from the semantic label map to find dissimilarities with the input image. In this work, we demonstrate that these two methodologies contain complementary information and can be combined to produce robust predictions for anomaly segmentation. We present a pixel-wise anomaly detection framework that uses uncertainty maps to improve over existing re-synthesis methods in finding dissimilarities between the input and generated images. Our approach works as a general framework around already trained segmentation networks, which ensures anomaly detection without compromising segmentation accuracy, while significantly outperforming all similar methods. Top-2 performance across a range of different anomaly datasets shows the robustness of our approach to handling different anomaly instances.

preprint2021arXiv

Points2Vec: Unsupervised Object-level Feature Learning from Point Clouds

Unsupervised representation learning techniques, such as learning word embeddings, have had a significant impact on the field of natural language processing. Similar representation learning techniques have not yet become commonplace in the context of 3D vision. This, despite the fact that the physical 3D spaces have a similar semantic structure to bodies of text: words are surrounded by words that are semantically related, just like objects are surrounded by other objects that are similar in concept and usage. In this work, we exploit this structure in learning semantically meaningful low dimensional vector representations of objects. We learn these vector representations by mining a dataset of scanned 3D spaces using an unsupervised algorithm. We represent objects as point clouds, a flexible and general representation for 3D data, which we encode into a vector representation. We show that using our method to include context increases the ability of a clustering algorithm to distinguish different semantic classes from each other. Furthermore, we show that our algorithm produces continuous and meaningful object embeddings through interpolation experiments.

preprint2021arXiv

Spherical Multi-Modal Place Recognition for Heterogeneous Sensor Systems

In this paper, we propose a robust end-to-end multi-modal pipeline for place recognition where the sensor systems can differ from the map building to the query. Our approach operates directly on images and LiDAR scans without requiring any local feature extraction modules. By projecting the sensor data onto the unit sphere, we learn a multi-modal descriptor of partially overlapping scenes using a spherical convolutional neural network. The employed spherical projection model enables the support of arbitrary LiDAR and camera systems readily without losing information. Loop closure candidates are found using a nearest-neighbor lookup in the embedding space. We tackle the problem of correctly identifying the closest place by correlating the candidates' power spectra, obtaining a confidence value per prospect. Our estimate for the correct place corresponds then to the candidate with the highest confidence. We evaluate our proposal w.r.t. state-of-the-art approaches in place recognition using real-world data acquired using different sensors. Our approach can achieve a recall that is up to 10% and 5% higher than for a LiDAR- and vision-based system, respectively, when the sensor setup differs between model training and deployment. Additionally, our place selection can correctly identify up to 95% matches from the candidate set.

preprint2021arXiv

The Hidden Uncertainty in a Neural Networks Activations

The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data. This work investigates whether this distribution moreover correlates with a model's epistemic uncertainty, thus indicates its ability to generalise to novel inputs. We first empirically verify that epistemic uncertainty can be identified with the surprise, thus the negative log-likelihood, of observing a particular latent representation. Moreover, we demonstrate that the output-conditional distribution of hidden representations also allows quantifying aleatoric uncertainty via the entropy of the predictive distribution. We analyse epistemic and aleatoric uncertainty inferred from the representations of different layers and conclude that deeper layers lead to uncertainty with similar behaviour as established - but computationally more expensive - methods (e.g. deep ensembles). While our approach does not require modifying the training process, we follow prior work and experiment with an additional regularising loss that increases the information in the latent representations. We find that this leads to improved OOD detection of epistemic uncertainty at the cost of ambiguous calibration close to the data distribution. We verify our findings on both classification and regression models.

preprint2021arXiv

Volumetric Grasping Network: Real-time 6 DOF Grasp Detection in Clutter

General robot grasping in clutter requires the ability to synthesize grasps that work for previously unseen objects and that are also robust to physical interactions, such as collisions with other objects in the scene. In this work, we design and train a network that predicts 6 DOF grasps from 3D scene information gathered from an on-board sensor such as a wrist-mounted depth camera. Our proposed Volumetric Grasping Network (VGN) accepts a Truncated Signed Distance Function (TSDF) representation of the scene and directly outputs the predicted grasp quality and the associated gripper orientation and opening width for each voxel in the queried 3D volume. We show that our approach can plan grasps in only 10 ms and is able to clear 92% of the objects in real-world clutter removal experiments without the need for explicit collision checking. The real-time capability opens up the possibility for closed-loop grasp planning, allowing robots to handle disturbances, recover from errors and provide increased robustness. Code is available at https://github.com/ethz-asl/vgn.

Roland Siegwart

What is connected

Connect this record

See the researcher in context

Building this map preview

70 published item(s)

maplab 2.0 -- A Modular and Multi-Modal Mapping Framework

Obstacle avoidance using raycasting and Riemannian Motion Policies at kHz rates for MAVs

Autonomous Teamed Exploration of Subterranean Environments using Legged and Aerial Robots

Building an Aerial-Ground Robotics System for Precision Farming: An Adaptable Solution

CERBERUS: Autonomous Legged and Aerial Robotic Exploration in the Tunnel and Urban Circuits of the DARPA Subterranean Challenge

Closed-Loop Next-Best-View Planning for Target-Driven Grasping

Collaborative Robot Mapping using Spectral Graph Analysis

Descriptellation: Deep Learned Constellation Descriptors

Embodied Active Domain Adaptation for Semantic Segmentation via Informative Path Planning

Energy Tank-Based Policies for Robust Aerial Physical Interaction with Moving Objects

Fast and Compute-efficient Sampling-based Local Exploration Planning via Distribution Learning

Human-State-Aware Controller for a Tethered Aerial Robot Guiding a Human by Physical Interaction

Learning Variable Impedance Control for Aerial Sliding on Uneven Heterogeneous Surfaces by Proprioceptive and Tactile Sensing

MPC with Learned Residual Dynamics with Application on Omnidirectional MAVs

NavDreams: Towards Camera-Only RL Navigation Among Humans

Panoptic Multi-TSDFs: a Flexible Representation for Online Multi-resolution Volumetric Mapping and Long-term Dynamic Scene Consistency

Power-based Safety Layer for Aerial Vehicles in Physical Interaction using Lyapunov Exponents

Present and Future of SLAM in Extreme Underground Environments

Probabilistic network topology prediction for active planning:An adaptive algorithm and application

RoBoa: Construction and Evaluation of a Steerable Vine Robot for Search and Rescue Applications

SC-Explorer: Incremental 3D Scene Completion for Safe and Efficient Exploration Mapping and Planning

See Yourself in Others: Attending Multiple Tasks for Own Failure Detection

Semi-automatic 3D Object Keypoint Annotation and Detection for the Masses

SL Sensor: An Open-Source, ROS-Based, Real-Time Structured Light Sensor for High Accuracy Construction Robotic Applications

Team CERBERUS Wins the DARPA Subterranean Challenge: Technical Overview and Lessons Learned

Towards 6DoF Bilateral Teleoperation of an Omnidirectional Aerial Vehicle for Aerial Physical Interaction

Active Interaction Force Control for Contact-Based Inspection with a Fully Actuated Aerial Vehicle

Dynamic Object Aware LiDAR SLAM based on Automatic Generation of Training Data

Mesh Manifold based Riemannian Motion Planning for Omnidirectional Micro Aerial Vehicles

PHASER: a Robust and Correspondence-free Global Pointcloud Registration

Pixel-wise Anomaly Detection in Complex Driving Scenes

Points2Vec: Unsupervised Object-level Feature Learning from Point Clouds

Spherical Multi-Modal Place Recognition for Heterogeneous Sensor Systems

The Hidden Uncertainty in a Neural Networks Activations

Volumetric Grasping Network: Real-time 6 DOF Grasp Detection in Clutter

Accurate Mapping and Planning for Autonomous Racing

An Efficient Sampling-based Method for Online Informative Path Planning in Unknown Environments

An informative path planning framework for UAV-based terrain monitoring

An Open-Source System for Vision-Based Micro-Aerial Vehicle Mapping, Planning, and Flight in Cluttered Environments

Ascento: A Two-Wheeled Jumping Robot

Deep Learning-based Human Detection for UAVs with Optical and Infrared Cameras: System and Experiments

Deep UAV Localization with Reference View Rendering

Depth Based Semantic Scene Completion with Position Importance Aware Loss

Design and optimal control of a tiltrotor micro aerial vehicle for efficient omnidirectional flight

End-to-End Velocity Estimation For Autonomous Racing

Go Fetch: Mobile Manipulation in Unstructured Environments

IDOL: A Framework for IMU-DVS Odometry using Lines

LCD -- Line Clustering and Description for Place Recognition

Learning Camera Miscalibration Detection

Learning Densities in Feature Space for Reliable Segmentation of Indoor Scenes

Learning dynamics for improving control of overactuated flying systems

Learning Trajectories for Visual-Inertial System Calibration via Model-based Heuristic Deep Reinforcement Learning

Leveraging Stereo-Camera Data for Real-Time Dynamic Obstacle Detection and Tracking

LQR-Assisted Whole-Body Control of a Wheeled Bipedal Robot with Kinematic Loops

MOZARD: Multi-Modal Localization for Autonomous Vehicles in Urban Outdoor Environments

Object Finding in Cluttered Scenes Using Interactive Perception

OneShot Global Localization: Instant LiDAR-Visual Pose Estimation

Precise Robot Localization in Architectural 3D Plans

Towards Efficient Full Pose Omnidirectionality with Overactuated MAVs

Voxgraph: Globally Consistent, Volumetric Mapping using Signed Distance Function Submaps

Whole-Body Control of a Mobile Manipulator using End-to-End Reinforcement Learning

An Omnidirectional Aerial Manipulation Platform for Contact-Based Inspection

OREOS: Oriented Recognition of 3D Point Clouds in Outdoor Scenarios

VersaVIS: An Open Versatile Multi-Camera Visual-Inertial Sensor Suite

Redundant Perception and State Estimation for Reliable Autonomous Racing

A Primer on the Differential Calculus of 3D Orientations

Collaborative Object Transportation Using MAVs via Passive Force Control

Distributed Infrastructure Inspection Path Planning subject to Time Constraints

Online Informative Path Planning for Active Classification on UAVs

Online Informative Path Planning for Active Classification Using UAVs