Source author record

Wolfgang Merkt

Wolfgang Merkt appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record.

Robotics, eess.SY, Systems and Control, Artificial Intelligence, Machine Learning, math.OC

Catalog footprint

What is connected

15 works

6 topics

4 close collaborators

preprint, 2022, arXiv

A Feasibility-Driven Approach to Control-Limited DDP

Differential dynamic programming (DDP) is a direct single shooting method for trajectory optimization. Its efficiency derives from the exploitation of temporal structure (inherent to optimal control problems) and explicit roll-out/integration of the system dynamics. However, it suffers from numerical instability and, when compared to direct multiple shooting methods, it has limited initialization options (allows initialization of controls, but not of states) and lacks proper handling of control constraints. In this work, we tackle these issues with a feasibility-driven approach that regulates the dynamic feasibility during the numerical optimization and ensures control limits. Our feasibility search emulates the numerical resolution of a direct multiple shooting problem with only dynamics constraints. We show that our approach (named BOX-FDDP) has better numerical convergence than BOX-DDP+ (a single shooting method), and that its convergence rate and runtime performance are competitive with state-of-the-art direct transcription formulations solved using the interior point and active set algorithms available in KNITRO. We further show that BOX-FDDP decreases the dynamic feasibility error monotonically, as in state-of-the-art nonlinear programming algorithms. We demonstrate the benefits of our approach by generating complex and athletic motions for quadruped and humanoid robots. Finally, we highlight that BOX-FDDP is suitable for model predictive control in legged robots.

preprint, 2022, arXiv

A Passive Navigation Planning Algorithm for Collision-free Control of Mobile Robots

Path planning and collision avoidance are challenging in complex and highly variable environments due to the limited horizon of events. In the literature, there are multiple model- and learning-based approaches that require significant computational resources to be deployed effectively and may have limited generality. We propose a planning algorithm based on a globally stable passive controller that can plan smooth trajectories using limited computational resources in challenging environmental conditions. The architecture combines the recently proposed fractal impedance controller with elastic bands and regions of finite time invariance. As the method is based on an impedance controller, it can also be used directly as a force/torque controller. We validated our method in simulation to analyse the ability of interactive navigation in challenging concave domains via the issuing of via-points, and its robustness to low bandwidth feedback. A swarm simulation using 11 agents validated the scalability of the proposed method. We have performed hardware experiments on a holonomic wheeled platform validating smoothness and robustness of interaction with dynamic agents (i.e., humans and robots). The computational complexity of the proposed local planner enables deployment with low-power micro-controllers, lowering the energy consumption compared to other methods that rely upon numeric optimisation.

preprint, 2022, arXiv

Agile Maneuvers in Legged Robots: a Predictive Control Approach

Planning and execution of agile locomotion maneuvers have been a longstanding challenge in legged robotics. It requires deriving motion plans and local feedback policies in real-time to handle the nonholonomy of the kinetic momenta. To achieve this, we propose a hybrid predictive controller that considers the robot's actuation limits and full-body dynamics. It combines the feedback policies with tactile information to locally predict future actions. It converges within a few milliseconds thanks to a feasibility-driven approach. Our predictive controller enables ANYmal robots to generate agile maneuvers in realistic scenarios. A crucial element is to track the local feedback policies as, in contrast to whole-body control, they achieve the desired angular momentum. To the best of our knowledge, our predictive controller is the first to handle actuation limits, generate agile locomotion maneuvers, and execute optimal feedback policies for low-level torque control without the use of a separate whole-body controller.

preprint, 2022, arXiv

Motion Planning in Dynamic Environments Using Context-Aware Human Trajectory Prediction

Over the years, the separate fields of motion planning, mapping, and human trajectory prediction have advanced considerably. However, the literature is still sparse in providing practical frameworks that enable mobile manipulators to perform whole-body movements and account for the predicted motion of moving obstacles. Previous optimisation-based motion planning approaches that use distance fields have suffered from the high computational cost required to update the environment representation. We demonstrate that GPU-accelerated predicted composite distance fields significantly reduce the computation time compared to calculating distance fields from scratch. We integrate this technique with a complete motion planning and perception framework that accounts for the predicted motion of humans in dynamic environments, enabling reactive and pre-emptive motion planning that incorporates predicted motions. To achieve this, we propose and implement a novel human trajectory prediction method that combines intention recognition with trajectory optimisation-based motion planning. We validate our resultant framework on a real-world Toyota Human Support Robot (HSR) using live RGB-D sensor data from the onboard camera. In addition to providing analysis on a publicly available dataset, we release the Oxford Indoor Human Motion (Oxford-IHM) dataset and demonstrate state-of-the-art performance in human trajectory prediction. The Oxford-IHM dataset is a human trajectory prediction dataset in which people walk between regions of interest in an indoor environment. Both static and robot-mounted RGB-D cameras observe the people while they are tracked with a motion-capture system.

preprint, 2022, arXiv

Next Steps: Learning a Disentangled Gait Representation for Versatile Quadruped Locomotion

Quadruped locomotion is rapidly maturing to a degree where robots now routinely traverse a variety of unstructured terrains. However, while gaits can typically be varied by selecting from a range of pre-computed styles, current planners are unable to vary key gait parameters continuously while the robot is in motion. The synthesis, on-the-fly, of gaits with unexpected operational characteristics or even the blending of dynamic manoeuvres lies beyond the capabilities of the current state-of-the-art. In this work we address this limitation by learning a latent space capturing the key stance phases of a particular gait, via a generative model trained on a single trot style. This encourages disentanglement such that application of a drive signal to a single dimension of the latent state induces holistic plans synthesising a continuous variety of trot styles. In fact, properties of this drive signal map directly to gait parameters such as cadence, footstep height and full stance duration. The use of a generative model facilitates the detection and mitigation of disturbances to provide a versatile and robust planning framework. We evaluate our approach on a real ANYmal quadruped robot and demonstrate that our method achieves a continuous blend of dynamic trot styles whilst being robust and reactive to external perturbations.

preprint, 2022, arXiv

Simultaneous Scene Reconstruction and Whole-Body Motion Planning for Safe Operation in Dynamic Environments

Recent work has demonstrated real-time mapping and reconstruction from dense perception, while motion planning based on distance fields has been shown to achieve fast, collision-free motion synthesis with good convergence properties. However, demonstration of a fully integrated system that can safely re-plan in unknown environments, in the presence of static and dynamic obstacles, has remained an open challenge. In this work, we first study the impact that signed and unsigned distance fields have on optimisation convergence, and the resultant error cost in trajectory optimisation problems in 2D path planning, arm manipulator motion planning, and whole-body loco-manipulation planning. We further analyse the performance of three state-of-the-art approaches to generating distance fields (Voxblox, Fiesta, and GPU-Voxels) for use in real-time environment reconstruction. Finally, we use our findings to construct a practical hybrid mapping and motion planning system which uses GPU-Voxels and GPMP2 to perform receding-horizon whole-body motion planning that can smoothly avoid moving obstacles in 3D space using live sensor data. Our results are validated in simulation and on a real-world Toyota Human Support Robot (HSR).

preprint, 2022, arXiv

Where Should I Look? Optimised Gaze Control for Whole-Body Collision Avoidance in Dynamic Environments

As robots operate in increasingly complex and dynamic environments, fast motion re-planning has become a widely explored area of research. In a real-world deployment, we often lack the ability to fully observe the environment at all times, giving rise to the challenge of determining how to best perceive the environment given a continuously updated motion plan. We provide the first investigation into a 'smart' controller for gaze control with the objective of providing effective perception of the environment for obstacle avoidance and motion planning in dynamic and unknown environments. We detail the novel problem of determining the best head camera behaviour for mobile robots when constrained by a trajectory. Furthermore, we propose a greedy optimisation-based solution that uses a combination of voxelised rewards and motion primitives. We demonstrate that our method outperforms the benchmark methods in 2D and 3D environments, both in the ability to explore the local surroundings and in the success rate of finding collision-free trajectories: our method is shown to provide 7.4x better map exploration while consistently achieving a higher success rate for generating collision-free trajectories. We verify our findings on a physical Toyota Human Support Robot (HSR) using a GPU-accelerated perception framework.

preprint, 2021, arXiv

CPG-ACTOR: Reinforcement Learning for Central Pattern Generators

Central Pattern Generators (CPGs) have several properties desirable for locomotion: they generate smooth trajectories, are robust to perturbations and are simple to implement. Although conceptually promising, we argue that the full potential of CPGs has so far been limited by insufficient sensory-feedback information. This paper proposes a new methodology that allows tuning CPG controllers through gradient-based optimization in a Reinforcement Learning (RL) setting. To the best of our knowledge, this is the first time CPGs have been trained in conjunction with a Multilayer Perceptron (MLP) network in a Deep-RL context. In particular, we show how CPGs can directly be integrated as the Actor in an Actor-Critic formulation. Additionally, we demonstrate how this change permits us to integrate highly non-linear feedback directly from sensory perception to reshape the oscillators' dynamics. Our results on a locomotion task using a single-leg hopper demonstrate that explicitly using the CPG as the Actor rather than as part of the environment results in a significant increase in the reward gained over time (6x more) compared with previous approaches. Furthermore, we show that our method without feedback reproduces results similar to prior work with feedback. Finally, we demonstrate how our closed-loop CPG progressively improves the hopping behaviour for longer training epochs relying only on basic reward functions.

preprint, 2021, arXiv

Predicted Composite Signed-Distance Fields for Real-Time Motion Planning in Dynamic Environments

We present a novel framework for motion planning in dynamic environments that accounts for the predicted trajectories of moving objects in the scene. We explore the use of composite signed-distance fields in motion planning and detail how they can be used to generate signed-distance fields (SDFs) in real-time to incorporate predicted obstacle motions. We benchmark our approach of using composite SDFs against performing exact SDF calculations on the workspace occupancy grid. Our proposed technique generates predictions substantially faster and typically exhibits an 81-97% reduction in time for subsequent predictions. We integrate our framework with GPMP2 to demonstrate a full implementation of our approach in real-time, enabling a 7-DoF Panda arm to smoothly avoid a moving robot.

preprint, 2021, arXiv

Sparsity-Inducing Optimal Control via Differential Dynamic Programming

Optimal control is a popular approach to synthesize highly dynamic motion. Commonly, $L_2$ regularization is used on the control inputs in order to minimize the energy used and to ensure smoothness of the control inputs. However, for some systems, such as satellites, the control needs to be applied in sparse bursts due to how the propulsion system operates. In this paper, we study approaches to induce sparsity in optimal control solutions, namely via smooth $L_1$ and Huber regularization penalties. We apply these loss terms to state-of-the-art DDP-based solvers to create a family of sparsity-inducing optimal control methods. We analyze and compare the effect of the different losses on inducing sparsity, their numerical conditioning, their impact on convergence, and discuss hyperparameter settings. We demonstrate our method in simulation and hardware experiments on canonical dynamics systems, control of satellites, and the NASA Valkyrie humanoid robot. We provide an implementation of our method and all examples for reproducibility on GitHub.
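
As a rough illustration of the penalties named in this abstract, the sketch below compares a quadratic ($L_2$) control cost with a Huber penalty and one common smooth approximation of $L_1$. The abstract does not give the paper's exact formulas, so the function names, the pseudo-$L_1$ form sqrt(u^2 + eps^2) - eps, and the parameters `delta` and `eps` are assumptions chosen for illustration only.

```python
import math


def l2_penalty(u):
    """Quadratic control cost: grows fast for large |u|, flat near zero."""
    return 0.5 * u * u


def huber_penalty(u, delta=1.0):
    """Huber penalty: quadratic for |u| <= delta, linear beyond it.

    `delta` is a hypothetical threshold parameter for this sketch.
    """
    a = abs(u)
    if a <= delta:
        return 0.5 * u * u
    return delta * (a - 0.5 * delta)


def smooth_l1_penalty(u, eps=1e-3):
    """One smooth approximation of |u| (an assumed form, not the paper's).

    It is differentiable at u = 0 and approaches |u| - eps for large |u|.
    """
    return math.sqrt(u * u + eps * eps) - eps


if __name__ == "__main__":
    # Compare how each penalty charges small and large control inputs.
    for u in (0.01, 0.5, 2.0):
        print(u, l2_penalty(u), huber_penalty(u), smooth_l1_penalty(u))
```

Relative to the quadratic cost, both alternatives charge small non-zero controls more heavily and large controls less heavily, which is the usual intuition for why such penalties favour solutions with many near-zero control inputs.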

preprint, 2020, arXiv

Crocoddyl: An Efficient and Versatile Framework for Multi-Contact Optimal Control

We introduce Crocoddyl (Contact RObot COntrol by Differential DYnamic Library), an open-source framework tailored for efficient multi-contact optimal control. Crocoddyl efficiently computes the state trajectory and the control policy for a given predefined sequence of contacts. Its efficiency is due to the use of sparse analytical derivatives, exploitation of the problem structure, and data sharing. It employs differential geometry to properly describe the state of any geometrical system, e.g. floating-base systems. Additionally, we propose a novel optimal control algorithm called Feasibility-driven Differential Dynamic Programming (FDDP). Our method does not add extra decision variables, which often increase the computation time per iteration due to factorization. FDDP shows a stronger globalization strategy than classical Differential Dynamic Programming (DDP) algorithms. Concretely, we propose two modifications to the classical DDP algorithm. First, the backward pass accepts infeasible state-control trajectories. Second, the rollout keeps the gaps open during the early "exploratory" iterations (as expected in multiple-shooting methods with only equality constraints). We showcase the performance of our framework using different tasks. With our method, we can compute highly-dynamic maneuvers (e.g. jumping, front-flip) within a few milliseconds.

preprint, 2020, arXiv

Learning Whole-body Motor Skills for Humanoids

This paper presents a hierarchical framework for Deep Reinforcement Learning that acquires motor skills for a variety of push recovery and balancing behaviors, i.e., ankle, hip, foot tilting, and stepping strategies. The policy is trained in a physics simulator with a realistic setting of the robot model and low-level impedance control, which makes it easy to transfer the learned skills to real robots. The advantage over traditional methods is the integration of the high-level planner and feedback control in a single coherent policy network, which is generic for learning versatile balancing and recovery motions against unknown perturbations at arbitrary locations (e.g., legs, torso). Furthermore, the proposed framework allows the policy to be learned quickly by many state-of-the-art learning algorithms. Compared with studies of preprogrammed, special-purpose controllers in the literature, the self-learned skills are comparable in terms of disturbance rejection, with the additional advantage of producing a wide range of adaptive, versatile and robust behaviors.

preprint, 2020, arXiv

Modeling and Control of a Hybrid Wheeled Jumping Robot

In this paper, we study a wheeled robot with a prismatic extension joint. This allows the robot to build up momentum to perform jumps over obstacles and to swing up to the upright position after the loss of balance. We propose a template model for the class of such two-wheeled jumping robots. This model can be considered as the simplest wheeled-legged system. We provide an analytical derivation of the system dynamics which we use inside a model predictive controller (MPC). We study the behavior of the model and demonstrate highly dynamic motions such as swing-up and jumping. Furthermore, these motions are discovered through optimization from first principles. We evaluate the controller on a variety of tasks and uneven terrains in a simulator.

preprint, 2020, arXiv

Optimizing Dynamic Trajectories for Robustness to Disturbances Using Polytopic Projections

This paper focuses on robustness to disturbance forces and uncertain payloads. We present a novel formulation to optimize the robustness of dynamic trajectories. A straightforward transcription of this formulation into a nonlinear programming problem is not tractable for state-of-the-art solvers, but it is possible to overcome this complication by exploiting the structure induced by the kinematics of the robot. The non-trivial transcription proposed allows trajectory optimization frameworks to converge to highly robust dynamic solutions. We demonstrate the results of our approach using a quadruped robot equipped with a manipulator.

preprint, 2016, arXiv

Scaling Sampling-based Motion Planning to Humanoid Robots

Planning balanced and collision-free motion for humanoid robots is non-trivial, especially when they are operated in complex environments, such as reaching targets behind obstacles or through narrow passages. We propose a method that allows us to apply existing sampling-based algorithms to plan trajectories for humanoids by utilizing a customized state space representation, biased sampling strategies, and a steering function based on a robust inverse kinematics solver. Our approach requires no prior offline computation, thus one can easily transfer the work to new robot platforms. We tested the proposed method solving practical reaching tasks on a 38 degrees-of-freedom humanoid robot, NASA Valkyrie, showing that our method is able to generate valid motion plans that can be executed on advanced full-size humanoid robots. We also present a benchmark between different motion planning algorithms evaluated on a variety of reaching motion problems. This allows us to find suitable algorithms for solving humanoid motion planning problems, and to identify the limitations of these algorithms.
