Source author record

Marco Pavone

Marco Pavone appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Robotics; Systems and Control; eess.SY; Machine Learning; math.OC; Artificial Intelligence; Computer Vision; Multiagent Systems; Human-Computer Interaction; Computer Science and Game Theory; Cryptography and Security; cs.CY; Distributed, Parallel, and Cluster Computing; econ.TH; Performance

Catalog footprint

What is connected

79 works

15 topics

4 close collaborators


preprint, 2026, arXiv

Agile Tradespace Exploration for Space Rendezvous Mission Design via Transformers

Spacecraft rendezvous enables on-orbit servicing, debris removal, and crewed docking, forming the foundation for a scalable space economy. Designing such missions requires rapid exploration of the tradespace between control cost and flight time across multiple candidate targets. However, multi-objective optimization in this setting is challenging, as the underlying constraints are often nonconvex, and mission designers must balance accuracy (e.g., solving the full problem) with efficiency (e.g., convex relaxations), slowing iteration and limiting design agility. To address these challenges, this paper proposes an AI-powered framework that enables agile and generalized rendezvous mission design. Given the orbital information of the target spacecraft, boundary conditions of the servicer, and a range of flight times, a transformer model generates a set of near-Pareto optimal trajectories across varying flight times in a single parallelized inference step, thereby enabling rapid mission trade studies. The model is further extended to accommodate variable flight times and perturbed orbital dynamics, supporting realistic multi-objective trade-offs. Validation on chance-constrained rendezvous problems in Earth orbits with passive safety constraints demonstrates that the model generalizes across both flight times and dynamics, consistently providing high-quality initial guesses that converge to superior solutions in fewer iterations. Moreover, the framework efficiently approximates the Pareto front, achieving runtimes comparable to convex relaxation by exploiting parallelized inference. Together, these results position the proposed framework as a practical surrogate for nonconvex trajectory generation and mark an important step toward AI-driven trajectory design for accelerating preliminary mission planning in real-world rendezvous applications.
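
To make the tradespace idea concrete, the minimal Python sketch below filters a set of candidate trajectories down to the non-dominated (Pareto-optimal) subset over two minimized objectives, control cost and flight time. The function name and the sample numbers are hypothetical illustrations and are not taken from the paper; the transformer model itself is not reproduced here.

```python
def pareto_front(candidates):
    """Return the non-dominated (cost, time) pairs, sorted by flight time.

    A candidate is dominated if another candidate is no worse in both
    objectives and strictly better in at least one (both are minimized).
    """
    front = []
    for cost, time in candidates:
        dominated = any(
            (c <= cost and t <= time) and (c < cost or t < time)
            for c, t in candidates
        )
        if not dominated and (cost, time) not in front:
            front.append((cost, time))
    return sorted(front, key=lambda pair: pair[1])


# Hypothetical (control cost, flight time) pairs in arbitrary units.
candidates = [(10.0, 1.0), (7.0, 2.0), (8.0, 2.5), (5.0, 4.0), (5.5, 5.0)]
front = pareto_front(candidates)
```

In this toy data, `(8.0, 2.5)` and `(5.5, 5.0)` are dropped because another candidate is both cheaper and faster.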

preprint, 2026, arXiv

Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail

End-to-end architectures trained via imitation learning have advanced autonomous driving by scaling model size and data, yet performance remains brittle in safety-critical long-tail scenarios where supervision is sparse and causal understanding is limited. We introduce Alpamayo-R1 (AR1), a vision-language-action model (VLA) that integrates Chain of Causation reasoning with trajectory planning for complex driving scenarios. Our approach features three key innovations: (1) the Chain of Causation (CoC) dataset, built through a hybrid auto-labeling and human-in-the-loop pipeline producing decision-grounded, causally linked reasoning traces aligned with driving behaviors; (2) a modular VLA architecture combining Cosmos-Reason, a vision-language model pre-trained for Physical AI, with a diffusion-based trajectory decoder that generates dynamically feasible trajectories in real time; (3) a multi-stage training strategy using supervised fine-tuning to elicit reasoning and reinforcement learning (RL) to enforce reasoning-action consistency and optimize reasoning quality. AR1 achieves up to a 12% improvement in planning accuracy on challenging cases compared to a trajectory-only baseline, with a 35% reduction in close encounter rate in closed-loop simulation. RL post-training improves reasoning quality by 45% and reasoning-action consistency by 37%. Model scaling from 0.5B to 7B parameters shows consistent improvements. On-vehicle road tests confirm real-time performance (99 ms latency) and successful urban deployment. By bridging interpretable reasoning with precise control, AR1 demonstrates a practical path towards Level 4 autonomous driving. Model weights are available at https://huggingface.co/nvidia/Alpamayo-R1-10B with inference code at https://github.com/NVlabs/alpamayo.

preprint, 2025, arXiv

Counterfactual VLA: Self-Reflective Vision-Language-Action Model with Adaptive Reasoning

Recent reasoning-augmented Vision-Language-Action (VLA) models have improved the interpretability of end-to-end autonomous driving by generating intermediate reasoning traces. Yet these models primarily describe what they perceive and intend to do, rarely questioning whether their planned actions are safe or appropriate. This work introduces Counterfactual VLA (CF-VLA), a self-reflective VLA framework that enables the model to reason about and revise its planned actions before execution. CF-VLA first generates time-segmented meta-actions that summarize driving intent, and then performs counterfactual reasoning conditioned on both the meta-actions and the visual context. This step simulates potential outcomes, identifies unsafe behaviors, and outputs corrected meta-actions that guide the final trajectory generation. To efficiently obtain such self-reflective capabilities, we propose a rollout-filter-label pipeline that mines high-value scenes from a base (non-counterfactual) VLA's rollouts and labels counterfactual reasoning traces for subsequent training rounds. Experiments on large-scale driving datasets show that CF-VLA improves trajectory accuracy by up to 17.6%, enhances safety metrics by 20.5%, and exhibits adaptive thinking: it only enables counterfactual reasoning in challenging scenarios. By transforming reasoning traces from one-shot descriptions to causal self-correction signals, CF-VLA takes a step toward self-reflective autonomous driving agents that learn to think before they act.

preprint, 2025, arXiv

Reproducibility in the Control of Autonomous Mobility-on-Demand Systems

Autonomous Mobility-on-Demand (AMoD) systems, powered by advances in robotics, control, and Machine Learning (ML), offer a promising paradigm for future urban transportation. AMoD offers fast and personalized travel services by leveraging centralized control of autonomous vehicle fleets to optimize operations and enhance service performance. However, the rapid growth of this field has outpaced the development of standardized practices for evaluating and reporting results, leading to significant challenges in reproducibility. As AMoD control algorithms become increasingly complex and data-driven, a lack of transparency in modeling assumptions, experimental setups, and algorithmic implementation hinders scientific progress and undermines confidence in the results. This paper presents a systematic study of reproducibility in AMoD research. We identify key components across the research pipeline, spanning system modeling, control problems, simulation design, algorithm specification, and evaluation, and analyze common sources of irreproducibility. We survey prevalent practices in the literature, highlight gaps, and propose a structured framework to assess and improve reproducibility. Specifically, concrete guidelines are offered, along with a "reproducibility checklist", to support future work in achieving replicable, comparable, and extensible results. While focused on AMoD, the principles and practices we advocate generalize to a broader class of cyber-physical systems that rely on networked autonomy and data-driven control. This work aims to lay the foundation for a more transparent and reproducible research culture in the design and deployment of intelligent mobility systems.

preprint, 2024, arXiv

Sample-Efficient Safety Assurances using Conformal Prediction

When deploying machine learning models in high-stakes robotics applications, the ability to detect unsafe situations is crucial. Early warning systems can provide alerts when an unsafe situation is imminent (in the absence of corrective action). To reliably improve safety, these warning systems should have a provable false negative rate; i.e. of the situations that are unsafe, fewer than $ε$ will occur without an alert. In this work, we present a framework that combines a statistical inference technique known as conformal prediction with a simulator of robot/environment dynamics, in order to tune warning systems to provably achieve an $ε$ false negative rate using as few as $1/ε$ data points. We apply our framework to a driver warning system and a robotic grasping application, and empirically demonstrate guaranteed false negative rate while also observing low false detection (positive) rate.
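
The guarantee described above can be illustrated with a generic conformal-style quantile rule. The sketch below is an assumption-laden illustration, not the paper's procedure: it assumes each unsafe calibration episode is summarized by one scalar alarm score (higher means more dangerous), that scores are exchangeable with future unsafe episodes and have no ties, and that an alert fires when the score reaches the threshold. With $n$ calibration scores and $k = \lfloor (n+1)ε \rfloor$, picking the $k$-th smallest score as the threshold bounds the probability that a new unsafe episode scores below it by $k/(n+1) \le ε$. All function names are hypothetical.

```python
import math


def conformal_alert_threshold(unsafe_scores, epsilon):
    """Pick an alert threshold from alarm scores of known-unsafe episodes.

    Under exchangeability (an assumption of this sketch), a new unsafe
    episode falls strictly below the k-th smallest calibration score with
    probability at most k / (n + 1), where k = floor((n + 1) * epsilon).
    """
    n = len(unsafe_scores)
    k = math.floor((n + 1) * epsilon)
    if k == 0:
        # Too few calibration points for this epsilon: always alert.
        return -math.inf
    return sorted(unsafe_scores)[k - 1]


def should_alert(score, threshold):
    """Alert whenever the alarm score reaches the threshold."""
    return score >= threshold
```

With fewer than roughly $1/ε$ calibration points, $k$ is zero and the only threshold that keeps the bound is the trivial one that always alerts, which is consistent with the sample-size figure quoted in the abstract.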

preprint, 2024, arXiv

Transformers for Trajectory Optimization with Application to Spacecraft Rendezvous

Reliable and efficient trajectory optimization methods are a fundamental need for autonomous dynamical systems, effectively enabling applications including rocket landing, hypersonic reentry, spacecraft rendezvous, and docking. Within such safety-critical application areas, the complexity of the emerging trajectory optimization problems has motivated the application of AI-based techniques to enhance the performance of traditional approaches. However, current AI-based methods either attempt to fully replace traditional control algorithms, thus lacking constraint satisfaction guarantees and incurring expensive simulation, or aim to solely imitate the behavior of traditional methods via supervised learning. To address these limitations, this paper proposes the Autonomous Rendezvous Transformer (ART) and assesses the capability of modern generative models to solve complex trajectory optimization problems, both from a forecasting and control standpoint. Specifically, this work assesses the capabilities of Transformers to (i) learn near-optimal policies from previously collected data, and (ii) warm-start a sequential optimizer for the solution of non-convex optimal control problems, thus guaranteeing hard constraint satisfaction. From a forecasting perspective, results highlight how ART outperforms other learning-based architectures at predicting known fuel-optimal trajectories. From a control perspective, empirical analyses show how policies learned through Transformers are able to generate near-optimal warm-starts, achieving trajectories that are (i) more fuel-efficient, (ii) obtained in fewer sequential optimizer iterations, and (iii) computed with an overall runtime comparable to benchmarks based on convex optimization.

preprint, 2022, arXiv

A Simple and Efficient Sampling-based Algorithm for General Reachability Analysis

In this work, we analyze an efficient sampling-based algorithm for general-purpose reachability analysis, which remains a notoriously challenging problem with applications ranging from neural network verification to safety analysis of dynamical systems. By sampling inputs, evaluating their images in the true reachable set, and taking their $ε$-padded convex hull as a set estimator, this algorithm applies to general problem settings and is simple to implement. Our main contribution is the derivation of asymptotic and finite-sample accuracy guarantees using random set theory. This analysis informs algorithmic design to obtain an $ε$-close reachable set approximation with high probability, provides insights into which reachability problems are most challenging, and motivates safety-critical applications of the technique. On a neural network verification task, we show that this approach is more accurate and significantly faster than prior work. Informed by our analysis, we also design a robust model predictive controller that we demonstrate in hardware experiments.
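
The estimator described above (sample inputs, map them through the system, take the $ε$-padded convex hull of the outputs) can be sketched in two dimensions with the standard library alone. This is a minimal illustration under stated assumptions, not the authors' implementation: the map `f`, the input sampler, the sample count, and all function names are hypothetical, and the hull is built with Andrew's monotone chain algorithm.

```python
import math
import random


def cross(o, a, b):
    """2D cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    denom = dx * dx + dy * dy
    t = 0.0
    if denom > 0:
        t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / denom))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def in_padded_hull(p, hull, eps):
    """True if p lies within distance eps of the convex hull."""
    n = len(hull)
    if n == 0:
        return False
    if n >= 3 and all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n)):
        return True  # p is inside the unpadded hull
    return min(segment_distance(p, hull[i], hull[(i + 1) % n]) for i in range(n)) <= eps


def estimate_reachable_set(f, sample_input, n_samples, rng):
    """Hull of sampled outputs; pad it with in_padded_hull(..., eps)."""
    return convex_hull([f(sample_input(rng)) for _ in range(n_samples)])


# Hypothetical system: a linear map applied to inputs from the unit square.
def f(x):
    return (x[0] + x[1], x[0] - x[1])


def sample_unit_square(rng):
    return (rng.random(), rng.random())


hull = estimate_reachable_set(f, sample_unit_square, 500, random.Random(0))
```

For this toy map the true reachable set is the diamond with vertices (0, 0), (1, 1), (2, 0), and (1, -1), so the sampled hull always lies inside it and the padding compensates for the unsampled boundary region.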

preprint, 2022, arXiv

A Unified View of SDP-based Neural Network Verification through Completely Positive Programming

Verifying that input-output relationships of a neural network conform to prescribed operational specifications is a key enabler towards deploying these networks in safety-critical applications. Semidefinite programming (SDP)-based approaches to Rectified Linear Unit (ReLU) network verification transcribe this problem into an optimization problem, where the accuracy of any such formulation reflects the level of fidelity in how the neural network computation is represented, as well as the relaxations of intractable constraints. While the literature contains much progress on improving the tightness of SDP formulations while maintaining tractability, comparatively little work has been devoted to the other extreme, i.e., how to most accurately capture the original verification problem before SDP relaxation. In this work, we develop an exact, convex formulation of verification as a completely positive program (CPP), and provide analysis showing that our formulation is minimal -- the removal of any constraint fundamentally misrepresents the neural network computation. We leverage our formulation to provide a unifying view of existing approaches, and give insight into the source of large relaxation gaps observed in some cases.

preprint, 2022, arXiv

Analysis of Theoretical and Numerical Properties of Sequential Convex Programming for Continuous-Time Optimal Control

Sequential Convex Programming (SCP) has recently gained significant popularity as an effective method for solving optimal control problems and has been successfully applied in several different domains. However, the theoretical analysis of SCP has received comparatively limited attention, and it is often restricted to discrete-time formulations. In this paper, we present a unifying theoretical analysis of a fairly general class of SCP procedures for continuous-time optimal control problems. In addition to the derivation of convergence guarantees in a continuous-time setting, our analysis reveals two new numerical and practical insights. First, we show how one can more easily account for manifold-type constraints, which are a defining feature of optimal control of mechanical systems. Second, we show how our theoretical analysis can be leveraged to accelerate SCP-based optimal control methods by infusing techniques from indirect optimal control.

preprint, 2022, arXiv

Balancing Fairness and Efficiency in Traffic Routing via Interpolated Traffic Assignment

System optimum (SO) routing, wherein the total travel time of all users is minimized, is a holy grail for transportation authorities. However, SO routing may discriminate against users who incur much larger travel times than others to achieve high system efficiency, i.e., low total travel times. To address the inherent unfairness of SO routing, we study the $β$-fair SO problem whose goal is to minimize the total travel time while guaranteeing a ${β\geq 1}$ level of unfairness, which specifies the maximum possible ratio between the travel times of different users with shared origins and destinations. To obtain feasible solutions to the $β$-fair SO problem while achieving high system efficiency, we develop a new convex program, the Interpolated Traffic Assignment Problem (I-TAP), which interpolates between a fairness-promoting and an efficiency-promoting traffic-assignment objective. We evaluate the efficacy of I-TAP through theoretical bounds on the total system travel time and level of unfairness in terms of its interpolation parameter, as well as present a numerical comparison between I-TAP and a state-of-the-art algorithm on a range of transportation networks. The numerical results indicate that our approach is faster by several orders of magnitude as compared to the benchmark algorithm, while achieving higher system efficiency for all desirable levels of unfairness. We further leverage the structure of I-TAP to develop two pricing mechanisms to collectively enforce the I-TAP solution in the presence of selfish homogeneous and heterogeneous users, respectively, that independently choose routes to minimize their own travel costs. We mention that this is the first study of pricing in the context of fair routing for general road networks (as opposed to, e.g., parallel road networks).
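
The unfairness level defined above, the maximum ratio between travel times of users who share an origin and destination, is simple to compute for a given routing outcome. The sketch below assumes strictly positive travel times grouped by O-D pair; the function names and the sample data are hypothetical, and the I-TAP convex program itself is not reproduced.

```python
def unfairness(travel_times_by_od):
    """Largest ratio between travel times of users sharing an O-D pair.

    Assumes every travel time is strictly positive. Returns 1.0 when no
    O-D pair has users with differing travel times.
    """
    worst = 1.0
    for times in travel_times_by_od.values():
        if times:
            worst = max(worst, max(times) / min(times))
    return worst


def is_beta_fair(travel_times_by_od, beta):
    """Check whether a routing outcome meets the beta >= 1 unfairness cap."""
    return unfairness(travel_times_by_od) <= beta


# Hypothetical travel times (minutes) for users on two O-D pairs.
observed = {("A", "B"): [10.0, 12.0, 15.0], ("C", "D"): [20.0, 22.0]}
level = unfairness(observed)
```

Here the A-to-B users determine the unfairness level, since their slowest route takes 1.5 times as long as their fastest.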

preprint, 2022, arXiv

BITS: Bi-level Imitation for Traffic Simulation

Simulation is the key to scaling up validation and verification for robotic systems such as autonomous vehicles. Despite advances in high-fidelity physics and sensor simulation, a critical gap remains in simulating realistic behaviors of road users. This is because, unlike simulating physics and graphics, devising first principle models for human-like behaviors is generally infeasible. In this work, we take a data-driven approach and propose a method that can learn to generate traffic behaviors from real-world driving logs. The method achieves high sample efficiency and behavior diversity by exploiting the bi-level hierarchy of driving behaviors by decoupling the traffic simulation problem into high-level intent inference and low-level driving behavior imitation. The method also incorporates a planning module to obtain stable long-horizon behaviors. We empirically validate our method, named Bi-level Imitation for Traffic Simulation (BITS), with scenarios from two large-scale driving datasets and show that BITS achieves balanced traffic simulation performance in realism, diversity, and long-horizon stability. We also explore ways to evaluate behavior realism and introduce a suite of evaluation metrics for traffic simulation. Finally, as part of our core contributions, we develop and open source a software tool that unifies data formats across different driving datasets and converts scenes from existing datasets into interactive simulation environments. For additional information and videos, see https://sites.google.com/view/nvr-bits2022/home

preprint, 2022, arXiv

Control-oriented meta-learning

Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With both fully-actuated and underactuated nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.

preprint, 2022, arXiv

Coordinated Multi-Agent Pathfinding for Drones and Trucks over Road Networks

We address the problem of routing a team of drones and trucks over large-scale urban road networks. To conserve their limited flight energy, drones can use trucks as temporary modes of transit en route to their own destinations. Such coordination can yield significant savings in total vehicle distance traveled, i.e., truck travel distance and drone flight distance, compared to operating drones and trucks independently. But it comes at the potentially prohibitive computational cost of deciding which trucks and drones should coordinate and when and where it is most beneficial to do so. We tackle this fundamental trade-off by decoupling our overall intractable problem into tractable sub-problems that we solve stage-wise. The first stage solves only for trucks, by computing paths that make them more likely to be useful transit options for drones. The second stage solves only for drones, by routing them over a composite of the road network and the transit network defined by truck paths from the first stage. We design a comprehensive algorithmic framework that frames each stage as a multi-agent path-finding problem and implement two distinct methods for solving them. We evaluate our approach on extensive simulations with up to $100$ agents on the real-world Manhattan road network containing nearly $4500$ vertices and $10000$ edges. Our framework saves more than $50\%$ of vehicle distance traveled compared to independently solving for trucks and drones, and computes solutions for all settings within $5$ minutes on commodity hardware.

preprint, 2022, arXiv

Data-Driven Chance Constrained Control using Kernel Distribution Embeddings

We present a data-driven algorithm for efficiently computing stochastic control policies for general joint chance constrained optimal control problems. Our approach leverages the theory of kernel distribution embeddings, which allows representing expectation operators as inner products in a reproducing kernel Hilbert space. This framework enables approximately reformulating the original problem using a dataset of observed trajectories from the system without imposing prior assumptions on the parameterization of the system dynamics or the structure of the uncertainty. By optimizing over a finite subset of stochastic open-loop control trajectories, we relax the original problem to a linear program over the control parameters that can be efficiently solved using standard convex optimization techniques. We demonstrate our proposed approach in simulation on a system with nonlinear non-Markovian dynamics navigating in a cluttered environment.

preprint, 2022, arXiv

Graph Meta-Reinforcement Learning for Transferable Autonomous Mobility-on-Demand

Autonomous Mobility-on-Demand (AMoD) systems represent an attractive alternative to existing transportation paradigms, currently challenged by urbanization and increasing travel needs. By centrally controlling a fleet of self-driving vehicles, these systems provide mobility service to customers and are currently starting to be deployed in a number of cities around the world. Current learning-based approaches for controlling AMoD systems are limited to the single-city scenario, whereby the service operator is allowed to take an unlimited amount of operational decisions within the same transportation system. However, real-world system operators can hardly afford to fully re-train AMoD controllers for every city they operate in, as this could result in a high number of poor-quality decisions during training, making the single-city strategy a potentially impractical solution. To address these limitations, we propose to formalize the multi-city AMoD problem through the lens of meta-reinforcement learning (meta-RL) and devise an actor-critic algorithm based on recurrent graph neural networks. In our approach, AMoD controllers are explicitly trained such that a small amount of experience within a new city will produce good system performance. Empirically, we show how control policies learned through meta-RL are able to achieve near-optimal performance on unseen cities by learning rapidly adaptable policies, thus making them more robust not only to novel environments, but also to distribution shifts common in real-world operations, such as special events, unexpected congestion, and dynamic pricing schemes.

preprint, 2022, arXiv

Heterogeneous-Agent Trajectory Forecasting Incorporating Class Uncertainty

Reasoning about the future behavior of other agents is critical to safe robot navigation. The multiplicity of plausible futures is further amplified by the uncertainty inherent to agent state estimation from data, including positions, velocities, and semantic class. Forecasting methods, however, typically neglect class uncertainty, conditioning instead only on the agent's most likely class, even though perception models often return full class distributions. To exploit this information, we present HAICU, a method for heterogeneous-agent trajectory forecasting that explicitly incorporates agents' class probabilities. We additionally present PUP, a new challenging real-world autonomous driving dataset, to investigate the impact of Perceptual Uncertainty in Prediction. It contains challenging crowded scenes with unfiltered agent class probabilities that reflect the long-tail of current state-of-the-art perception systems. We demonstrate that incorporating class probabilities in trajectory forecasting significantly improves performance in the face of uncertainty, and enables new forecasting capabilities such as counterfactual predictions.

preprint, 2022, arXiv

Interaction-Dynamics-Aware Perception Zones for Obstacle Detection Safety Evaluation

To enable safe autonomous vehicle (AV) operations, it is critical that an AV's obstacle detection module can reliably detect obstacles that pose a safety threat (i.e., are safety-critical). It is therefore desirable that the evaluation metric for the perception system captures the safety-criticality of objects. Unfortunately, existing perception evaluation metrics tend to make strong assumptions about the objects and ignore the dynamic interactions between agents, and thus do not accurately capture the safety risks in reality. To address these shortcomings, we introduce an interaction-dynamics-aware obstacle detection evaluation metric by accounting for closed-loop dynamic interactions between an ego vehicle and obstacles in the scene. By borrowing existing theory from optimal control theory, namely Hamilton-Jacobi reachability, we present a computationally tractable method for constructing a "safety zone": a region in state space that defines where safety-critical obstacles lie for the purpose of defining safety metrics. Our proposed safety zone is mathematically complete, and can be easily computed to reflect a variety of safety requirements. Using an off-the-shelf detection algorithm from the nuScenes detection challenge leaderboard, we demonstrate that our approach is computationally lightweight, and can better capture safety-critical perception errors than a baseline approach.

preprint, 2022, arXiv

Learning Deep SDF Maps Online for Robot Navigation and Exploration

We propose an algorithm to (i) learn online a deep signed distance function (SDF) with a LiDAR-equipped robot to represent the 3D environment geometry, and (ii) plan collision-free trajectories given this deep learned map. Our algorithm takes a stream of incoming LiDAR scans and continually optimizes a neural network to represent the SDF of the environment around its current vicinity. When the SDF network quality saturates, we cache a copy of the network, along with a learned confidence metric, and initialize a new SDF network to continue mapping new regions of the environment. We then concatenate all the cached local SDFs through a confidence-weighted scheme to give a global SDF for planning. For planning, we make use of a sequential convex model predictive control (MPC) algorithm. The MPC planner optimizes a dynamically feasible trajectory for the robot while enforcing no collisions with obstacles mapped in the global SDF. We show that our online mapping algorithm produces higher-quality maps than existing methods for online SDF training. In the WeBots simulator, we further showcase the combined mapper and planner running online -- navigating autonomously and without collisions in an unknown environment.

preprint, 2022, arXiv

Local Calibration: Metrics and Recalibration

Probabilistic classifiers output confidence scores along with their predictions, and these confidence scores should be calibrated, i.e., they should reflect the reliability of the prediction. Confidence scores that minimize standard metrics such as the expected calibration error (ECE) accurately measure the reliability on average across the entire population. However, it is in general impossible to measure the reliability of an individual prediction. In this work, we propose the local calibration error (LCE) to span the gap between average and individual reliability. For each individual prediction, the LCE measures the average reliability of a set of similar predictions, where similarity is quantified by a kernel function on a pretrained feature space and by a binning scheme over predicted model confidences. We show theoretically that the LCE can be estimated sample-efficiently from data, and empirically find that it reveals miscalibration modes that are more fine-grained than the ECE can detect. Our key result is a novel local recalibration method LoRe, to improve confidence scores for individual predictions and decrease the LCE. Experimentally, we show that our recalibration method produces more accurate confidence scores, which improves downstream fairness and decision making on classification tasks with both image and tabular data.
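
For reference, the population-level baseline mentioned above, the expected calibration error (ECE), is commonly computed by binning predictions by confidence and averaging the gap between accuracy and mean confidence in each bin. The sketch below shows that standard equal-width-bin ECE only; it does not implement the paper's LCE or LoRe, and the function name and the default bin count are choices made for this illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    confidences: predicted confidence in [0, 1] for each prediction.
    correct: 1 (or True) if the prediction was right, else 0 (or False).
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece
```

Because each bin pools many predictions, a low ECE says nothing about any single prediction, which is the gap between average and individual reliability that the abstract describes.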

preprint, 2022, arXiv

Matching with Transfers under Distributional Constraints

We study two-sided many-to-one matching markets with transferable utilities, e.g., labor and rental housing markets, in which money can exchange hands between agents, subject to distributional constraints on the set of feasible allocations. In such markets, we establish the efficiency of equilibrium arrangements, specified by an assignment and transfers between agents on the two sides of the market, and study the conditions on the distributional constraints and agent preferences under which equilibria exist and can be computed efficiently. To this end, we first consider the setting when the number of institutions (e.g., firms in a labor market) is one and show that equilibrium arrangements exist irrespective of the nature of the constraint structure or the agents' preferences. However, equilibrium arrangements may not exist in markets with multiple institutions even when agents on each side have linear (or additively separable) preferences over agents on the other side. Thus, for markets with linear preferences, we study sufficient conditions on the constraint structure that guarantee the existence of equilibria using linear programming duality. Our linear programming approach not only generalizes that of Shapley and Shubik (1971) in the one-to-one matching setting to the many-to-one matching setting under distributional constraints but also provides a method to compute market equilibria efficiently.

preprint, 2022, arXiv

Motron: Multimodal Probabilistic Human Motion Forecasting

Autonomous systems and humans are increasingly sharing the same space. Robots work side by side or even hand in hand with humans to balance each other's limitations. Such cooperative interactions are ever more sophisticated. Thus, the ability to reason not just about a human's center of gravity position, but also its granular motion is an important prerequisite for human-robot interaction. However, many algorithms ignore the multimodal nature of humans or neglect uncertainty in their motion forecasts. We present Motron, a multimodal, probabilistic, graph-structured model that captures human multimodality using probabilistic methods while being able to output deterministic maximum-likelihood motions and corresponding confidence values for each mode. Our model aims to be tightly integrated with the robotic planning-control-interaction loop, outputting physically feasible human motions and being computationally efficient. We demonstrate the performance of our model on several challenging real-world motion forecasting datasets, outperforming a wide array of generative/variational methods while providing state-of-the-art single-output motions if required, in both cases using significantly less computational power than state-of-the-art algorithms.

preprint2022arXiv

Online Learning for Traffic Routing under Unknown Preferences

In transportation networks, users typically choose routes in a decentralized and self-interested manner to minimize their individual travel costs, which, in practice, often results in inefficient overall outcomes for society. As a result, there has been a growing interest in designing road tolling schemes to cope with these efficiency losses and steer users toward a system-efficient traffic pattern. However, the efficacy of road tolling schemes often relies on having access to complete information on users' trip attributes, such as their origin-destination (O-D) travel information and their values of time, which may not be available in practice. Motivated by this practical consideration, we propose an online learning approach to set tolls in a traffic network to drive heterogeneous users with different values of time toward a system-efficient traffic pattern. In particular, we develop a simple yet effective algorithm that adjusts tolls at each time period solely based on the observed aggregate flows on the roads of the network without relying on any additional trip attributes of users, thereby preserving user privacy. In the setting where the O-D pairs and values of time of users are drawn i.i.d. at each period, we show that our approach obtains an expected regret and road capacity violation of $O(\sqrt{T})$, where $T$ is the number of periods over which tolls are updated. Our regret guarantee is relative to an offline oracle that has complete information on users' trip attributes. We further establish an $\Omega(\sqrt{T})$ lower bound on the regret of any algorithm, which establishes that our algorithm is optimal up to constants. Finally, we demonstrate the superior performance of our approach relative to several benchmarks on a real-world transportation network, thereby highlighting its practical applicability.
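
The abstract does not spell out the toll update itself. As a minimal sketch of the kind of rule it describes, the snippet below assumes a projected step that uses only each road's observed aggregate flow and its capacity. The function name `update_tolls` and the `step_size` parameter are hypothetical and are not taken from the paper.

```python
def update_tolls(tolls, flows, capacities, step_size):
    """One illustrative toll update driven only by aggregate road flows.

    Assumption (not from the paper): the toll on a road goes up when the
    observed flow exceeds capacity, goes down otherwise, and never drops
    below zero.
    """
    return [
        max(0.0, toll + step_size * (flow - capacity))
        for toll, flow, capacity in zip(tolls, flows, capacities)
    ]


# Example: the first road is over capacity, the other two are under it.
new_tolls = update_tolls(
    tolls=[1.0, 0.5, 0.1],
    flows=[12.0, 3.0, 0.0],
    capacities=[10.0, 5.0, 5.0],
    step_size=0.1,
)
```

No origin-destination pair or value of time enters the update, which is the privacy property the abstract emphasizes.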

preprint2022arXiv

Private Location Sharing for Decentralized Routing Services

Data-driven methodologies offer many exciting upsides, but they also introduce new challenges, particularly in the realm of user privacy. Specifically, the way data is collected can pose privacy risks to end users. In many routing services, a single entity (e.g., the routing service provider) collects and manages user trajectory data. When it comes to user privacy, these systems have a central point of failure since users have to trust that this entity will not sell or use their data to infer sensitive private information. Unfortunately, in practice many advertising companies offer to buy such data for the sake of targeted advertisements. With this as motivation, we study the problem of using location data for routing services in a privacy-preserving way. Rather than having users report their location to a central operator, we present a protocol in which users participate in a decentralized and privacy-preserving computation to estimate travel times for the roads in the network in a way that no individual's location is ever observed by any other party. The protocol uses the Laplace mechanism in conjunction with secure multi-party computation to ensure that it is cryptographically secure and that its output is differentially private. A natural question is whether privacy necessitates degradation in accuracy or system performance. We show that if a road has sufficiently high capacity, then the travel time estimated by our protocol is provably close to the ground truth travel time. We validate the protocol through numerical experiments which show that using the protocol as a routing service provides privacy guarantees with minimal overhead to user travel time.
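
For readers unfamiliar with the Laplace mechanism named above, here is a minimal sketch of the standard construction: a sum is released with Laplace noise of scale sensitivity/epsilon. It deliberately leaves out the secure multi-party computation layer and assumes a single trusted aggregator, so it is not the paper's protocol. The names `laplace_noise` and `private_sum` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as a scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_sum(values, sensitivity, epsilon, rng=None):
    """Release sum(values) with epsilon-differential privacy.

    `sensitivity` is the most one user's report can change the sum.
    Assumption: a trusted party computes the sum; the paper instead
    performs the computation under secure multi-party computation.
    """
    if rng is None:
        rng = random.Random()
    return sum(values) + laplace_noise(sensitivity / epsilon, rng)


# Example: each user reports 1 if they are on a given road, else 0.
reports = [1, 0, 1, 1, 0, 1]
noisy_count = private_sum(reports, sensitivity=1.0, epsilon=0.5)
```

A smaller `epsilon` means more noise and stronger privacy. The noise scale does not grow with the number of users, which is one informal way to see why estimates on busy roads can remain accurate.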

preprint2022arXiv

Propagating State Uncertainty Through Trajectory Forecasting

Uncertainty pervades the modern robotic autonomy stack, with nearly every component (e.g., sensors, detection, classification, tracking, behavior prediction) producing continuous or discrete probability distributions. Trajectory forecasting, in particular, is surrounded by uncertainty as its inputs are produced by (noisy) upstream perception and its outputs are predictions that are often probabilistic for use in downstream planning. However, most trajectory forecasting methods do not account for upstream uncertainty, instead taking only the most-likely values. As a result, perceptual uncertainties are not propagated through forecasting and predictions are frequently overconfident. To address this, we present a novel method for incorporating perceptual state uncertainty in trajectory forecasting, a key component of which is a new statistical distance-based loss function which encourages predicting uncertainties that better match upstream perception. We evaluate our approach both in illustrative simulations and on large-scale, real-world data, demonstrating its efficacy in propagating perceptual state uncertainty through prediction and producing more calibrated predictions.

preprint2022arXiv

Risk-sensitive safety analysis using Conditional Value-at-Risk

This paper develops a safety analysis method for stochastic systems that is sensitive to the possibility and severity of rare harmful outcomes. We define risk-sensitive safe sets as sub-level sets of the solution to a non-standard optimal control problem, where a random maximum cost is assessed via Conditional Value-at-Risk (CVaR). The objective function represents the maximum extent of constraint violation of the state trajectory, averaged over a given percentage of worst cases. This problem is well-motivated but difficult to solve tractably because the temporal decomposition for CVaR is history-dependent. Our primary theoretical contribution is to derive computationally tractable under-approximations to risk-sensitive safe sets. Our method provides a novel, theoretically guaranteed, parameter-dependent upper bound to the CVaR of a maximum cost without the need to augment the state space. For a fixed parameter value, the solution to only one Markov decision process problem is required to obtain the under-approximations for any family of risk-sensitivity levels. In addition, we propose a second definition for risk-sensitive safe sets and provide a tractable method for their estimation without using a parameter-dependent upper bound. The second definition is expressed in terms of a new coherent risk functional, which is inspired by CVaR. We demonstrate our primary theoretical contribution via numerical examples.
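
The objective described above, a maximum constraint violation averaged over a given fraction of worst cases, can be illustrated with a simple empirical CVaR estimate over sampled trajectories. This is only a sample-based sketch and not the paper's under-approximation method. The function names and the rounding rule (averaging the worst `ceil(alpha * n)` samples) are assumptions made for illustration.

```python
import math


def max_violation(trajectory_violations):
    """Maximum constraint violation along one sampled state trajectory."""
    return max(trajectory_violations)


def empirical_cvar(costs, alpha):
    """Average of the worst ceil(alpha * n) costs, for 0 < alpha <= 1.

    alpha = 1 recovers the plain mean (risk-neutral); as alpha shrinks,
    the estimate approaches the largest observed cost (worst case).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    worst_first = sorted(costs, reverse=True)
    k = math.ceil(alpha * len(worst_first))
    return sum(worst_first[:k]) / k


# Example: per-step violations for three sampled trajectories.
sampled = [[0.0, 0.2, 0.1], [0.5, 1.5, 0.3], [0.0, 0.0, 0.4]]
costs = [max_violation(t) for t in sampled]
risk = empirical_cvar(costs, alpha=0.5)
```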

preprint2022arXiv

Robust Trajectory Prediction against Adversarial Attacks

Trajectory prediction using deep neural networks (DNNs) is an essential component of autonomous driving (AD) systems. However, these methods are vulnerable to adversarial attacks, leading to serious consequences such as collisions. In this work, we identify two key ingredients to defend trajectory prediction models against adversarial attacks: (1) designing effective adversarial training methods and (2) adding domain-specific data augmentation to mitigate the performance degradation on clean data. We demonstrate that our method is able to improve the performance by 46% on adversarial data at the cost of only 3% performance degradation on clean data, compared to the model trained with clean data. Additionally, compared to existing robust methods, our method can improve performance by 21% on adversarial examples and 9% on clean data. Our robust model is evaluated with a planner to study its downstream impacts. We demonstrate that our model can significantly reduce the severe accident rates (e.g., collisions and off-road driving).

preprint2022arXiv

Safe Active Dynamics Learning and Control: A Sequential Exploration-Exploitation Framework

Safe deployment of autonomous robots in diverse scenarios requires agents that are capable of efficiently adapting to new environments while satisfying constraints. In this work, we propose a practical and theoretically-justified approach to maintaining safety in the presence of dynamics uncertainty. Our approach leverages Bayesian meta-learning with last-layer adaptation. The expressiveness of neural-network features trained offline, paired with efficient last-layer online adaptation, enables the derivation of tight confidence sets which contract around the true dynamics as the model adapts online. We exploit these confidence sets to plan trajectories that guarantee the safety of the system. Our approach handles problems with high dynamics uncertainty, where reaching the goal safely is potentially initially infeasible, by first exploring to gather data and reduce uncertainty, before autonomously exploiting the acquired information to safely perform the task. Under reasonable assumptions, we prove that our framework guarantees the high-probability satisfaction of all constraints at all times jointly, i.e. over the total task duration. This theoretical analysis also motivates two regularizers of last-layer meta-learning models that improve online adaptation capabilities as well as performance by reducing the size of the confidence sets. We extensively demonstrate our approach in simulation and on hardware.

preprint2022arXiv

ScePT: Scene-consistent, Policy-based Trajectory Predictions for Planning

Trajectory prediction is a critical functionality of autonomous systems that share environments with uncontrolled agents, one prominent example being self-driving vehicles. Currently, most prediction methods do not enforce scene consistency, i.e., there is a substantial number of self-collisions between predicted trajectories of different agents in the scene. Moreover, many approaches generate individual trajectory predictions per agent instead of joint trajectory predictions of the whole scene, which makes downstream planning difficult. In this work, we present ScePT, a policy planning-based trajectory prediction model that generates accurate, scene-consistent trajectory predictions suitable for autonomous system motion planning. It explicitly enforces scene consistency and learns an agent interaction policy that can be used for conditional prediction. Experiments on multiple real-world pedestrian and autonomous vehicle datasets show that ScePT matches current state-of-the-art prediction accuracy with significantly improved scene consistency. We also demonstrate ScePT's ability to work with a downstream contingency planner.

preprint2022arXiv

Second-Order Sensitivity Analysis for Bilevel Optimization

In this work we derive a second-order approach to bilevel optimization, a type of mathematical programming in which the solution to a parameterized optimization problem (the "lower" problem) is itself to be optimized (in the "upper" problem) as a function of the parameters. Many existing approaches to bilevel optimization employ first-order sensitivity analysis, based on the implicit function theorem (IFT), for the lower problem to derive a gradient of the lower problem solution with respect to its parameters; this IFT gradient is then used in a first-order optimization method for the upper problem. This paper extends this sensitivity analysis to provide second-order derivative information of the lower problem (which we call the IFT Hessian), enabling the usage of faster-converging second-order optimization methods at the upper level. Our analysis shows that (i) much of the computation already used to produce the IFT gradient can be reused for the IFT Hessian, (ii) error bounds derived for the IFT gradient readily apply to the IFT Hessian, and (iii) computing IFT Hessians can significantly reduce overall computation by extracting more information from each lower level solve. We corroborate our findings and demonstrate the broad range of applications of our method by applying it to problem instances of least squares hyperparameter auto-tuning, multi-class SVM auto-tuning, and inverse optimal control.
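
To make the first-order IFT gradient concrete, the sketch below works through a scalar toy problem chosen for illustration; it is not one of the paper's benchmarks. The assumed lower problem is f(z, theta) = 0.5 (z - a)^2 + 0.5 theta z^2, whose minimizer z*(theta) = a / (1 + theta) is known in closed form, so the IFT gradient can be checked against a finite difference. The second-order IFT Hessian that the paper contributes is not shown.

```python
def lower_solution(theta, a):
    """Closed-form minimizer of f(z, theta) = 0.5*(z - a)**2 + 0.5*theta*z**2."""
    return a / (1.0 + theta)


def ift_gradient(theta, a):
    """dz*/dtheta from the implicit function theorem.

    The stationarity condition is g(z, theta) = (z - a) + theta*z = 0, so
    dz*/dtheta = -(dg/dz)^(-1) * dg/dtheta, evaluated at z = z*(theta).
    """
    z_star = lower_solution(theta, a)
    dg_dz = 1.0 + theta   # second derivative of f with respect to z
    dg_dtheta = z_star    # mixed derivative of f with respect to z and theta
    return -dg_dtheta / dg_dz


# Check the IFT gradient against a central finite difference.
theta, a, h = 0.5, 3.0, 1e-6
finite_diff = (lower_solution(theta + h, a) - lower_solution(theta - h, a)) / (2 * h)
analytic = ift_gradient(theta, a)
```

In a real bilevel problem the lower solution comes from a numerical solver rather than a closed form, and the scalar division becomes a linear solve against the lower problem's Hessian.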

preprint2022arXiv

Semi-Supervised Trajectory-Feedback Controller Synthesis for Signal Temporal Logic Specifications

There are spatio-temporal rules that dictate how robots should operate in complex environments, e.g., road rules govern how (self-driving) vehicles should behave on the road. However, seamlessly incorporating such rules into a robot control policy remains challenging especially for real-time applications. In this work, given a desired spatio-temporal specification expressed in the Signal Temporal Logic (STL) language, we propose a semi-supervised controller synthesis technique that is attuned to human-like behaviors while satisfying desired STL specifications. Offline, we synthesize a trajectory-feedback neural network controller via an adversarial training scheme that summarizes past spatio-temporal behaviors when computing controls, and then online, we perform gradient steps to improve specification satisfaction. Central to the offline phase is an imitation-based regularization component that fosters better policy exploration and helps induce naturalistic human behaviors. Our experiments demonstrate that having imitation-based regularization leads to higher qualitative and quantitative performance compared to optimizing an STL objective only as done in prior work. We demonstrate the efficacy of our approach with an illustrative case study and show that our proposed controller outperforms a state-of-the-art shooting method in both performance and computation time.

preprint2022arXiv

Towards Data-Driven Synthesis of Autonomous Vehicle Safety Concepts

As safety-critical autonomous vehicles (AVs) will soon become pervasive in our society, a number of safety concepts for trusted AV deployment have recently been proposed throughout industry and academia. Yet, achieving consensus on an appropriate safety concept is still an elusive task. In this paper, we advocate for the use of Hamilton-Jacobi (HJ) reachability as a unifying mathematical framework for comparing existing safety concepts, and through elements of this framework propose ways to tailor safety concepts (and thus expand their applicability) to scenarios with implicit expectations on agent behavior in a data-driven fashion. Specifically, we show that (i) existing predominant safety concepts can be embedded in the HJ reachability framework, thereby enabling a common language for comparing and contrasting modeling assumptions, and (ii) HJ reachability can serve as an inductive bias to effectively reason, in a learning context, about two critical, yet often overlooked aspects of safety: responsibility and context-dependency.

preprint2022arXiv

Using Spectral Submanifolds for Nonlinear Periodic Control

Very high dimensional nonlinear systems arise in many engineering problems due to semi-discretization of the governing partial differential equations, e.g. through finite element methods. The complexity of these systems presents computational challenges for direct application to automatic control. While model reduction has seen ubiquitous applications in control, the use of nonlinear model reduction methods in this setting remains difficult. The problem lies in preserving the structure of the nonlinear dynamics in the reduced order model for high-fidelity control. In this work, we leverage recent advances in Spectral Submanifold (SSM) theory to enable model reduction under well-defined assumptions for the purpose of efficiently synthesizing feedback controllers.

preprint2021arXiv

Control Barrier Functions for Cyber-Physical Systems and Applications to NMPC

Tractable safety-ensuring algorithms for cyber-physical systems are important in critical applications. Approaches based on Control Barrier Functions assume continuous enforcement, which is not possible in an online fashion. This paper presents two tractable algorithms to ensure forward invariance of discrete-time controlled cyber-physical systems. Both approaches are based on Control Barrier Functions to provide strict mathematical safety guarantees. The first algorithm exploits Lipschitz continuity and formulates the safety condition as a robust program which is subsequently relaxed to a set of affine conditions. The second algorithm is inspired by tube-NMPC and uses an affine Control Barrier Function formulation in conjunction with an auxiliary controller to guarantee safety of the system. We combine an approximate NMPC controller with the second algorithm to guarantee strict safety despite approximated constraints and show its effectiveness experimentally on a mini-Segway.

preprint2021arXiv

Efficient Large-Scale Multi-Drone Delivery Using Transit Networks

We consider the problem of controlling a large fleet of drones to deliver packages simultaneously across broad urban areas. To conserve energy, drones hop between public transit vehicles (e.g., buses and trams). We design a comprehensive algorithmic framework that strives to minimize the maximum time to complete any delivery. We address the multifaceted complexity of the problem through a two-layer approach. First, the upper layer assigns drones to package delivery sequences with a near-optimal polynomial-time task allocation algorithm. Then, the lower layer executes the allocation by periodically routing the fleet over the transit network while employing efficient bounded-suboptimal multi-agent pathfinding techniques tailored to our setting. Experiments demonstrate the efficiency of our approach on settings with up to 200 drones, 5000 packages, and transit networks with up to 8000 stops in San Francisco and Washington DC. Our results show that the framework computes solutions typically within a few seconds on commodity hardware, and that drones travel up to 360% of their flight range with public transit.

preprint2021arXiv

Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders

Discrete latent spaces in variational autoencoders have been shown to effectively capture the data distribution for many real-world problems such as natural language understanding, human intent prediction, and visual scene representation. However, discrete latent spaces need to be sufficiently large to capture the complexities of real-world data, rendering downstream tasks computationally challenging. For instance, performing motion planning in a high-dimensional latent representation of the environment could be intractable. We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder, while preserving its learned multimodality. As a post hoc latent space reduction technique, we use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not. Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique at reducing the discrete latent sample space size of a model while maintaining its learned multimodality.

preprint2021arXiv

MATS: An Interpretable Trajectory Forecasting Representation for Planning and Control

Reasoning about human motion is a core component of modern human-robot interactive systems. In particular, one of the main uses of behavior prediction in autonomous systems is to inform robot motion planning and control. However, a majority of planning and control algorithms reason about system dynamics rather than the predicted agent tracklets (i.e., ordered sets of waypoints) that are commonly output by trajectory forecasting methods, which can hinder their integration. Towards this end, we propose Mixtures of Affine Time-varying Systems (MATS) as an output representation for trajectory forecasting that is more amenable to downstream planning and control use. Our approach leverages successful ideas from probabilistic trajectory forecasting works to learn dynamical system representations that are well-studied in the planning and control literature. We integrate our predictions with a proposed multimodal planning methodology and demonstrate significant computational efficiency improvements on a large-scale autonomous driving dataset.

preprint2021arXiv

On the Co-Design of AV-Enabled Mobility Systems

The design of autonomous vehicles (AVs) and the design of AV-enabled mobility systems are closely coupled. Indeed, knowledge about the intended service of AVs would impact their design and deployment process, whilst insights about their technological development could significantly affect transportation management decisions. This calls for tools to study such a coupling and co-design AVs and AV-enabled mobility systems in terms of different objectives. In this paper, we instantiate a framework to address such co-design problems. In particular, we leverage the recently developed theory of co-design to frame and solve the problem of designing and deploying an intermodal Autonomous Mobility-on-Demand system, whereby AVs service travel demands jointly with public transit, in terms of fleet sizing, vehicle autonomy, and public transit service frequency. Our framework is modular and compositional, allowing one to describe the design problem as the interconnection of its individual components and to tackle it from a system-level perspective. To showcase our methodology, we present a real-world case study for Washington D.C., USA. Our work suggests that it is possible to create user-friendly optimization tools to systematically assess costs and benefits of interventions, and that such analytical techniques might gain a momentous role in policy-making in the future.

preprint2021arXiv

Sketching Curvature for Efficient Out-of-Distribution Detection for Deep Neural Networks

In order to safely deploy Deep Neural Networks (DNNs) within the perception pipelines of real-time decision making systems, there is a need for safeguards that can detect out-of-training-distribution (OoD) inputs both efficiently and accurately. Building on recent work leveraging the local curvature of DNNs to reason about epistemic uncertainty, we propose Sketching Curvature of OoD Detection (SCOD), an architecture-agnostic framework for equipping any trained DNN with a task-relevant epistemic uncertainty estimate. Offline, given a trained model and its training data, SCOD employs tools from matrix sketching to tractably compute a low-rank approximation of the Fisher information matrix, which characterizes which directions in the weight space are most influential on the predictions over the training data. Online, we estimate uncertainty by measuring how much perturbations orthogonal to these directions can alter predictions at a new test input. We apply SCOD to pre-trained networks of varying architectures on several tasks, ranging from regression to classification. We demonstrate that SCOD achieves comparable or better OoD detection performance with lower computational burden relative to existing baselines.

preprint2021arXiv

Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data

Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation. As a result, multi-agent behavior prediction has become a core component of modern human-robot interactive systems, such as self-driving cars. While there exist many methods for trajectory forecasting, most do not enforce dynamic constraints and do not account for environmental information (e.g., maps). Towards this end, we present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents while incorporating agent dynamics and heterogeneous data (e.g., semantic maps). Trajectron++ is designed to be tightly integrated with robotic planning and control frameworks; for example, it can produce predictions that are optionally conditioned on ego-agent motion plans. We demonstrate its performance on several challenging real-world trajectory forecasting datasets, outperforming a wide array of state-of-the-art deterministic and generative methods.

preprint2020arXiv

A Simple and Efficient Tube-based Robust Output Feedback Model Predictive Control Scheme

The control of constrained systems using model predictive control (MPC) becomes more challenging when full state information is not available and when the nominal system model and measurements are corrupted by noise. Since these conditions are often seen in practical scenarios, techniques such as robust output feedback MPC have been developed to address them. However, existing approaches to robust output feedback MPC are challenged by increased complexity of the online optimization problem, increased computational requirements for controller synthesis, or both. In this work we present a simple and efficient methodology for synthesizing a tube-based robust output feedback MPC scheme for linear, discrete, time-invariant systems subject to bounded, additive disturbances. Specifically, we first formulate a scheme where the online MPC optimization problem has the same complexity as in the nominal full state feedback MPC by using a single tube with constant cross-section. This makes our proposed approach simpler to implement and less computationally demanding than previous methods for both online implementation and offline controller synthesis. Secondly, we propose a novel and simple procedure for the computation of robust positively invariant (RPI) sets that are approximations of the minimal RPI set, which can be used to define the tube in the proposed control scheme.

preprint2020arXiv

Congestion-aware Routing and Rebalancing of Autonomous Mobility-on-Demand Systems in Mixed Traffic

This paper studies congestion-aware route-planning policies for Autonomous Mobility-on-Demand (AMoD) systems, whereby a fleet of autonomous vehicles provides on-demand mobility under mixed traffic conditions. Specifically, we first devise a network flow model to optimize the AMoD routing and rebalancing strategies in a congestion-aware fashion by accounting for the endogenous impact of AMoD flows on travel time. Second, we capture reactive exogenous traffic consisting of private vehicles selfishly adapting to the AMoD flows in a user-centric fashion by leveraging an iterative approach. Finally, we showcase the effectiveness of our framework with two case-studies considering the transportation sub-networks in Eastern Massachusetts and New York City. Our results suggest that for high levels of demand, pure AMoD travel can be detrimental due to the additional traffic stemming from its rebalancing flows, while the combination of AMoD with walking or micromobility options can significantly improve the overall system performance.

preprint2020arXiv

Error Bounds for Reduced Order Model Predictive Control

Model predictive control is a powerful framework for enabling optimal control of constrained systems. However, for systems that are described by high-dimensional state spaces this framework can be too computationally demanding for real-time control. Reduced order model predictive control (ROMPC) frameworks address this issue by leveraging model reduction techniques to compress the state space model used in the online optimal control problem. While this can enable real-time control by decreasing the online computational requirements, these model reductions introduce approximation errors that must be accounted for to guarantee constraint satisfaction and closed-loop stability for the controlled high-dimensional system. In this work we propose an offline methodology for efficiently computing error bounds arising from model reduction, and show how they can be used to guarantee constraint satisfaction in a previously proposed ROMPC framework. This work considers linear, discrete, time-invariant systems that are compressed by Petrov-Galerkin projections, and considers output-feedback settings where the system is also subject to bounded disturbances.

preprint2020arXiv

Infusing Reachability-Based Safety into Planning and Control for Multi-agent Interactions

Within a robot autonomy stack, the planner and controller are typically designed separately, and serve different purposes. As such, there is often a diffusion of responsibilities when it comes to ensuring safety for the robot. We propose that a planner and controller should share the same interpretation of safety but apply this knowledge in a different yet complementary way. To achieve this, we use Hamilton-Jacobi (HJ) reachability theory at the planning level to provide the robot planner with the foresight to avoid entering regions with possible inevitable collision. However, this alone does not guarantee safety. In conjunction with this HJ reachability-infused planner, we propose a minimally-interventional multi-agent safety-preserving controller also derived via HJ-reachability theory. The safety controller maintains safety for the robot without unduly impacting planner performance. We demonstrate the benefits of our proposed approach in a multi-agent highway scenario where a robot car is rewarded to navigate through traffic as fast as possible, and we show that our approach provides strong safety assurances yet achieves the highest performance compared to other safety controllers.

preprint2020arXiv

On Local Computation for Optimization in Multi-Agent Systems

A number of prototypical optimization problems in multi-agent systems (e.g., task allocation and network load-sharing) exhibit a highly local structure: that is, each agent's decision variables are only directly coupled to a few other agents' variables through the objective function or the constraints. Nevertheless, existing algorithms for distributed optimization generally do not exploit the locality structure of the problem, requiring all agents to compute or exchange the full set of decision variables. In this paper, we develop a rigorous notion of "locality" that quantifies the degree to which agents can compute their portion of the global solution based solely on information in their local neighborhood. This notion provides a theoretical basis for a rather simple algorithm in which agents individually solve a truncated sub-problem of the global problem, where the size of the sub-problem used depends on the locality of the problem and the desired accuracy. Numerical results show that the proposed theoretical bounds are remarkably tight for well-conditioned problems.

preprint2020arXiv

Planning and Operations of Mixed Fleets in Mobility-on-Demand Systems

Automated vehicles (AVs) are expected to be beneficial for Mobility-on-Demand (MoD), thanks to their ability to be globally coordinated. To facilitate the steady transition towards full autonomy, we consider the transition period of AV deployment, whereby an MoD system operates a mixed fleet of AVs and human-driven vehicles (HVs). In such systems, AVs are centrally coordinated by the operator, and the HVs might strategically respond to the coordination of AVs. We devise computationally tractable strategies to coordinate mixed fleets in MoD systems. Specifically, we model an MoD system with a mixed fleet using a Stackelberg framework where the MoD operator serves as the leader and human-driven vehicles serve as the followers. We develop two models: 1) a steady-state model to analyze the properties of the problem and determine the planning variables (e.g., compensations, prices, and the fleet size of AVs), and 2) a time-varying model to design a real-time coordination algorithm for AVs. The proposed models are validated using a case study inspired by real operational data of an MoD service in Singapore. Results show that the proposed algorithms can significantly improve system performance.

preprint2020arXiv

Revisiting the Asymptotic Optimality of RRT*

RRT* is one of the most widely used sampling-based algorithms for asymptotically-optimal motion planning. This algorithm laid the foundations for optimality in motion planning as a whole, and inspired the development of numerous new algorithms in the field, many of which build upon RRT* itself. In this paper, we first identify a logical gap in the optimality proof of RRT*, which was developed in Karaman and Frazzoli (2011). Then, we present an alternative and mathematically-rigorous proof for asymptotic optimality. Our proof suggests that the connection radius used by RRT* should be increased from $\gamma \left(\frac{\log n}{n}\right)^{1/d}$ to $\gamma' \left(\frac{\log n}{n}\right)^{1/(d+1)}$ in order to account for the additional dimension of time that dictates the samples' ordering. Here $\gamma$ and $\gamma'$ are constants, and $n$ and $d$ are the number of samples and the dimension of the problem, respectively.
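
The two connection-radius expressions in the abstract can be compared numerically. The sketch below evaluates both for a given sample count and dimension. The function names are hypothetical, and the constants are left as free inputs because the abstract does not give their values.

```python
import math


def radius_original(n, d, gamma):
    """Original radius: gamma * (log n / n)^(1/d)."""
    return gamma * (math.log(n) / n) ** (1.0 / d)


def radius_revised(n, d, gamma_prime):
    """Revised radius: gamma' * (log n / n)^(1/(d+1))."""
    return gamma_prime * (math.log(n) / n) ** (1.0 / (d + 1))


# Example: compare the radii in a 2-D problem, with both constants set to 1
# purely for illustration.
for n in (100, 1_000, 10_000):
    print(n, radius_original(n, 2, 1.0), radius_revised(n, 2, 1.0))
```

Because log n / n is below 1 for n ≥ 2, the exponent 1/(d+1) gives a larger radius than 1/d whenever the two constants are equal, so the revised radius shrinks more slowly as samples are added.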

preprint2020arXiv

Risk-sensitive safety specifications for stochastic systems using Conditional Value-at-Risk

This paper proposes a safety analysis method that facilitates a tunable balance between the worst-case and risk-neutral perspectives. First, we define a risk-sensitive safe set to specify the degree of safety attained by a stochastic system. This set is defined as a sublevel set of the solution to an optimal control problem that is expressed using the Conditional Value-at-Risk (CVaR) measure. This problem does not satisfy Bellman's Principle, thus our next contribution is to show how risk-sensitive safe sets can be under-approximated by the solution to a CVaR-Markov Decision Process. We adopt an existing value iteration algorithm to find an approximate solution to the reduced problem for a class of linear systems. Then, we develop a realistic numerical example of a stormwater system to show that this approach can be applied to non-linear systems. Finally, we compare the CVaR criterion to the exponential disutility criterion. The latter allocates control effort evenly across the cost distribution to reduce variance, while the CVaR criterion focuses control effort on a given worst-case quantile--where it matters most for safety.
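To make the "tunable balance" concrete, here is a minimal sample-based CVaR sketch. It is not the paper's method (which works with an optimal control problem and a CVaR-MDP); it only shows how one tail parameter moves the criterion between the mean and the worst case. The name `empirical_cvar` and the convention that `tail_fraction` is the share of worst outcomes averaged are assumptions of this example.

```python
import math

def empirical_cvar(costs, tail_fraction):
    """Mean of the worst `tail_fraction` share of sampled costs (higher cost = worse).

    tail_fraction = 1.0 recovers the plain average (risk-neutral view);
    a very small tail_fraction approaches the maximum (worst-case view).
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 < tail_fraction <= 1.0:
        raise ValueError("tail_fraction must lie in (0, 1]")
    ordered = sorted(costs, reverse=True)
    # Number of worst samples to average, at least one.
    k = max(1, math.ceil(tail_fraction * len(ordered)))
    return sum(ordered[:k]) / k

if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0, 10.0]
    for fraction in (1.0, 0.4, 0.2):
        print(fraction, empirical_cvar(samples, fraction))
```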

preprint2020arXiv

Risk-Sensitive Sequential Action Control with Multi-Modal Human Trajectory Forecasting for Safe Crowd-Robot Interaction

This paper presents a novel online framework for safe crowd-robot interaction based on risk-sensitive stochastic optimal control, wherein the risk is modeled by the entropic risk measure. The sampling-based model predictive control relies on mode insertion gradient optimization for this risk measure as well as Trajectron++, a state-of-the-art generative model that produces multimodal probabilistic trajectory forecasts for multiple interacting agents. Our modular approach decouples the crowd-robot interaction into learning-based prediction and model-based control, which is advantageous compared to end-to-end policy learning methods in that it allows the robot's desired behavior to be specified at run time. In particular, we show that the robot exhibits diverse interaction behavior by varying the risk sensitivity parameter. A simulation study and a real-world experiment show that the proposed online framework can accomplish safe and efficient navigation while avoiding collisions with more than 50 humans in the scene.
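The entropic risk measure named in the abstract has the standard form (1/θ) log E[exp(θ Z)] for a cost Z and risk sensitivity θ. The sketch below evaluates it on equally weighted cost samples; the function name and the sample-based setting are assumptions for illustration and say nothing about how the paper's controller computes it.

```python
import math

def entropic_risk(costs, theta):
    """Entropic risk (1/theta) * log(mean(exp(theta * cost))) over equally weighted samples.

    theta = 0 is treated as the risk-neutral limit (the plain mean).
    Larger positive theta weights high-cost outcomes more heavily.
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if theta == 0:
        return sum(costs) / len(costs)
    # Log-mean-exp with the maximum factored out for numerical stability.
    shift = max(theta * c for c in costs)
    mean_exp = sum(math.exp(theta * c - shift) for c in costs) / len(costs)
    return (shift + math.log(mean_exp)) / theta

if __name__ == "__main__":
    sampled_costs = [0.0, 1.0]
    for theta in (0.0, 1.0, 10.0):
        print(theta, round(entropic_risk(sampled_costs, theta), 4))
```

Sweeping `theta` in this way mirrors, in a very reduced form, the abstract's point that changing the risk sensitivity parameter changes the resulting behavior.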

preprint2020arXiv

Shapeshifter: A Multi-Agent, Multi-Modal Robotic Platform for Exploration of Titan

In this paper we present a mission architecture and a robotic platform, the Shapeshifter, that allow multi-domain and redundant mobility on Saturn's moon Titan, and potentially other bodies with atmospheres. The Shapeshifter is a collection of simple and affordable robotic units, called Cobots, comparable to personal palm-size quadcopters. By attaching and detaching with each other, multiple Cobots can shape-shift into novel structures, capable of (a) rolling on the surface, to increase the traverse range, (b) flying in a flight array formation, and (c) swimming on or under liquid. A ground station complements the robotic platform, hosting science instrumentation and providing power to recharge the batteries of the Cobots. In the first part of this paper we experimentally show the flying, docking and rolling capabilities of a Shapeshifter constituted by two Cobots, presenting ad-hoc control algorithms. We additionally evaluate the energy-efficiency of the rolling-based mobility strategy by deriving an analytic model of the power consumption and by integrating it in a high-fidelity simulation environment. In the second part we tailor our mission architecture to the exploration of Titan. We show that the properties of the Shapeshifter allow the exploration of the possible cryovolcano Sotra Patera, Titan's Mare and canyons.

preprint2020arXiv

Soft Tensegrity Systems for Planetary Landing and Exploration

During the last decade, tensegrity systems have been the focus of numerous investigations exploring the possibility of adopting them for planetary landing and exploration applications. Early approaches mainly focused on locomotion aspects related to tensegrity systems, where mobility was achieved by actuating the cable members of the system. Later efforts focused on understanding energy storage mechanisms of tensegrity systems undergoing landing events. More precisely, it was shown that under highly dynamic events, buckling of individual members of a tensegrity structure does not necessarily imply structural failure, suggesting that efficient structural design of planetary landers could be achieved by allowing its compression members to buckle. In this work, we combine both aspects of previous research on tensegrity structures, showing a possible lattice-like structural configuration able to withstand impact events, store pre-impact kinetic energy, and utilize a part of that energy for the locomotion process. Our work shows the feasibility of this proposed approach via both experimental and computational means.

preprint2020arXiv

The Shapeshifter: a Morphing, Multi-Agent, Multi-Modal Robotic Platform for the Exploration of Titan (preprint version)

In this report for the NASA NIAC Phase I study, we present a mission architecture and a robotic platform, the Shapeshifter, that allow multi-domain and redundant mobility on Saturn's moon Titan, and potentially other bodies with atmospheres. The Shapeshifter is a collection of simple and affordable robotic units, called Cobots, comparable to personal palm-size quadcopters. By attaching and detaching with each other, multiple Cobots can shape-shift into novel structures, capable of (a) rolling on the surface, to increase the traverse range, (b) flying in a flight array formation, and (c) swimming on or under liquid. A ground station complements the robotic platform, hosting science instrumentation and providing power to recharge the batteries of the Cobots. Our Phase I study had the objective of providing an initial assessment of the feasibility of the proposed robotic platform architecture, and in particular (a) to characterize the expected science return of a mission to the Sotra-Patera region on Titan; (b) to verify the mechanical and algorithmic feasibility of building a multi-agent platform capable of flying, docking, rolling and un-docking; (c) to evaluate the increased range and efficiency of rolling on Titan w.r.t. flying; (d) to define a case-study of a mission for the exploration of the cryovolcano Sotra-Patera on Titan, whose expected variety of geological features challenges conventional mobility platforms.

preprint2020arXiv

Towards a Co-Design Framework for Future Mobility Systems

The design of Autonomous Vehicles (AVs) and the design of AVs-enabled mobility systems are closely coupled. Indeed, knowledge about the intended service of AVs would impact their design and deployment process, whilst insights about their technological development could significantly affect transportation management decisions. This calls for tools to study such a coupling and co-design AVs and AVs-enabled mobility systems in terms of different objectives. In this paper, we instantiate a framework to address such co-design problems. In particular, we leverage the recently developed theory of co-design to frame and solve the problem of designing and deploying an intermodal Autonomous Mobility-on-Demand system, whereby AVs service travel demands jointly with public transit, in terms of fleet sizing, vehicle autonomy, and public transit service frequency. Our framework is modular and compositional, allowing one to describe the design problem as the interconnection of its individual components and to tackle it from a system-level perspective. Moreover, it only requires very general monotonicity assumptions and it naturally handles multiple objectives, delivering the rational solutions on the Pareto front and thus enabling policy makers to select a solution through political criteria. To showcase our methodology, we present a real-world case study for Washington D.C., USA. Our work suggests that it is possible to create user-friendly optimization tools to systematically assess the costs and benefits of interventions, and that such analytical techniques might gain a momentous role in policy-making in the future.

preprint2016arXiv

Congestion-Aware Randomized Routing in Autonomous Mobility-on-Demand Systems

In this paper we study the routing and rebalancing problem for a fleet of autonomous vehicles providing on-demand transportation within a congested urban road network (that is, a road network where traffic speed depends on vehicle density). We show that the congestion-free routing and rebalancing problem is NP-hard and provide a randomized algorithm which finds a low-congestion solution to the routing and rebalancing problem that approximately minimizes the number of vehicles on the road in polynomial time. We provide theoretical bounds on the probability of violating the congestion constraints; we also characterize the expected number of vehicles required by the solution with a commonly-used empirical congestion model and provide a bound on the approximation factor of the algorithm. Numerical experiments on a realistic road network with real-world customer demands show that our algorithm introduces very small amounts of congestion. The performance of our algorithm in terms of travel times and required number of vehicles is very close to (and sometimes better than) the optimal congestion-free solution.

preprint2016arXiv

Deterministic Sampling-Based Motion Planning: Optimality, Complexity, and Performance

Probabilistic sampling-based algorithms, such as the probabilistic roadmap (PRM) and the rapidly-exploring random tree (RRT) algorithms, represent one of the most successful approaches to robotic motion planning, due to their strong theoretical properties (in terms of probabilistic completeness or even asymptotic optimality) and remarkable practical performance. Such algorithms are probabilistic in that they compute a path by connecting independently and identically distributed random points in the configuration space. Their randomization aspect, however, makes several tasks challenging, including certification for safety-critical applications and use of offline computation to improve real-time execution. Hence, an important open question is whether similar (or better) theoretical guarantees and practical performance could be obtained by considering deterministic, as opposed to random, sampling sequences. The objective of this paper is to provide a rigorous answer to this question. Specifically, we first show that PRM, for a certain selection of tuning parameters and deterministic low-dispersion sampling sequences, is deterministically asymptotically optimal. Second, we characterize the convergence rate, and we find that the factor of sub-optimality can be very explicitly upper-bounded in terms of the l2-dispersion of the sampling sequence and the connection radius of PRM. Third, we show that an asymptotically optimal version of PRM exists with computational and space complexity arbitrarily close to O(n) (the theoretical lower bound), where n is the number of points in the sequence. This is in stark contrast to the O(n log n) complexity results for existing asymptotically-optimal probabilistic planners. Finally, through numerical experiments, we show that planning with deterministic low-dispersion sampling generally provides superior performance in terms of path cost and success rate.
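For readers unfamiliar with deterministic sampling sequences, the sketch below generates points of the Halton sequence in the unit square. The Halton sequence is used here only as a familiar example of a deterministic sequence that spreads points evenly; the abstract does not say which sequences the paper studies, and the function names are assumptions of this example.

```python
def van_der_corput(index, base):
    """Radical inverse of a positive integer index in the given base, a value in [0, 1)."""
    result, denominator = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denominator *= base
        result += digit / denominator
    return result

def halton(n_points, bases=(2, 3)):
    """First n_points of the Halton sequence, one coordinate per base (bases should be coprime)."""
    return [
        tuple(van_der_corput(i, base) for base in bases)
        for i in range(1, n_points + 1)
    ]

if __name__ == "__main__":
    for point in halton(5):
        print(point)
```

Calling `halton` twice with the same arguments returns identical points, which is the property that makes offline precomputation and repeatable certification runs straightforward compared with random sampling.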

preprint2016arXiv

Fast, Safe, and Propellant-Efficient Spacecraft Planning under Clohessy-Wiltshire-Hill Dynamics

This paper presents a sampling-based motion planning algorithm for real-time and propellant-optimized autonomous spacecraft trajectory generation in near-circular orbits. Specifically, this paper leverages recent algorithmic advances in the field of robot motion planning to the problem of impulsively-actuated, propellant-optimized rendezvous and proximity operations under the Clohessy-Wiltshire-Hill (CWH) dynamics model. The approach calls upon a modified version of the Fast Marching Tree (FMT*) algorithm to grow a set of feasible trajectories over a deterministic, low-dispersion set of sample points covering the free state space. To enforce safety, the tree is only grown over the subset of actively-safe samples, from which there exists a feasible one-burn collision avoidance maneuver that can safely circularize the spacecraft orbit along its coasting arc under a given set of potential thruster failures. Key features of the proposed algorithm include: (i) theoretical guarantees in terms of trajectory safety and performance, (ii) amenability to real-time implementation, and (iii) generality, in the sense that a large class of constraints can be handled directly. As a result, the proposed algorithm offers the potential for widespread application, ranging from on-orbit satellite servicing to orbital debris removal and autonomous inspection missions.

preprint2016arXiv

Mixed Strategy for Constrained Stochastic Optimal Control

Choosing control inputs randomly can result in a reduced expected cost in optimal control problems with stochastic constraints, such as stochastic model predictive control (SMPC). We consider a controller with initial randomization, meaning that the controller randomly chooses from K+1 control sequences at the beginning (called K-randomization). It is known that, for a finite-state, finite-action Markov Decision Process (MDP) with K constraints, K-randomization is sufficient to achieve the minimum cost. We found that the same result holds for stochastic optimal control problems with continuous state and action spaces. Furthermore, we show the randomization of control input can result in reduced cost when the optimization problem is nonconvex, and the cost reduction is equal to the duality gap. We then provide the necessary and sufficient conditions for the optimality of a randomized solution, and develop an efficient solution method based on dual optimization. Furthermore, in a special case with K=1 such as a joint chance-constrained problem, the dual optimization can be solved even more efficiently by root finding. Finally, we test the theories and demonstrate the solution method on multiple practical problems ranging from path planning to the planning of entry, descent, and landing (EDL) for future Mars missions.
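A toy example may help show why randomizing between control sequences can lower expected cost under a constraint. With one constraint (K=1), the controller mixes two sequences. The numbers, the function name, and the reduction of each control sequence to a (cost, violation probability) pair are all assumptions made for this sketch, not the paper's solution method.

```python
def mix_two_sequences(safe, risky, risk_limit):
    """Mix two control sequences so the overall violation probability meets the limit.

    Each option is a pair (expected_cost, violation_probability).
    Returns (q, cost, risk), where q is the probability of picking `risky`.
    Assumes `safe` satisfies the limit on its own and `risky` does not.
    """
    safe_cost, safe_risk = safe
    risky_cost, risky_risk = risky
    if safe_risk > risk_limit or risky_risk <= risk_limit:
        raise ValueError("expected a feasible safe option and an infeasible risky one")
    # Largest share of the risky sequence that keeps the mixed risk at the limit.
    q = (risk_limit - safe_risk) / (risky_risk - safe_risk)
    cost = (1.0 - q) * safe_cost + q * risky_cost
    risk = (1.0 - q) * safe_risk + q * risky_risk
    return q, cost, risk

if __name__ == "__main__":
    # Hypothetical numbers: the cheap sequence violates the constraint 10% of the time.
    q, cost, risk = mix_two_sequences(safe=(10.0, 0.0), risky=(4.0, 0.1), risk_limit=0.05)
    print(q, cost, risk)
```

In this toy case the only deterministic choice that meets the 5% limit costs 10, while the randomized choice meets the same limit at an expected cost of 7.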

preprint2016arXiv

Optimized and Trusted Collision Avoidance for Unmanned Aerial Vehicles using Approximate Dynamic Programming (Technical Report)

Safely integrating unmanned aerial vehicles into civil airspace is contingent upon development of a trustworthy collision avoidance system. This paper proposes an approach whereby a parameterized resolution logic that is considered trusted for a given range of its parameters is adaptively tuned online. Specifically, to address the potential conservatism of the resolution logic with static parameters, we present a dynamic programming approach for adapting the parameters dynamically based on the encounter state. We compute the adaptation policy offline using a simulation-based approximate dynamic programming method that accommodates the high dimensionality of the problem. Numerical experiments show that this approach improves safety and operational performance compared to the baseline resolution logic, while retaining trustworthiness.

preprint2016arXiv

Risk Aversion in Finite Markov Decision Processes Using Total Cost Criteria and Average Value at Risk

In this paper we present an algorithm to compute risk averse policies in Markov Decision Processes (MDP) when the total cost criterion is used together with the average value at risk (AVaR) metric. Risk averse policies are needed when large deviations from the expected behavior may have detrimental effects, and conventional MDP algorithms usually ignore this aspect. We provide conditions for the structure of the underlying MDP ensuring that approximations for the exact problem can be derived and solved efficiently. Our findings are novel inasmuch as average value at risk has not previously been considered in association with the total cost criterion. Our method is demonstrated in a rapid deployment scenario, whereby a robot is tasked with the objective of reaching a target location within a temporal deadline where increased speed is associated with increased probability of failure. We demonstrate that the proposed algorithm not only produces a risk averse policy reducing the probability of exceeding the expected temporal deadline, but also provides the statistical distribution of costs, thus offering a valuable analysis tool.

preprint2016arXiv

The Team Surviving Orienteers Problem: Routing Robots in Uncertain Environments with Survival Constraints

In this paper we study the following multi-robot coordination problem: given a graph, where each edge is weighted by the probability of surviving while traversing it, find a set of paths for $K$ robots that maximizes the expected number of nodes collectively visited, subject to constraints on the probability that each robot survives to its destination. We call this problem the Team Surviving Orienteers (TSO) problem. The TSO problem is motivated by scenarios where a team of robots must traverse a dangerous, uncertain environment, such as aid delivery in disaster or war zones. We present the TSO problem formally along with several variants, which represent "survivability-aware" counterparts for a wide range of multi-robot coordination problems such as vehicle routing, patrolling, and informative path planning. We propose an approximate greedy approach for selecting paths, and prove that the value of its output is bounded within a factor $1-e^{-p_s/λ}$ of the optimum where $p_s$ is the per-robot survival probability threshold, and $1/λ\le 1$ is the approximation factor of an oracle routine for the well-known orienteering problem. Our approach has linear time complexity in the team size and polynomial complexity in the graph size. Using numerical simulations, we verify that our approach is close to the optimum in practice and that it scales to problems with hundreds of nodes and tens of robots.
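The approximation factor stated in the abstract is easy to evaluate numerically. The short sketch below does so for a few survival thresholds; the function name is an assumption, and the inputs are illustrative values rather than figures from the paper.

```python
import math

def tso_guarantee(p_s, lam=1.0):
    """Approximation factor 1 - exp(-p_s / lam) quoted in the abstract.

    p_s: per-robot survival probability threshold, in (0, 1].
    lam: lambda >= 1, so that 1/lambda <= 1 is the orienteering oracle's factor.
    """
    if not 0.0 < p_s <= 1.0:
        raise ValueError("p_s must lie in (0, 1]")
    if lam < 1.0:
        raise ValueError("lambda must be at least 1")
    return 1.0 - math.exp(-p_s / lam)

if __name__ == "__main__":
    for p_s in (0.7, 0.9, 1.0):
        print(p_s, round(tso_guarantee(p_s), 4), round(tso_guarantee(p_s, lam=2.0), 4))
```

With an exact oracle (lambda = 1) and a threshold near 1, the factor approaches 1 - 1/e, roughly 0.63; a weaker oracle or a lower survival threshold loosens the bound.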

preprint2015arXiv

A Convex Optimization Approach to Smooth Trajectories for Motion Planning with Car-Like Robots

In the recent past, several sampling-based algorithms have been proposed to compute trajectories that are collision-free and dynamically-feasible. However, the outputs of such algorithms are notoriously jagged. In this paper, by focusing on robots with car-like dynamics, we present a fast and simple heuristic algorithm, named Convex Elastic Smoothing (CES) algorithm, for trajectory smoothing and speed optimization. The CES algorithm is inspired by earlier work on elastic band planning and iteratively performs shape and speed optimization. The key feature of the algorithm is that both optimization problems can be solved via convex programming, making CES particularly fast. A range of numerical experiments show that the CES algorithm returns high-quality solutions in a few hundred milliseconds and hence appears amenable to a real-time implementation.

preprint2015arXiv

A Framework for Time-Consistent, Risk-Averse Model Predictive Control: Theory and Algorithms

In this paper we present a framework for risk-averse model predictive control (MPC) of linear systems affected by multiplicative uncertainty. Our key innovation is to consider time-consistent, dynamic risk metrics as objective functions to be minimized. This framework is axiomatically justified in terms of time-consistency of risk preferences, is amenable to dynamic optimization, and is unifying in the sense that it captures a full range of risk assessments from risk-neutral to worst case. Within this framework, we propose and analyze an online risk-averse MPC algorithm that is provably stabilizing. Furthermore, by exploiting the dual representation of time-consistent, dynamic risk metrics, we cast the computation of the MPC control law as a convex optimization problem amenable to implementation on embedded systems. Simulation results are presented and discussed.

preprint2015arXiv

A Time Consistent Formulation of Risk Constrained Stochastic Optimal Control

Time-consistency is an essential requirement in risk-sensitive optimal control problems to make rational decisions. An optimization problem is time consistent if its solution policy does not depend on the time sequence of solving the optimization problem. On the other hand, a dynamic risk measure is time consistent if an outcome that is considered less risky in the future is also considered less risky at the current stage. In this paper, we study time-consistency of risk-constrained problems where the risk metric is time consistent. From the Bellman optimality condition in [1], we establish an analytical "risk-to-go" that results in a time consistent optimal policy. Finally, we demonstrate the effectiveness of the analytical solution by solving Haviv's counter-example [2] in time inconsistent planning.

preprint2015arXiv

A Uniform-grid Discretization Algorithm for Stochastic Control with Risk Constraints

In this paper, we present a discretization algorithm for the finite-horizon, risk-constrained dynamic programming algorithm in [Chow_Pavone_13]. Although, from a theoretical standpoint, Bellman's recursion provides a systematic way to find optimal value functions and generate optimal history-dependent policies, there is a serious computational issue. Even if the state space and action space of this constrained stochastic optimal control problem are finite, the spaces of risk thresholds and feasible risk updates are closed, bounded subsets of the real numbers. This prohibits any direct application of the unconstrained finite-state iterative methods in dynamic programming found in [Bertsekas_05]. In order to approximate the Bellman operator derived in [Chow_Pavone_13], we discretize the continuous action spaces and formulate a finite-space approximation of the exact dynamic programming algorithm. We also prove that the approximation error of the optimal value functions is bounded linearly by the discretization step size. Finally, details for implementations and possible modifications are discussed.

preprint2015arXiv

An Asymptotically-Optimal Sampling-Based Algorithm for Bi-directional Motion Planning

Bi-directional search is a widely used strategy to increase the success and convergence rates of sampling-based motion planning algorithms. Yet, few results are available that merge both bi-directional search and asymptotic optimality into existing optimal planners, such as PRM*, RRT*, and FMT*. The objective of this paper is to fill this gap. Specifically, this paper presents a bi-directional, sampling-based, asymptotically-optimal algorithm named Bi-directional FMT* (BFMT*) that extends the Fast Marching Tree (FMT*) algorithm to bi-directional search while preserving its key properties, chiefly lazy search and asymptotic optimality through convergence in probability. BFMT* performs a two-source, lazy dynamic programming recursion over a set of randomly-drawn samples, correspondingly generating two search trees: one in cost-to-come space from the initial configuration and another in cost-to-go space from the goal configuration. Numerical experiments illustrate the advantages of BFMT* over its unidirectional counterpart, as well as a number of other state-of-the-art planners.

preprint2015arXiv

Decentralized Algorithms for 3D Symmetric Formations in Robotic Networks: a Contraction Theory Approach

This paper presents decentralized algorithms for formation control of multiple robots in three dimensions. Specifically, we leverage the mathematical properties of cyclic pursuit along with results from contraction and partial contraction theory to design decentralized control algorithms that ensure global convergence to symmetric formations. We first consider regular polygon formations as a base case, and then extend the results to Johnson solid and other polygonal mesh formations. The algorithms are further augmented to allow control over formation size and avoid collisions with other robots in the formation. The robustness properties of the algorithms are assessed in the presence of bounded additive disturbances and their effect on the quality of the formation is quantified. Finally, we present a general methodology for embedding the control laws on complex dynamical systems, in this case, quadcopters, and validate this approach via simulations and experiments on a fleet of quadcopters.

preprint2015arXiv

Fast Marching Tree: a Fast Marching Sampling-Based Method for Optimal Motion Planning in Many Dimensions

In this paper we present a novel probabilistic sampling-based motion planning algorithm called the Fast Marching Tree algorithm (FMT*). The algorithm is specifically aimed at solving complex motion planning problems in high-dimensional configuration spaces. This algorithm is proven to be asymptotically optimal and is shown to converge to an optimal solution faster than its state-of-the-art counterparts, chiefly PRM* and RRT*. The FMT* algorithm performs a "lazy" dynamic programming recursion on a predetermined number of probabilistically-drawn samples to grow a tree of paths, which moves steadily outward in cost-to-arrive space. As a departure from previous analysis approaches that are based on the notion of almost sure convergence, the FMT* algorithm is analyzed under the notion of convergence in probability: the extra mathematical flexibility of this approach allows for convergence rate bounds--the first in the field of optimal sampling-based motion planning. Specifically, for a certain selection of tuning parameters and configuration spaces, we obtain a convergence rate bound of order $O(n^{-1/d+ρ})$, where $n$ is the number of sampled points, $d$ is the dimension of the configuration space, and $ρ$ is an arbitrarily small constant. We go on to demonstrate asymptotic optimality for a number of variations on FMT*, namely when the configuration space is sampled non-uniformly, when the cost is not arc length, and when connections are made based on the number of nearest neighbors instead of a fixed connection radius. Numerical experiments over a range of dimensions and obstacle configurations confirm our theoretical and heuristic arguments by showing that FMT*, for a given execution time, returns substantially better solutions than either PRM* or RRT*, especially in high-dimensional configuration spaces and in scenarios where collision-checking is expensive.

preprint2015arXiv

Monte Carlo Motion Planning for Robot Trajectory Optimization Under Uncertainty

This article presents a novel approach, named MCMP (Monte Carlo Motion Planning), to the problem of motion planning under uncertainty, i.e., to the problem of computing a low-cost path that fulfills probabilistic collision avoidance constraints. MCMP estimates the collision probability (CP) of a given path by sampling via Monte Carlo the execution of a reference tracking controller (in this paper we consider LQG). The key algorithmic contribution of this paper is the design of statistical variance-reduction techniques, namely control variates and importance sampling, to make such a sampling procedure amenable to real-time implementation. MCMP applies this CP estimation procedure to motion planning by iteratively (i) computing an (approximately) optimal path for the deterministic version of the problem (here, using the FMT* algorithm), (ii) computing the CP of this path, and (iii) inflating or deflating the obstacles by a common factor depending on whether the CP is higher or lower than a target value. The advantages of MCMP are threefold: (i) asymptotic correctness of CP estimation, as opposed to most current approximations, which, as shown in this paper, can be off by large multiples and hinder the computation of feasible plans; (ii) speed and parallelizability, and (iii) generality, i.e., the approach is applicable to virtually any planning problem provided that a path tracking controller and a notion of distance to obstacles in the configuration space are available. Numerical results illustrate the correctness (in terms of feasibility), efficiency (in terms of path cost), and computational speed of MCMP.
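The three-step loop described in the abstract (plan, estimate collision probability, inflate or deflate obstacles) can be sketched on a deliberately tiny stand-in problem. Everything specific below is an assumption made for illustration: the one-dimensional Gaussian tracking error, the rule that a path planned around inflated obstacles has clearance equal to the inflation, the bisection update, and all names. The sketch uses naive Monte Carlo only and omits the variance-reduction techniques that the abstract identifies as the paper's key contribution.

```python
import random

def estimate_collision_probability(clearance, sigma, n_samples, rng):
    """Naive Monte Carlo: share of sampled tracking errors that exceed the clearance."""
    hits = sum(1 for _ in range(n_samples) if abs(rng.gauss(0.0, sigma)) > clearance)
    return hits / n_samples

def tune_inflation(target_cp, sigma, n_samples=20000, iterations=20, seed=0):
    """Bisect on the obstacle inflation until the estimated CP sits just below the target."""
    low, high = 0.0, 5.0 * sigma  # high is assumed large enough to be safe
    for _ in range(iterations):
        inflation = 0.5 * (low + high)
        # Toy planner: the planned path keeps exactly `inflation` of clearance.
        # Re-seeding reuses the same noise samples, so the estimate is monotone in inflation.
        rng = random.Random(seed)
        cp = estimate_collision_probability(inflation, sigma, n_samples, rng)
        if cp > target_cp:
            low = inflation   # too risky: inflate the obstacles further
        else:
            high = inflation  # safe enough: try deflating
    return high

if __name__ == "__main__":
    print(tune_inflation(target_cp=0.05, sigma=1.0))
```

For a unit-variance Gaussian error and a 5% target, the tuned inflation should land near the two-sided 95% point of the normal distribution (about 1.96), up to sampling error.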

preprint2015arXiv

Optimal Sampling-Based Motion Planning under Differential Constraints: the Drift Case with Linear Affine Dynamics

In this paper we provide a thorough, rigorous theoretical framework to assess optimality guarantees of sampling-based algorithms for drift control systems: systems that, loosely speaking, cannot stop instantaneously due to momentum. We exploit this framework to design and analyze a sampling-based algorithm (the Differential Fast Marching Tree algorithm) that is asymptotically optimal, that is, it is guaranteed to converge, as the number of samples increases, to an optimal solution. In addition, our approach allows us to provide concrete bounds on the rate of this convergence. The focus of this paper is on mixed time/control energy cost functions and on linear affine dynamical systems, which encompass a range of models of interest to applications (e.g., double-integrators) and represent a necessary step to design, via successive linearization, sampling-based and provably-correct algorithms for non-linear drift control systems. Our analysis relies on an original perturbation analysis for two-point boundary value problems, which could be of independent interest.

preprint2015arXiv

Optimal Sampling-Based Motion Planning under Differential Constraints: the Driftless Case

Motion planning under differential constraints is a classic problem in robotics. To date, the state of the art is represented by sampling-based techniques, with the Rapidly-exploring Random Tree algorithm as a leading example. Yet, the problem is still open in many aspects, including guarantees on the quality of the obtained solution. In this paper we provide a thorough theoretical framework to assess optimality guarantees of sampling-based algorithms for planning under differential constraints. We exploit this framework to design and analyze two novel sampling-based algorithms that are guaranteed to converge, as the number of samples increases, to an optimal solution (namely, the Differential Probabilistic RoadMap algorithm and the Differential Fast Marching Tree algorithm). Our focus is on driftless control-affine dynamical models, which accurately model a large class of robotic systems. In this paper we use the notion of convergence in probability (as opposed to convergence almost surely): the extra mathematical flexibility of this approach yields convergence rate bounds - a first in the field of optimal sampling-based motion planning under differential constraints. Numerical experiments corroborating our theoretical results are presented and discussed.

preprint2015arXiv

Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach

In this paper we address the problem of decision making within a Markov decision process (MDP) framework where risk and modeling errors are taken into account. Our approach is to minimize a risk-sensitive conditional-value-at-risk (CVaR) objective, as opposed to a standard risk-neutral expectation. We refer to such a problem as a CVaR MDP. Our first contribution is to show that a CVaR objective, besides capturing risk sensitivity, has an alternative interpretation as expected cost under worst-case modeling errors, for a given error budget. This result, which is of independent interest, motivates CVaR MDPs as a unifying framework for risk-sensitive and robust decision making. Our second contribution is to present an approximate value-iteration algorithm for CVaR MDPs and analyze its convergence rate. To our knowledge, this is the first solution algorithm for CVaR MDPs that enjoys error guarantees. Finally, we present results from numerical experiments that corroborate our theoretical findings and show the practicality of our approach.
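One standard way to read the "worst-case modeling errors" interpretation is through the dual representation of CVaR for a single discrete cost distribution: an adversary may rescale each outcome's probability by a factor of at most 1/alpha, and the resulting worst-case expected cost equals the CVaR at level alpha. The sketch below computes that worst case greedily. It is a single-stage illustration under that assumed reading, with `alpha` standing in for the error budget; the paper's multi-stage MDP formulation may differ, and the function name is hypothetical.

```python
def worst_case_expectation(costs, probs, alpha):
    """Expected cost after an adversary rescales each probability by at most 1/alpha.

    The adversary shifts as much probability as allowed onto the highest costs,
    while keeping the total probability equal to one.
    """
    if len(costs) != len(probs) or not costs:
        raise ValueError("costs and probs must be non-empty and the same length")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    remaining = 1.0
    total = 0.0
    # Visit outcomes from most to least costly.
    for cost, p in sorted(zip(costs, probs), reverse=True):
        mass = min(p / alpha, remaining)
        total += mass * cost
        remaining -= mass
        if remaining <= 0.0:
            break
    return total

if __name__ == "__main__":
    costs = [1.0, 2.0, 3.0, 4.0, 10.0]
    probs = [0.2] * 5
    for alpha in (1.0, 0.4, 0.2):
        print(alpha, worst_case_expectation(costs, probs, alpha))
```

With alpha = 1 the adversary has no room to move probability and the result is the ordinary mean; shrinking alpha enlarges the adversary's budget and pushes the value toward the largest cost.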

preprint 2015 arXiv

Stochastic Optimal Control With Dynamic, Time-Consistent Risk Constraints

In this paper we present a dynamic programming approach to stochastic optimal control problems with dynamic, time-consistent risk constraints. Constrained stochastic optimal control problems, which naturally arise when one has to consider multiple objectives, have been extensively investigated in the past 20 years; however, in most formulations, the constraints are formulated as either risk-neutral (i.e., by considering an expected cost), or by applying static, single-period risk metrics with limited attention to "time-consistency" (i.e., to whether such metrics ensure rational consistency of risk preferences across multiple periods). Recently, significant strides have been made in the development of a rigorous theory of dynamic, time-consistent risk metrics for multi-period (risk-sensitive) decision processes; however, their integration within constrained stochastic optimal control problems has received little attention. The goal of this paper is to bridge this gap. First, we formulate the stochastic optimal control problem with dynamic, time-consistent risk constraints and we characterize the tail subproblems (which requires the addition of a Markovian structure to the risk metrics). Second, we develop a dynamic programming approach for its solution, which allows one to compute the optimal costs by value iteration. Finally, we discuss both theoretical and practical features of our approach, such as generalizations, construction of optimal control policies, and computational aspects. A simple, two-state example is given to illustrate the problem setup and the solution approach.

preprint 2015 arXiv

Trading Safety Versus Performance: Rapid Deployment of Robotic Swarms with Robust Performance Constraints

In this paper we consider a stochastic deployment problem, where a robotic swarm is tasked with the objective of positioning at least one robot at each of a set of pre-assigned targets while meeting a temporal deadline. Travel times and failure rates are stochastic but related, inasmuch as failure rates increase with speed. To maximize chances of success while meeting the deadline, a control strategy has therefore to balance safety and performance. Our approach is to cast the problem within the theory of constrained Markov Decision Processes, whereby we seek to compute policies that maximize the probability of successful deployment while ensuring that the expected duration of the task is bounded by a given deadline. To account for uncertainties in the problem parameters, we consider a robust formulation and we propose efficient solution algorithms, which are of independent interest. Numerical experiments confirming our theoretical results are presented and discussed.

preprint 2015 arXiv

Two Phase $Q-$learning for Bidding-based Vehicle Sharing

We consider one-way vehicle sharing systems where customers can rent a car at one station and drop it off at another. The problem we address is to optimize the distribution of cars, and quality of service, by pricing rentals appropriately. We propose a bidding approach that is inspired by auctions and takes into account the significant uncertainty inherent in the problem data (e.g., pick-up and drop-off locations, time of requests, and duration of trips). Specifically, in contrast to current vehicle sharing systems, the operator does not set prices. Instead, customers submit bids and the operator decides whether to rent or not. The operator can even accept negative bids to motivate drivers to rebalance available cars to unpopular destinations within a city. We model the operator's sequential decision-making problem as a constrained Markov decision problem (CMDP) and propose and rigorously analyze a novel two phase $Q$-learning algorithm for its solution. Numerical experiments are presented and discussed.

preprint 2014 arXiv

A Queueing Network Approach to the Analysis and Control of Mobility-On-Demand Systems

This paper presents a queueing network approach to the analysis and control of mobility-on-demand (MoD) systems for urban personal transportation. A MoD system consists of a fleet of vehicles providing one-way car sharing service and a team of drivers to rebalance such vehicles. The drivers then rebalance themselves by driving select customers similar to a taxi service. We model the MoD system as two coupled closed Jackson networks with passenger loss. We show that the system can be approximately balanced by solving two decoupled linear programs and exactly balanced through nonlinear optimization. The rebalancing techniques are applied to a system sizing example using taxi data in three neighborhoods of Manhattan, which suggests that the optimal vehicle-to-driver ratio in a MoD system is between 3 and 5. Lastly, we formulate a real-time closed-loop rebalancing policy for drivers and demonstrate its stability (in terms of customer wait times) for typical system loads.

preprint 2014 arXiv

Control of Robotic Mobility-On-Demand Systems: a Queueing-Theoretical Perspective

In this paper we present and analyze a queueing-theoretical model for autonomous mobility-on-demand (MOD) systems where robotic, self-driving vehicles transport customers within an urban environment and rebalance themselves to ensure acceptable quality of service throughout the entire network. We cast an autonomous MOD system within a closed Jackson network model with passenger loss. It is shown that an optimal rebalancing algorithm minimizing the number of (autonomously) rebalancing vehicles and keeping vehicle availabilities balanced throughout the network can be found by solving a linear program. The theoretical insights are used to design a robust, real-time rebalancing algorithm, which is applied to a case study of New York City. The case study shows that the current taxi demand in Manhattan can be met with about 8,000 robotic vehicles (roughly 60% of the size of the current taxi fleet). Finally, we extend our queueing-theoretical setup to include congestion effects, and we study the impact of autonomously rebalancing vehicles on overall congestion. Collectively, this paper provides a rigorous approach to the problem of system-wide coordination of autonomously driving vehicles, and provides one of the first characterizations of the sustainability benefits of robotic transportation networks.

preprint 2014 arXiv

Distributed consensus with mixed time/communication bandwidth performance metrics

In this paper we study the inherent trade-off between time and communication complexity for the distributed consensus problem. In our model, communication complexity is measured as the maximum data throughput (in bits per second) sent through the network at a given instant. Such a notion of communication complexity, referred to as bandwidth complexity, is related to the frequency bandwidth a designer should collectively allocate to the agents if they were to communicate via a wireless channel, which represents an important constraint for dense robotic networks. We prove a lower bound on the bandwidth complexity of the consensus problem and provide a consensus algorithm that is bandwidth-optimal for a wide class of consensus functions. We then propose a distributed algorithm that can trade communication complexity versus time complexity as a function of a tunable parameter, which can be adjusted by a system designer as a function of the properties of the wireless communication channel. We rigorously characterize the tunable algorithm's worst-case bandwidth complexity and show that it compares favorably with the bandwidth complexity of well-known consensus algorithms.

preprint 2013 arXiv

Rebalancing the Rebalancers: Optimally Routing Vehicles and Drivers in Mobility-on-Demand Systems

In this paper we study rebalancing strategies for a mobility-on-demand urban transportation system blending customer-driven vehicles with a taxi service. In our system, a customer arrives at one of many designated stations and is transported to any other designated station, either by driving themselves, or by being driven by an employed driver. The system allows for one-way trips, so that customers do not have to return to their origin. When some origins and destinations are more popular than others, vehicles will become unbalanced, accumulating at some stations and becoming depleted at others. This problem is addressed by employing rebalancing drivers to drive vehicles from the popular destinations to the unpopular destinations. However, with this approach the rebalancing drivers themselves become unbalanced, and we need to "rebalance the rebalancers" by letting them travel back to the popular destinations with a customer. Accordingly, in this paper we study how to optimally route the rebalancing vehicles and drivers so that stability (in terms of boundedness of the number of waiting customers) is ensured while minimizing the number of rebalancing vehicles traveling in the network and the number of rebalancing drivers needed; surprisingly, these two objectives are aligned, and one can find the optimal rebalancing strategy by solving two decoupled linear programs. Leveraging our analysis, we determine the minimum number of drivers and minimum number of vehicles needed to ensure stability in the system. Interestingly, our simulations suggest that, in Euclidean network topologies, one would need between 1/3 and 1/4 as many drivers as vehicles.
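The imbalance described in this abstract can be illustrated with a small flow calculation. The sketch below is not the paper's model or its linear programs; it only assumes that customers arrive at station i at rate arrival_rates[i] and travel to station j with probability routing[i][j], and it reports the net rate at which vehicles pile up (positive) or drain away (negative) at each station when nothing is rebalanced. All names and numbers are hypothetical.

```python
def net_vehicle_drift(arrival_rates, routing):
    """Net vehicle accumulation rate per station without rebalancing.

    drift[i] = (rate of vehicles dropped off at i) - (rate picked up at i).
    """
    n = len(arrival_rates)
    inflow = [
        sum(arrival_rates[j] * routing[j][i] for j in range(n))
        for i in range(n)
    ]
    return [inflow[i] - arrival_rates[i] for i in range(n)]


# Hypothetical three-station system: station 0 is a popular origin,
# station 2 is a popular destination with few departures.
arrival_rates = [6.0, 3.0, 1.0]          # customers per hour
routing = [
    [0.0, 0.5, 0.5],                     # trips starting at station 0
    [1.0, 0.0, 0.0],                     # trips starting at station 1
    [1.0, 0.0, 0.0],                     # trips starting at station 2
]

drift = net_vehicle_drift(arrival_rates, routing)
print(drift)  # [-2.0, 0.0, 2.0]
```

The drifts always sum to zero because every trip removes a vehicle from one station and adds it to another, so any rebalancing scheme only has to move the surplus at stations with positive drift to those with negative drift.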

preprint 2012 arXiv

Asymptotically Optimal Algorithms for Pickup and Delivery Problems with Application to Large-Scale Transportation Systems

The Stacker Crane Problem (SCP) is NP-Hard and the best known approximation algorithm only provides a 9/5 approximation ratio. The objective of this paper is threefold. First, by embedding the problem within a stochastic framework, we present a novel algorithm for the SCP that: (i) is asymptotically optimal, i.e., it produces, almost surely, a solution approaching the optimal one as the number of pickups/deliveries goes to infinity; and (ii) has computational complexity $O(n^{2+\epsilon})$, where $n$ is the number of pickup/delivery pairs and $\epsilon$ is an arbitrarily small positive constant. Second, we asymptotically characterize the length of the optimal SCP tour. Finally, we study a dynamic version of the SCP, whereby pickup and delivery requests arrive according to a Poisson process, and which serves as a model for large-scale demand-responsive transport (DRT) systems. For such a dynamic counterpart of the SCP, we derive a necessary and sufficient condition for the existence of stable vehicle routing policies, which depends only on the workspace geometry, the stochastic distributions of pickup and delivery points, the arrival rate of requests, and the number of vehicles. Our results leverage a novel connection between the Euclidean Bipartite Matching Problem and the theory of random permutations, and, for the dynamic setting, exhibit novel features that are absent in traditional spatially-distributed queueing systems.

preprint 2009 arXiv

Distributed and Adaptive Algorithms for Vehicle Routing in a Stochastic and Dynamic Environment

In this paper we present distributed and adaptive algorithms for motion coordination of a group of m autonomous vehicles. The vehicles operate in a convex environment with bounded velocity and must service demands whose time of arrival, location and on-site service are stochastic; the objective is to minimize the expected system time (wait plus service) of the demands. The general problem is known as the m-vehicle Dynamic Traveling Repairman Problem (m-DTRP). The best previously known control algorithms rely on centralized a-priori task assignment and are not robust against changes in the environment, e.g., changes in load conditions; therefore, they are of limited applicability in scenarios involving ad-hoc networks of autonomous vehicles operating in a time-varying environment. First, we present a new class of policies for the 1-DTRP problem that: (i) are provably optimal both in light- and heavy-load conditions, and (ii) are adaptive, in particular, they are robust against changes in load conditions. Second, we show that partitioning policies, whereby the environment is partitioned among the vehicles and each vehicle follows a certain set of rules in its own region, are optimal in heavy-load conditions. Finally, by combining the new class of algorithms for the 1-DTRP with suitable partitioning policies, we design distributed algorithms for the m-DTRP problem that (i) are spatially distributed, scalable to large networks, and adaptive to network changes, (ii) are within a constant factor of optimal in heavy-load conditions and stabilize the system in any load condition. Simulation results are presented and discussed.

79 published item(s)

Agile Tradespace Exploration for Space Rendezvous Mission Design via Transformers

Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail

Counterfactual VLA: Self-Reflective Vision-Language-Action Model with Adaptive Reasoning

Reproducibility in the Control of Autonomous Mobility-on-Demand Systems

Sample-Efficient Safety Assurances using Conformal Prediction

Transformers for Trajectory Optimization with Application to Spacecraft Rendezvous

A Simple and Efficient Sampling-based Algorithm for General Reachability Analysis

A Unified View of SDP-based Neural Network Verification through Completely Positive Programming

Analysis of Theoretical and Numerical Properties of Sequential Convex Programming for Continuous-Time Optimal Control

Balancing Fairness and Efficiency in Traffic Routing via Interpolated Traffic Assignment

BITS: Bi-level Imitation for Traffic Simulation

Control-oriented meta-learning

Coordinated Multi-Agent Pathfinding for Drones and Trucks over Road Networks

Data-Driven Chance Constrained Control using Kernel Distribution Embeddings

Graph Meta-Reinforcement Learning for Transferable Autonomous Mobility-on-Demand

Heterogeneous-Agent Trajectory Forecasting Incorporating Class Uncertainty

Interaction-Dynamics-Aware Perception Zones for Obstacle Detection Safety Evaluation

Learning Deep SDF Maps Online for Robot Navigation and Exploration

Local Calibration: Metrics and Recalibration

Matching with Transfers under Distributional Constraints

Motron: Multimodal Probabilistic Human Motion Forecasting

Online Learning for Traffic Routing under Unknown Preferences

Private Location Sharing for Decentralized Routing services

Propagating State Uncertainty Through Trajectory Forecasting

Risk-sensitive safety analysis using Conditional Value-at-Risk

Robust Trajectory Prediction against Adversarial Attacks

Safe Active Dynamics Learning and Control: A Sequential Exploration-Exploitation Framework

ScePT: Scene-consistent, Policy-based Trajectory Predictions for Planning

Second-Order Sensitivity Analysis for Bilevel Optimization

Semi-Supervised Trajectory-Feedback Controller Synthesis for Signal Temporal Logic Specifications

Towards Data-Driven Synthesis of Autonomous Vehicle Safety Concepts

Using Spectral Submanifolds for Nonlinear Periodic Control

Control Barrier Functions for Cyber-Physical Systems and Applications to NMPC

Efficient Large-Scale Multi-Drone Delivery Using Transit Networks

Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders

MATS: An Interpretable Trajectory Forecasting Representation for Planning and Control

On the Co-Design of AV-Enabled Mobility Systems

Sketching Curvature for Efficient Out-of-Distribution Detection for Deep Neural Networks

Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data

A Simple and Efficient Tube-based Robust Output Feedback Model Predictive Control Scheme

Congestion-aware Routing and Rebalancing of Autonomous Mobility-on-Demand Systems in Mixed Traffic

Error Bounds for Reduced Order Model Predictive Control

Infusing Reachability-Based Safety into Planning and Control for Multi-agent Interactions

On Local Computation for Optimization in Multi-Agent Systems

Planning and Operations of Mixed Fleets in Mobility-on-Demand Systems

Revisiting the Asymptotic Optimality of RRT$^*$

Risk-sensitive safety specifications for stochastic systems using Conditional Value-at-Risk

Risk-Sensitive Sequential Action Control with Multi-Modal Human Trajectory Forecasting for Safe Crowd-Robot Interaction

Shapeshifter: A Multi-Agent, Multi-Modal Robotic Platform for Exploration of Titan

Soft Tensegrity Systems for Planetary Landing and Exploration

The Shapeshifter: a Morphing, Multi-Agent, Multi-Modal Robotic Platform for the Exploration of Titan (preprint version)

Towards a Co-Design Framework for Future Mobility Systems

Congestion-Aware Randomized Routing in Autonomous Mobility-on-Demand Systems

Deterministic Sampling-Based Motion Planning: Optimality, Complexity, and Performance

Fast, Safe, and Propellant-Efficient Spacecraft Planning under Clohessy-Wiltshire-Hill Dynamics

Mixed Strategy for Constrained Stochastic Optimal Control

Optimized and Trusted Collision Avoidance for Unmanned Aerial Vehicles using Approximate Dynamic Programming (Technical Report)

Risk Aversion in Finite Markov Decision Processes Using Total Cost Criteria and Average Value at Risk

The Team Surviving Orienteers Problem: Routing Robots in Uncertain Environments with Survival Constraints

A Convex Optimization Approach to Smooth Trajectories for Motion Planning with Car-Like Robots

A Framework for Time-Consistent, Risk-Averse Model Predictive Control: Theory and Algorithms

A Time Consistent Formulation of Risk Constrained Stochastic Optimal Control

A Uniform-grid Discretization Algorithm for Stochastic Control with Risk Constraints

An Asymptotically-Optimal Sampling-Based Algorithm for Bi-directional Motion Planning

Decentralized Algorithms for 3D Symmetric Formations in Robotic Networks: a Contraction Theory Approach

Fast Marching Tree: a Fast Marching Sampling-Based Method for Optimal Motion Planning in Many Dimensions

Monte Carlo Motion Planning for Robot Trajectory Optimization Under Uncertainty

Optimal Sampling-Based Motion Planning under Differential Constraints: the Drift Case with Linear Affine Dynamics

Optimal Sampling-Based Motion Planning under Differential Constraints: the Driftless Case

Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach

Stochastic Optimal Control With Dynamic, Time-Consistent Risk Constraints

Trading Safety Versus Performance: Rapid Deployment of Robotic Swarms with Robust Performance Constraints

Two Phase $Q-$learning for Bidding-based Vehicle Sharing

A Queueing Network Approach to the Analysis and Control of Mobility-On-Demand Systems