Researcher profile

Marco Pavone

Marco Pavone contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
52works
0followers
13topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

52 published item(s)

preprint2026arXiv

Agile Tradespace Exploration for Space Rendezvous Mission Design via Transformers

Spacecraft rendezvous enables on-orbit servicing, debris removal, and crewed docking, forming the foundation for a scalable space economy. Designing such missions requires rapid exploration of the tradespace between control cost and flight time across multiple candidate targets. However, multi-objective optimization in this setting is challenging, as the underlying constraints are often nonconvex, and mission designers must balance accuracy (e.g., solving the full problem) with efficiency (e.g., convex relaxations), slowing iteration and limiting design agility. To address these challenges, this paper proposes an AI-powered framework that enables agile and generalized rendezvous mission design. Given the orbital information of the target spacecraft, boundary conditions of the servicer, and a range of flight times, a transformer model generates a set of near-Pareto optimal trajectories across varying flight times in a single parallelized inference step, thereby enabling rapid mission trade studies. The model is further extended to accommodate variable flight times and perturbed orbital dynamics, supporting realistic multi-objective trade-offs. Validation on chance-constrained rendezvous problems in Earth orbits with passive safety constraints demonstrates that the model generalizes across both flight times and dynamics, consistently providing high-quality initial guesses that converge to superior solutions in fewer iterations. Moreover, the framework efficiently approximates the Pareto front, achieving runtimes comparable to convex relaxation by exploiting parallelized inference. Together, these results position the proposed framework as a practical surrogate for nonconvex trajectory generation and mark an important step toward AI-driven trajectory design for accelerating preliminary mission planning in real-world rendezvous applications.

preprint2026arXiv

Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail

End-to-end architectures trained via imitation learning have advanced autonomous driving by scaling model size and data, yet performance remains brittle in safety-critical long-tail scenarios where supervision is sparse and causal understanding is limited. We introduce Alpamayo-R1 (AR1), a vision-language-action model (VLA) that integrates Chain of Causation reasoning with trajectory planning for complex driving scenarios. Our approach features three key innovations: (1) the Chain of Causation (CoC) dataset, built through a hybrid auto-labeling and human-in-the-loop pipeline producing decision-grounded, causally linked reasoning traces aligned with driving behaviors; (2) a modular VLA architecture combining Cosmos-Reason, a vision-language model pre-trained for Physical AI, with a diffusion-based trajectory decoder that generates dynamically feasible trajectories in real time; (3) a multi-stage training strategy using supervised fine-tuning to elicit reasoning and reinforcement learning (RL) to enforce reasoning-action consistency and optimize reasoning quality. AR1 achieves up to a 12% improvement in planning accuracy on challenging cases compared to a trajectory-only baseline, with a 35% reduction in close encounter rate in closed-loop simulation. RL post-training improves reasoning quality by 45% and reasoning-action consistency by 37%. Model scaling from 0.5B to 7B parameters shows consistent improvements. On-vehicle road tests confirm real-time performance (99 ms latency) and successful urban deployment. By bridging interpretable reasoning with precise control, AR1 demonstrates a practical path towards Level 4 autonomous driving. Model weights are available at https://huggingface.co/nvidia/Alpamayo-R1-10B with inference code at https://github.com/NVlabs/alpamayo.

preprint2025arXiv

Counterfactual VLA: Self-Reflective Vision-Language-Action Model with Adaptive Reasoning

Recent reasoning-augmented Vision-Language-Action (VLA) models have improved the interpretability of end-to-end autonomous driving by generating intermediate reasoning traces. Yet these models primarily describe what they perceive and intend to do, rarely questioning whether their planned actions are safe or appropriate. This work introduces Counterfactual VLA (CF-VLA), a self-reflective VLA framework that enables the model to reason about and revise its planned actions before execution. CF-VLA first generates time-segmented meta-actions that summarize driving intent, and then performs counterfactual reasoning conditioned on both the meta-actions and the visual context. This step simulates potential outcomes, identifies unsafe behaviors, and outputs corrected meta-actions that guide the final trajectory generation. To efficiently obtain such self-reflective capabilities, we propose a rollout-filter-label pipeline that mines high-value scenes from a base (non-counterfactual) VLA's rollouts and labels counterfactual reasoning traces for subsequent training rounds. Experiments on large-scale driving datasets show that CF-VLA improves trajectory accuracy by up to 17.6%, enhances safety metrics by 20.5%, and exhibits adaptive thinking: it only enables counterfactual reasoning in challenging scenarios. By transforming reasoning traces from one-shot descriptions to causal self-correction signals, CF-VLA takes a step toward self-reflective autonomous driving agents that learn to think before they act.

preprint2025arXiv

Reproducibility in the Control of Autonomous Mobility-on-Demand Systems

Autonomous Mobility-on-Demand (AMoD) systems, powered by advances in robotics, control, and Machine Learning (ML), offer a promising paradigm for future urban transportation. AMoD offers fast and personalized travel services by leveraging centralized control of autonomous vehicle fleets to optimize operations and enhance service performance. However, the rapid growth of this field has outpaced the development of standardized practices for evaluating and reporting results, leading to significant challenges in reproducibility. As AMoD control algorithms become increasingly complex and data-driven, a lack of transparency in modeling assumptions, experimental setups, and algorithmic implementation hinders scientific progress and undermines confidence in the results. This paper presents a systematic study of reproducibility in AMoD research. We identify key components across the research pipeline, spanning system modeling, control problems, simulation design, algorithm specification, and evaluation, and analyze common sources of irreproducibility. We survey prevalent practices in the literature, highlight gaps, and propose a structured framework to assess and improve reproducibility. Specifically, concrete guidelines are offered, along with a "reproducibility checklist", to support future work in achieving replicable, comparable, and extensible results. While focused on AMoD, the principles and practices we advocate generalize to a broader class of cyber-physical systems that rely on networked autonomy and data-driven control. This work aims to lay the foundation for a more transparent and reproducible research culture in the design and deployment of intelligent mobility systems.

preprint2024arXiv

Sample-Efficient Safety Assurances using Conformal Prediction

When deploying machine learning models in high-stakes robotics applications, the ability to detect unsafe situations is crucial. Early warning systems can provide alerts when an unsafe situation is imminent (in the absence of corrective action). To reliably improve safety, these warning systems should have a provable false negative rate; i.e. of the situations that are unsafe, fewer than $ε$ will occur without an alert. In this work, we present a framework that combines a statistical inference technique known as conformal prediction with a simulator of robot/environment dynamics, in order to tune warning systems to provably achieve an $ε$ false negative rate using as few as $1/ε$ data points. We apply our framework to a driver warning system and a robotic grasping application, and empirically demonstrate guaranteed false negative rate while also observing low false detection (positive) rate.

preprint2024arXiv

Transformers for Trajectory Optimization with Application to Spacecraft Rendezvous

Reliable and efficient trajectory optimization methods are a fundamental need for autonomous dynamical systems, effectively enabling applications including rocket landing, hypersonic reentry, spacecraft rendezvous, and docking. Within such safety-critical application areas, the complexity of the emerging trajectory optimization problems has motivated the application of AI-based techniques to enhance the performance of traditional approaches. However, current AI-based methods either attempt to fully replace traditional control algorithms, thus lacking constraint satisfaction guarantees and incurring in expensive simulation, or aim to solely imitate the behavior of traditional methods via supervised learning. To address these limitations, this paper proposes the Autonomous Rendezvous Transformer (ART) and assesses the capability of modern generative models to solve complex trajectory optimization problems, both from a forecasting and control standpoint. Specifically, this work assesses the capabilities of Transformers to (i) learn near-optimal policies from previously collected data, and (ii) warm-start a sequential optimizer for the solution of non-convex optimal control problems, thus guaranteeing hard constraint satisfaction. From a forecasting perspective, results highlight how ART outperforms other learning-based architectures at predicting known fuel-optimal trajectories. From a control perspective, empirical analyses show how policies learned through Transformers are able to generate near-optimal warm-starts, achieving trajectories that are (i) more fuel-efficient, (ii) obtained in fewer sequential optimizer iterations, and (iii) computed with an overall runtime comparable to benchmarks based on convex optimization.

preprint2022arXiv

A Simple and Efficient Sampling-based Algorithm for General Reachability Analysis

In this work, we analyze an efficient sampling-based algorithm for general-purpose reachability analysis, which remains a notoriously challenging problem with applications ranging from neural network verification to safety analysis of dynamical systems. By sampling inputs, evaluating their images in the true reachable set, and taking their $ε$-padded convex hull as a set estimator, this algorithm applies to general problem settings and is simple to implement. Our main contribution is the derivation of asymptotic and finite-sample accuracy guarantees using random set theory. This analysis informs algorithmic design to obtain an $ε$-close reachable set approximation with high probability, provides insights into which reachability problems are most challenging, and motivates safety-critical applications of the technique. On a neural network verification task, we show that this approach is more accurate and significantly faster than prior work. Informed by our analysis, we also design a robust model predictive controller that we demonstrate in hardware experiments.

preprint2022arXiv

A Unified View of SDP-based Neural Network Verification through Completely Positive Programming

Verifying that input-output relationships of a neural network conform to prescribed operational specifications is a key enabler towards deploying these networks in safety-critical applications. Semidefinite programming (SDP)-based approaches to Rectified Linear Unit (ReLU) network verification transcribe this problem into an optimization problem, where the accuracy of any such formulation reflects the level of fidelity in how the neural network computation is represented, as well as the relaxations of intractable constraints. While the literature contains much progress on improving the tightness of SDP formulations while maintaining tractability, comparatively little work has been devoted to the other extreme, i.e., how to most accurately capture the original verification problem before SDP relaxation. In this work, we develop an exact, convex formulation of verification as a completely positive program (CPP), and provide analysis showing that our formulation is minimal -- the removal of any constraint fundamentally misrepresents the neural network computation. We leverage our formulation to provide a unifying view of existing approaches, and give insight into the source of large relaxation gaps observed in some cases.

preprint2022arXiv

Analysis of Theoretical and Numerical Properties of Sequential Convex Programming for Continuous-Time Optimal Control

Sequential Convex Programming (SCP) has recently gained significant popularity as an effective method for solving optimal control problems and has been successfully applied in several different domains. However, the theoretical analysis of SCP has received comparatively limited attention, and it is often restricted to discrete-time formulations. In this paper, we present a unifying theoretical analysis of a fairly general class of SCP procedures for continuous-time optimal control problems. In addition to the derivation of convergence guarantees in a continuous-time setting, our analysis reveals two new numerical and practical insights. First, we show how one can more easily account for manifold-type constraints, which are a defining feature of optimal control of mechanical systems. Second, we show how our theoretical analysis can be leveraged to accelerate SCP-based optimal control methods by infusing techniques from indirect optimal control.

preprint2022arXiv

Balancing Fairness and Efficiency in Traffic Routing via Interpolated Traffic Assignment

System optimum (SO) routing, wherein the total travel time of all users is minimized, is a holy grail for transportation authorities. However, SO routing may discriminate against users who incur much larger travel times than others to achieve high system efficiency, i.e., low total travel times. To address the inherent unfairness of SO routing, we study the $β$-fair SO problem whose goal is to minimize the total travel time while guaranteeing a ${β\geq 1}$ level of unfairness, which specifies the maximum possible ratio between the travel times of different users with shared origins and destinations. To obtain feasible solutions to the $β$-fair SO problem while achieving high system efficiency, we develop a new convex program, the Interpolated Traffic Assignment Problem (I-TAP), which interpolates between a fairness-promoting and an efficiency-promoting traffic-assignment objective. We evaluate the efficacy of I-TAP through theoretical bounds on the total system travel time and level of unfairness in terms of its interpolation parameter, as well as present a numerical comparison between I-TAP and a state-of-the-art algorithm on a range of transportation networks. The numerical results indicate that our approach is faster by several orders of magnitude as compared to the benchmark algorithm, while achieving higher system efficiency for all desirable levels of unfairness. We further leverage the structure of I-TAP to develop two pricing mechanisms to collectively enforce the I-TAP solution in the presence of selfish homogeneous and heterogeneous users, respectively, that independently choose routes to minimize their own travel costs. We mention that this is the first study of pricing in the context of fair routing for general road networks (as opposed to, e.g., parallel road networks).

preprint2022arXiv

BITS: Bi-level Imitation for Traffic Simulation

Simulation is the key to scaling up validation and verification for robotic systems such as autonomous vehicles. Despite advances in high-fidelity physics and sensor simulation, a critical gap remains in simulating realistic behaviors of road users. This is because, unlike simulating physics and graphics, devising first principle models for human-like behaviors is generally infeasible. In this work, we take a data-driven approach and propose a method that can learn to generate traffic behaviors from real-world driving logs. The method achieves high sample efficiency and behavior diversity by exploiting the bi-level hierarchy of driving behaviors by decoupling the traffic simulation problem into high-level intent inference and low-level driving behavior imitation. The method also incorporates a planning module to obtain stable long-horizon behaviors. We empirically validate our method, named Bi-level Imitation for Traffic Simulation (BITS), with scenarios from two large-scale driving datasets and show that BITS achieves balanced traffic simulation performance in realism, diversity, and long-horizon stability. We also explore ways to evaluate behavior realism and introduce a suite of evaluation metrics for traffic simulation. Finally, as part of our core contributions, we develop and open source a software tool that unifies data formats across different driving datasets and converts scenes from existing datasets into interactive simulation environments. For additional information and videos, see https://sites.google.com/view/nvr-bits2022/home

preprint2022arXiv

Control-oriented meta-learning

Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With both fully-actuated and underactuated nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.

preprint2022arXiv

Coordinated Multi-Agent Pathfinding for Drones and Trucks over Road Networks

We address the problem of routing a team of drones and trucks over large-scale urban road networks. To conserve their limited flight energy, drones can use trucks as temporary modes of transit en route to their own destinations. Such coordination can yield significant savings in total vehicle distance traveled, i.e., truck travel distance and drone flight distance, compared to operating drones and trucks independently. But it comes at the potentially prohibitive computational cost of deciding which trucks and drones should coordinate and when and where it is most beneficial to do so. We tackle this fundamental trade-off by decoupling our overall intractable problem into tractable sub-problems that we solve stage-wise. The first stage solves only for trucks, by computing paths that make them more likely to be useful transit options for drones. The second stage solves only for drones, by routing them over a composite of the road network and the transit network defined by truck paths from the first stage. We design a comprehensive algorithmic framework that frames each stage as a multi-agent path-finding problem and implement two distinct methods for solving them. We evaluate our approach on extensive simulations with up to $100$ agents on the real-world Manhattan road network containing nearly $4500$ vertices and $10000$ edges. Our framework saves on more than $50\%$ of vehicle distance traveled compared to independently solving for trucks and drones, and computes solutions for all settings within $5$ minutes on commodity hardware.

preprint2022arXiv

Data-Driven Chance Constrained Control using Kernel Distribution Embeddings

We present a data-driven algorithm for efficiently computing stochastic control policies for general joint chance constrained optimal control problems. Our approach leverages the theory of kernel distribution embeddings, which allows representing expectation operators as inner products in a reproducing kernel Hilbert space. This framework enables approximately reformulating the original problem using a dataset of observed trajectories from the system without imposing prior assumptions on the parameterization of the system dynamics or the structure of the uncertainty. By optimizing over a finite subset of stochastic open-loop control trajectories, we relax the original problem to a linear program over the control parameters that can be efficiently solved using standard convex optimization techniques. We demonstrate our proposed approach in simulation on a system with nonlinear non-Markovian dynamics navigating in a cluttered environment.

preprint2022arXiv

Graph Meta-Reinforcement Learning for Transferable Autonomous Mobility-on-Demand

Autonomous Mobility-on-Demand (AMoD) systems represent an attractive alternative to existing transportation paradigms, currently challenged by urbanization and increasing travel needs. By centrally controlling a fleet of self-driving vehicles, these systems provide mobility service to customers and are currently starting to be deployed in a number of cities around the world. Current learning-based approaches for controlling AMoD systems are limited to the single-city scenario, whereby the service operator is allowed to take an unlimited amount of operational decisions within the same transportation system. However, real-world system operators can hardly afford to fully re-train AMoD controllers for every city they operate in, as this could result in a high number of poor-quality decisions during training, making the single-city strategy a potentially impractical solution. To address these limitations, we propose to formalize the multi-city AMoD problem through the lens of meta-reinforcement learning (meta-RL) and devise an actor-critic algorithm based on recurrent graph neural networks. In our approach, AMoD controllers are explicitly trained such that a small amount of experience within a new city will produce good system performance. Empirically, we show how control policies learned through meta-RL are able to achieve near-optimal performance on unseen cities by learning rapidly adaptable policies, thus making them more robust not only to novel environments, but also to distribution shifts common in real-world operations, such as special events, unexpected congestion, and dynamic pricing schemes.

preprint2022arXiv

Heterogeneous-Agent Trajectory Forecasting Incorporating Class Uncertainty

Reasoning about the future behavior of other agents is critical to safe robot navigation. The multiplicity of plausible futures is further amplified by the uncertainty inherent to agent state estimation from data, including positions, velocities, and semantic class. Forecasting methods, however, typically neglect class uncertainty, conditioning instead only on the agent's most likely class, even though perception models often return full class distributions. To exploit this information, we present HAICU, a method for heterogeneous-agent trajectory forecasting that explicitly incorporates agents' class probabilities. We additionally present PUP, a new challenging real-world autonomous driving dataset, to investigate the impact of Perceptual Uncertainty in Prediction. It contains challenging crowded scenes with unfiltered agent class probabilities that reflect the long-tail of current state-of-the-art perception systems. We demonstrate that incorporating class probabilities in trajectory forecasting significantly improves performance in the face of uncertainty, and enables new forecasting capabilities such as counterfactual predictions.

preprint2022arXiv

Interaction-Dynamics-Aware Perception Zones for Obstacle Detection Safety Evaluation

To enable safe autonomous vehicle (AV) operations, it is critical that an AV's obstacle detection module can reliably detect obstacles that pose a safety threat (i.e., are safety-critical). It is therefore desirable that the evaluation metric for the perception system captures the safety-criticality of objects. Unfortunately, existing perception evaluation metrics tend to make strong assumptions about the objects and ignore the dynamic interactions between agents, and thus do not accurately capture the safety risks in reality. To address these shortcomings, we introduce an interaction-dynamics-aware obstacle detection evaluation metric by accounting for closed-loop dynamic interactions between an ego vehicle and obstacles in the scene. By borrowing existing theory from optimal control theory, namely Hamilton-Jacobi reachability, we present a computationally tractable method for constructing a ``safety zone'': a region in state space that defines where safety-critical obstacles lie for the purpose of defining safety metrics. Our proposed safety zone is mathematically complete, and can be easily computed to reflect a variety of safety requirements. Using an off-the-shelf detection algorithm from the nuScenes detection challenge leaderboard, we demonstrate that our approach is computationally lightweight, and can better capture safety-critical perception errors than a baseline approach.

preprint2022arXiv

Learning Deep SDF Maps Online for Robot Navigation and Exploration

We propose an algorithm to (i) learn online a deep signed distance function (SDF) with a LiDAR-equipped robot to represent the 3D environment geometry, and (ii) plan collision-free trajectories given this deep learned map. Our algorithm takes a stream of incoming LiDAR scans and continually optimizes a neural network to represent the SDF of the environment around its current vicinity. When the SDF network quality saturates, we cache a copy of the network, along with a learned confidence metric, and initialize a new SDF network to continue mapping new regions of the environment. We then concatenate all the cached local SDFs through a confidence-weighted scheme to give a global SDF for planning. For planning, we make use of a sequential convex model predictive control (MPC) algorithm. The MPC planner optimizes a dynamically feasible trajectory for the robot while enforcing no collisions with obstacles mapped in the global SDF. We show that our online mapping algorithm produces higher-quality maps than existing methods for online SDF training. In the WeBots simulator, we further showcase the combined mapper and planner running online -- navigating autonomously and without collisions in an unknown environment.

preprint2022arXiv

Local Calibration: Metrics and Recalibration

Probabilistic classifiers output confidence scores along with their predictions, and these confidence scores should be calibrated, i.e., they should reflect the reliability of the prediction. Confidence scores that minimize standard metrics such as the expected calibration error (ECE) accurately measure the reliability on average across the entire population. However, it is in general impossible to measure the reliability of an individual prediction. In this work, we propose the local calibration error (LCE) to span the gap between average and individual reliability. For each individual prediction, the LCE measures the average reliability of a set of similar predictions, where similarity is quantified by a kernel function on a pretrained feature space and by a binning scheme over predicted model confidences. We show theoretically that the LCE can be estimated sample-efficiently from data, and empirically find that it reveals miscalibration modes that are more fine-grained than the ECE can detect. Our key result is a novel local recalibration method LoRe, to improve confidence scores for individual predictions and decrease the LCE. Experimentally, we show that our recalibration method produces more accurate confidence scores, which improves downstream fairness and decision making on classification tasks with both image and tabular data.

preprint2022arXiv

Matching with Transfers under Distributional Constraints

We study two-sided many-to-one matching markets with transferable utilities, e.g., labor and rental housing markets, in which money can exchange hands between agents, subject to distributional constraints on the set of feasible allocations. In such markets, we establish the efficiency of equilibrium arrangements, specified by an assignment and transfers between agents on the two sides of the market, and study the conditions on the distributional constraints and agent preferences under which equilibria exist and can be computed efficiently. To this end, we first consider the setting when the number of institutions (e.g., firms in a labor market) is one and show that equilibrium arrangements exist irrespective of the nature of the constraint structure or the agents' preferences. However, equilibrium arrangements may not exist in markets with multiple institutions even when agents on each side have linear (or additively separable) preferences over agents on the other side. Thus, for markets with linear preferences, we study sufficient conditions on the constraint structure that guarantee the existence of equilibria using linear programming duality. Our linear programming approach not only generalizes that of Shapley and Shubik (1971) in the one-to-one matching setting to the many-to-one matching setting under distributional constraints but also provides a method to compute market equilibria efficiently.

preprint2022arXiv

Motron: Multimodal Probabilistic Human Motion Forecasting

Autonomous systems and humans are increasingly sharing the same space. Robots work side by side or even hand in hand with humans to balance each other's limitations. Such cooperative interactions are ever more sophisticated. Thus, the ability to reason not just about a human's center of gravity position, but also its granular motion is an important prerequisite for human-robot interaction. Though, many algorithms ignore the multimodal nature of humans or neglect uncertainty in their motion forecasts. We present Motron, a multimodal, probabilistic, graph-structured model, that captures human's multimodality using probabilistic methods while being able to output deterministic maximum-likelihood motions and corresponding confidence values for each mode. Our model aims to be tightly integrated with the robotic planning-control-interaction loop; outputting physically feasible human motions and being computationally efficient. We demonstrate the performance of our model on several challenging real-world motion forecasting datasets, outperforming a wide array of generative/variational methods while providing state-of-the-art single-output motions if required. Both using significantly less computational power than state-of-the art algorithms.

preprint2022arXiv

Online Learning for Traffic Routing under Unknown Preferences

In transportation networks, users typically choose routes in a decentralized and self-interested manner to minimize their individual travel costs, which, in practice, often results in inefficient overall outcomes for society. As a result, there has been a growing interest in designing road tolling schemes to cope with these efficiency losses and steer users toward a system-efficient traffic pattern. However, the efficacy of road tolling schemes often relies on having access to complete information on users' trip attributes, such as their origin-destination (O-D) travel information and their values of time, which may not be available in practice. Motivated by this practical consideration, we propose an online learning approach to set tolls in a traffic network to drive heterogeneous users with different values of time toward a system-efficient traffic pattern. In particular, we develop a simple yet effective algorithm that adjusts tolls at each time period solely based on the observed aggregate flows on the roads of the network without relying on any additional trip attributes of users, thereby preserving user privacy. In the setting where the O-D pairs and values of time of users are drawn i.i.d. at each period, we show that our approach obtains an expected regret and road capacity violation of $O(\sqrt{T})$, where $T$ is the number of periods over which tolls are updated. Our regret guarantee is relative to an offline oracle that has complete information on users' trip attributes. We further establish a $Ω(\sqrt{T})$ lower bound on the regret of any algorithm, which establishes that our algorithm is optimal up to constants. Finally, we demonstrate the superior performance of our approach relative to several benchmarks on a real-world transportation network, thereby highlighting its practical applicability.

preprint2022arXiv

Private Location Sharing for Decentralized Routing services

Data-driven methodologies offer many exciting upsides, but they also introduce new challenges, particularly in the realm of user privacy. Specifically, the way data is collected can pose privacy risks to end users. In many routing services, a single entity (e.g., the routing service provider) collects and manages user trajectory data. When it comes to user privacy, these systems have a central point of failure since users have to trust that this entity will not sell or use their data to infer sensitive private information. Unfortunately, in practice many advertising companies offer to buy such data for the sake of targeted advertisements. With this as motivation, we study the problem of using location data for routing services in a privacy-preserving way. Rather than having users report their location to a central operator, we present a protocol in which users participate in a decentralized and privacy-preserving computation to estimate travel times for the roads in the network in a way that no individuals' location is ever observed by any other party. The protocol uses the Laplace mechanism in conjunction with secure multi-party computation to ensure that it is cryptogrpahically secure and that its output is differentially private. A natural question is if privacy necessitates degradation in accuracy or system performance. We show that if a road has sufficiently high capacity, then the travel time estimated by our protocol is provably close to the ground truth travel time. We validate the protocol through numerical experiments which show that using the protocol as a routing service provides privacy guarantees with minimal overhead to user travel time.

preprint2022arXiv

Propagating State Uncertainty Through Trajectory Forecasting

Uncertainty pervades through the modern robotic autonomy stack, with nearly every component (e.g., sensors, detection, classification, tracking, behavior prediction) producing continuous or discrete probabilistic distributions. Trajectory forecasting, in particular, is surrounded by uncertainty as its inputs are produced by (noisy) upstream perception and its outputs are predictions that are often probabilistic for use in downstream planning. However, most trajectory forecasting methods do not account for upstream uncertainty, instead taking only the most-likely values. As a result, perceptual uncertainties are not propagated through forecasting and predictions are frequently overconfident. To address this, we present a novel method for incorporating perceptual state uncertainty in trajectory forecasting, a key component of which is a new statistical distance-based loss function which encourages predicting uncertainties that better match upstream perception. We evaluate our approach both in illustrative simulations and on large-scale, real-world data, demonstrating its efficacy in propagating perceptual state uncertainty through prediction and producing more calibrated predictions.

preprint2022arXiv

Risk-sensitive safety analysis using Conditional Value-at-Risk

This paper develops a safety analysis method for stochastic systems that is sensitive to the possibility and severity of rare harmful outcomes. We define risk-sensitive safe sets as sub-level sets of the solution to a non-standard optimal control problem, where a random maximum cost is assessed via Conditional Value-at-Risk (CVaR). The objective function represents the maximum extent of constraint violation of the state trajectory, averaged over a given percentage of worst cases. This problem is well-motivated but difficult to solve tractably because the temporal decomposition for CVaR is history-dependent. Our primary theoretical contribution is to derive computationally tractable under-approximations to risk-sensitive safe sets. Our method provides a novel, theoretically guaranteed, parameter-dependent upper bound to the CVaR of a maximum cost without the need to augment the state space. For a fixed parameter value, the solution to only one Markov decision process problem is required to obtain the under-approximations for any family of risk-sensitivity levels. In addition, we propose a second definition for risk-sensitive safe sets and provide a tractable method for their estimation without using a parameter-dependent upper bound. The second definition is expressed in terms of a new coherent risk functional, which is inspired by CVaR. We demonstrate our primary theoretical contribution via numerical examples.

preprint2022arXiv

Robust Trajectory Prediction against Adversarial Attacks

Trajectory prediction using deep neural networks (DNNs) is an essential component of autonomous driving (AD) systems. However, these methods are vulnerable to adversarial attacks, leading to serious consequences such as collisions. In this work, we identify two key ingredients to defend trajectory prediction models against adversarial attacks including (1) designing effective adversarial training methods and (2) adding domain-specific data augmentation to mitigate the performance degradation on clean data. We demonstrate that our method is able to improve the performance by 46% on adversarial data and at the cost of only 3% performance degradation on clean data, compared to the model trained with clean data. Additionally, compared to existing robust methods, our method can improve performance by 21% on adversarial examples and 9% on clean data. Our robust model is evaluated with a planner to study its downstream impacts. We demonstrate that our model can significantly reduce the severe accident rates (e.g., collisions and off-road driving).

preprint2022arXiv

Safe Active Dynamics Learning and Control: A Sequential Exploration-Exploitation Framework

Safe deployment of autonomous robots in diverse scenarios requires agents that are capable of efficiently adapting to new environments while satisfying constraints. In this work, we propose a practical and theoretically-justified approach to maintaining safety in the presence of dynamics uncertainty. Our approach leverages Bayesian meta-learning with last-layer adaptation. The expressiveness of neural-network features trained offline, paired with efficient last-layer online adaptation, enables the derivation of tight confidence sets which contract around the true dynamics as the model adapts online. We exploit these confidence sets to plan trajectories that guarantee the safety of the system. Our approach handles problems with high dynamics uncertainty, where reaching the goal safely is potentially initially infeasible, by first \textit{exploring} to gather data and reduce uncertainty, before autonomously \textit{exploiting} the acquired information to safely perform the task. Under reasonable assumptions, we prove that our framework guarantees the high-probability satisfaction of all constraints at all times jointly, i.e. over the total task duration. This theoretical analysis also motivates two regularizers of last-layer meta-learning models that improve online adaptation capabilities as well as performance by reducing the size of the confidence sets. We extensively demonstrate our approach in simulation and on hardware.

preprint2022arXiv

ScePT: Scene-consistent, Policy-based Trajectory Predictions for Planning

Trajectory prediction is a critical functionality of autonomous systems that share environments with uncontrolled agents, one prominent example being self-driving vehicles. Currently, most prediction methods do not enforce scene consistency, i.e., there are a substantial amount of self-collisions between predicted trajectories of different agents in the scene. Moreover, many approaches generate individual trajectory predictions per agent instead of joint trajectory predictions of the whole scene, which makes downstream planning difficult. In this work, we present ScePT, a policy planning-based trajectory prediction model that generates accurate, scene-consistent trajectory predictions suitable for autonomous system motion planning. It explicitly enforces scene consistency and learns an agent interaction policy that can be used for conditional prediction. Experiments on multiple real-world pedestrians and autonomous vehicle datasets show that ScePT} matches current state-of-the-art prediction accuracy with significantly improved scene consistency. We also demonstrate ScePT's ability to work with a downstream contingency planner.

preprint2022arXiv

Second-Order Sensitivity Analysis for Bilevel Optimization

In this work we derive a second-order approach to bilevel optimization, a type of mathematical programming in which the solution to a parameterized optimization problem (the "lower" problem) is itself to be optimized (in the "upper" problem) as a function of the parameters. Many existing approaches to bilevel optimization employ first-order sensitivity analysis, based on the implicit function theorem (IFT), for the lower problem to derive a gradient of the lower problem solution with respect to its parameters; this IFT gradient is then used in a first-order optimization method for the upper problem. This paper extends this sensitivity analysis to provide second-order derivative information of the lower problem (which we call the IFT Hessian), enabling the usage of faster-converging second-order optimization methods at the upper level. Our analysis shows that (i) much of the computation already used to produce the IFT gradient can be reused for the IFT Hessian, (ii) errors bounds derived for the IFT gradient readily apply to the IFT Hessian, (iii) computing IFT Hessians can significantly reduce overall computation by extracting more information from each lower level solve. We corroborate our findings and demonstrate the broad range of applications of our method by applying it to problem instances of least squares hyperparameter auto-tuning, multi-class SVM auto-tuning, and inverse optimal control.

preprint2022arXiv

Semi-Supervised Trajectory-Feedback Controller Synthesis for Signal Temporal Logic Specifications

There are spatio-temporal rules that dictate how robots should operate in complex environments, e.g., road rules govern how (self-driving) vehicles should behave on the road. However, seamlessly incorporating such rules into a robot control policy remains challenging especially for real-time applications. In this work, given a desired spatio-temporal specification expressed in the Signal Temporal Logic (STL) language, we propose a semi-supervised controller synthesis technique that is attuned to human-like behaviors while satisfying desired STL specifications. Offline, we synthesize a trajectory-feedback neural network controller via an adversarial training scheme that summarizes past spatio-temporal behaviors when computing controls, and then online, we perform gradient steps to improve specification satisfaction. Central to the offline phase is an imitation-based regularization component that fosters better policy exploration and helps induce naturalistic human behaviors. Our experiments demonstrate that having imitation-based regularization leads to higher qualitative and quantitative performance compared to optimizing an STL objective only as done in prior work. We demonstrate the efficacy of our approach with an illustrative case study and show that our proposed controller outperforms a state-of-the-art shooting method in both performance and computation time.

preprint2022arXiv

Towards Data-Driven Synthesis of Autonomous Vehicle Safety Concepts

As safety-critical autonomous vehicles (AVs) will soon become pervasive in our society, a number of safety concepts for trusted AV deployment have recently been proposed throughout industry and academia. Yet, achieving consensus on an appropriate safety concept is still an elusive task. In this paper, we advocate for the use of Hamilton-Jacobi (HJ) reachability as a unifying mathematical framework for comparing existing safety concepts, and through elements of this framework propose ways to tailor safety concepts (and thus expand their applicability) to scenarios with implicit expectations on agent behavior in a data-driven fashion. Specifically, we show that (i) existing predominant safety concepts can be embedded in the HJ reachability framework, thereby enabling a common language for comparing and contrasting modeling assumptions, and (ii) HJ reachability can serve as an inductive bias to effectively reason, in a learning context, about two critical, yet often overlooked aspects of safety: responsibility and context-dependency.

preprint2022arXiv

Using Spectral Submanifolds for Nonlinear Periodic Control

Very high dimensional nonlinear systems arise in many engineering problems due to semi-discretization of the governing partial differential equations, e.g. through finite element methods. The complexity of these systems present computational challenges for direct application to automatic control. While model reduction has seen ubiquitous applications in control, the use of nonlinear model reduction methods in this setting remains difficult. The problem lies in preserving the structure of the nonlinear dynamics in the reduced order model for high-fidelity control. In this work, we leverage recent advances in Spectral Submanifold (SSM) theory to enable model reduction under well-defined assumptions for the purpose of efficiently synthesizing feedback controllers.

preprint2021arXiv

Control Barrier Functions for Cyber-Physical Systems and Applications to NMPC

Tractable safety-ensuring algorithms for cyber-physical systems are important in critical applications. Approaches based on Control Barrier Functions assume continuous enforcement, which is not possible in an online fashion. This paper presents two tractable algorithms to ensure forward invariance of discrete-time controlled cyber-physical systems. Both approaches are based on Control Barrier Functions to provide strict mathematical safety guarantees. The first algorithm exploits Lipschitz continuity and formulates the safety condition as a robust program which is subsequently relaxed to a set of affine conditions. The second algorithm is inspired by tube-NMPC and uses an affine Control Barrier Function formulation in conjunction with an auxiliary controller to guarantee safety of the system. We combine an approximate NMPC controller with the second algorithm to guarantee strict safety despite approximated constraints and show its effectiveness experimentally on a mini-Segway.

preprint2021arXiv

Efficient Large-Scale Multi-Drone Delivery Using Transit Networks

We consider the problem of controlling a large fleet of drones to deliver packages simultaneously across broad urban areas. To conserve energy, drones hop between public transit vehicles (e.g., buses and trams). We design a comprehensive algorithmic framework that strives to minimize the maximum time to complete any delivery. We address the multifaceted complexity of the problem through a two-layer approach. First, the upper layer assigns drones to package delivery sequences with a near-optimal polynomial-time task allocation algorithm. Then, the lower layer executes the allocation by periodically routing the fleet over the transit network while employing efficient bounded-suboptimal multi-agent pathfinding techniques tailored to our setting. Experiments demonstrate the efficiency of our approach on settings with up to $200$ drones, $5000$ packages, and transit networks with up to $8000$ stops in San Francisco and Washington DC. Our results show that the framework computes solutions typically within a few seconds on commodity hardware, and that drones travel up to $360 \%$ of their flight range with public transit.

preprint2021arXiv

Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders

Discrete latent spaces in variational autoencoders have been shown to effectively capture the data distribution for many real-world problems such as natural language understanding, human intent prediction, and visual scene representation. However, discrete latent spaces need to be sufficiently large to capture the complexities of real-world data, rendering downstream tasks computationally challenging. For instance, performing motion planning in a high-dimensional latent representation of the environment could be intractable. We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder, while preserving its learned multimodality. As a post hoc latent space reduction technique, we use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not. Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique at reducing the discrete latent sample space size of a model while maintaining its learned multimodality.

preprint2021arXiv

MATS: An Interpretable Trajectory Forecasting Representation for Planning and Control

Reasoning about human motion is a core component of modern human-robot interactive systems. In particular, one of the main uses of behavior prediction in autonomous systems is to inform robot motion planning and control. However, a majority of planning and control algorithms reason about system dynamics rather than the predicted agent tracklets (i.e., ordered sets of waypoints) that are commonly output by trajectory forecasting methods, which can hinder their integration. Towards this end, we propose Mixtures of Affine Time-varying Systems (MATS) as an output representation for trajectory forecasting that is more amenable to downstream planning and control use. Our approach leverages successful ideas from probabilistic trajectory forecasting works to learn dynamical system representations that are well-studied in the planning and control literature. We integrate our predictions with a proposed multimodal planning methodology and demonstrate significant computational efficiency improvements on a large-scale autonomous driving dataset.

preprint2021arXiv

On the Co-Design of AV-Enabled Mobility Systems

The design of autonomous vehicles (AVs) and the design of AV-enabled mobility systems are closely coupled. Indeed, knowledge about the intended service of AVs would impact their design and deployment process, whilst insights about their technological development could significantly affect transportation management decisions. This calls for tools to study such a coupling and co-design AVs and AV-enabled mobility systems in terms of different objectives. In this paper, we instantiate a framework to address such co-design problems. In particular, we leverage the recently developed theory of co-design to frame and solve the problem of designing and deploying an intermodal Autonomous Mobility-on-Demand system, whereby AVs service travel demands jointly with public transit, in terms of fleet sizing, vehicle autonomy, and public transit service frequency. Our framework is modular and compositional, allowing one to describe the design problem as the interconnection of its individual components and to tackle it from a system-level perspective. To showcase our methodology, we present a real-world case study for Washington D.C., USA. Our work suggests that it is possible to create user-friendly optimization tools to systematically assess costs and benefits of interventions, and that such analytical techniques might gain a momentous role in policy-making in the future.

preprint2021arXiv

Sketching Curvature for Efficient Out-of-Distribution Detection for Deep Neural Networks

In order to safely deploy Deep Neural Networks (DNNs) within the perception pipelines of real-time decision making systems, there is a need for safeguards that can detect out-of-training-distribution (OoD) inputs both efficiently and accurately. Building on recent work leveraging the local curvature of DNNs to reason about epistemic uncertainty, we propose Sketching Curvature of OoD Detection (SCOD), an architecture-agnostic framework for equipping any trained DNN with a task-relevant epistemic uncertainty estimate. Offline, given a trained model and its training data, SCOD employs tools from matrix sketching to tractably compute a low-rank approximation of the Fisher information matrix, which characterizes which directions in the weight space are most influential on the predictions over the training data. Online, we estimate uncertainty by measuring how much perturbations orthogonal to these directions can alter predictions at a new test input. We apply SCOD to pre-trained networks of varying architectures on several tasks, ranging from regression to classification. We demonstrate that SCOD achieves comparable or better OoD detection performance with lower computational burden relative to existing baselines.

preprint2021arXiv

Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data

Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation. As a result, multi-agent behavior prediction has become a core component of modern human-robot interactive systems, such as self-driving cars. While there exist many methods for trajectory forecasting, most do not enforce dynamic constraints and do not account for environmental information (e.g., maps). Towards this end, we present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents while incorporating agent dynamics and heterogeneous data (e.g., semantic maps). Trajectron++ is designed to be tightly integrated with robotic planning and control frameworks; for example, it can produce predictions that are optionally conditioned on ego-agent motion plans. We demonstrate its performance on several challenging real-world trajectory forecasting datasets, outperforming a wide array of state-of-the-art deterministic and generative methods.

preprint2020arXiv

A Simple and Efficient Tube-based Robust Output Feedback Model Predictive Control Scheme

The control of constrained systems using model predictive control (MPC) becomes more challenging when full state information is not available and when the nominal system model and measurements are corrupted by noise. Since these conditions are often seen in practical scenarios, techniques such as robust output feedback MPC have been developed to address them. However, existing approaches to robust output feedback MPC are challenged by increased complexity of the online optimization problem, increased computational requirements for controller synthesis, or both. In this work we present a simple and efficient methodology for synthesizing a tube-based robust output feedback MPC scheme for linear, discrete, time-invariant systems subject to bounded, additive disturbances. Specifically, we first formulate a scheme where the online MPC optimization problem has the same complexity as in the nominal full state feedback MPC by using a single tube with constant cross-section. This makes our proposed approach simpler to implement and less computationally demanding than previous methods for both online implementation and offline controller synthesis. Secondly, we propose a novel and simple procedure for the computation of robust positively invariant (RPI) sets that are approximations of the minimal RPI set, which can be used to define the tube in the proposed control scheme.

preprint2020arXiv

Congestion-aware Routing and Rebalancing of Autonomous Mobility-on-Demand Systems in Mixed Traffic

This paper studies congestion-aware route-planning policies for Autonomous Mobility-on-Demand (AMoD) systems, whereby a fleet of autonomous vehicles provides on-demand mobility under mixed traffic conditions. Specifically, we first devise a network flow model to optimize the AMoD routing and rebalancing strategies in a congestion-aware fashion by accounting for the endogenous impact of AMoD flows on travel time. Second, we capture reactive exogenous traffic consisting of private vehicles selfishly adapting to the AMoD flows in a user-centric fashion by leveraging an iterative approach. Finally, we showcase the effectiveness of our framework with two case-studies considering the transportation sub-networks in Eastern Massachusetts and New York City. Our results suggest that for high levels of demand, pure AMoD travel can be detrimental due to the additional traffic stemming from its rebalancing flows, while the combination of AMoD with walking or micromobility options can significantly improve the overall system performance.

preprint2020arXiv

Error Bounds for Reduced Order Model Predictive Control

Model predictive control is a powerful framework for enabling optimal control of constrained systems. However, for systems that are described by high-dimensional state spaces this framework can be too computationally demanding for real-time control. Reduced order model predictive control (ROMPC) frameworks address this issue by leveraging model reduction techniques to compress the state space model used in the online optimal control problem. While this can enable real-time control by decreasing the online computational requirements, these model reductions introduce approximation errors that must be accounted for to guarantee constraint satisfaction and closed-loop stability for the controlled high-dimensional system. In this work we propose an offline methodology for efficiently computing error bounds arising from model reduction, and show how they can be used to guarantee constraint satisfaction in a previously proposed ROMPC framework. This work considers linear, discrete, time-invariant systems that are compressed by Petrov-Galerkin projections, and considers output-feedback settings where the system is also subject to bounded disturbances.

preprint2020arXiv

Infusing Reachability-Based Safety into Planning and Control for Multi-agent Interactions

Within a robot autonomy stack, the planner and controller are typically designed separately, and serve different purposes. As such, there is often a diffusion of responsibilities when it comes to ensuring safety for the robot. We propose that a planner and controller should share the same interpretation of safety but apply this knowledge in a different yet complementary way. To achieve this, we use Hamilton-Jacobi (HJ) reachability theory at the planning level to provide the robot planner with the foresight to avoid entering regions with possible inevitable collision. However, this alone does not guarantee safety. In conjunction with this HJ reachability-infused planner, we propose a minimally-interventional multi-agent safety-preserving controller also derived via HJ-reachability theory. The safety controller maintains safety for the robot without unduly impacting planner performance. We demonstrate the benefits of our proposed approach in a multi-agent highway scenario where a robot car is rewarded to navigate through traffic as fast as possible, and we show that our approach provides strong safety assurances yet achieves the highest performance compared to other safety controllers.

preprint2020arXiv

On Local Computation for Optimization in Multi-Agent Systems

A number of prototypical optimization problems in multi-agent systems (e.g., task allocation and network load-sharing) exhibit a highly local structure: that is, each agent's decision variables are only directly coupled to few other agent's variables through the objective function or the constraints. Nevertheless, existing algorithms for distributed optimization generally do not exploit the locality structure of the problem, requiring all agents to compute or exchange the full set of decision variables. In this paper, we develop a rigorous notion of "locality" that quantifies the degree to which agents can compute their portion of the global solution based solely on information in their local neighborhood. This notion provides a theoretical basis for a rather simple algorithm in which agents individually solve a truncated sub-problem of the global problem, where the size of the sub-problem used depends on the locality of the problem, and the desired accuracy. Numerical results show that the proposed theoretical bounds are remarkably tight for well-conditioned problems.

preprint2020arXiv

Planning and Operations of Mixed Fleets in Mobility-on-Demand Systems

Automated vehicles (AVs) are expected to be beneficial for Mobility-on-Demand (MoD), thanks to their ability of being globally coordinated. To facilitate the steady transition towards full autonomy, we consider the transition period of AV deployment, whereby an MoD system operates a mixed fleet of automated vehicles (AVs) and human-driven vehicles (HVs). In such systems, AVs are centrally coordinated by the operator, and the HVs might strategically respond to the coordination of AVs. We devise computationally tractable strategies to coordinate mixed fleets in MoD systems. Specifically, we model an MoD system with a mixed fleet using a Stackelberg framework where the MoD operator serves as the leader and human-driven vehicles serve as the followers. We develop two models: 1) a steady-state model to analyze the properties of the problem and determine the planning variables (e.g., compensations, prices, and the fleet size of AVs), and 2) a time-varying model to design a real-time coordination algorithm for AVs. The proposed models are validated using a case study inspired by real operational data of a MoD service in Singapore. Results show that the proposed algorithms can significantly improve system performance.

preprint2020arXiv

Revisiting the Asymptotic Optimality of RRT$^*$

RRT* is one of the most widely used sampling-based algorithms for asymptotically-optimal motion planning. This algorithm laid the foundations for optimality in motion planning as a whole, and inspired the development of numerous new algorithms in the field, many of which build upon RRT* itself. In this paper, we first identify a logical gap in the optimality proof of RRT*, which was developed in Karaman and Frazzoli (2011). Then, we present an alternative and mathematically-rigorous proof for asymptotic optimality. Our proof suggests that the connection radius used by RRT* should be increased from $γ\left(\frac{\log n}{n}\right)^{1/d}$ to $γ' \left(\frac{\log n}{n}\right)^{1/(d+1)}$ in order to account for the additional dimension of time that dictates the samples' ordering. Here $γ$, $γ'$, are constants, and $n$, $d$, are the number of samples and the dimension of the problem, respectively.

preprint2020arXiv

Risk-sensitive safety specifications for stochastic systems using Conditional Value-at-Risk

This paper proposes a safety analysis method that facilitates a tunable balance between the worst-case and risk-neutral perspectives. First, we define a risk-sensitive safe set to specify the degree of safety attained by a stochastic system. This set is defined as a sublevel set of the solution to an optimal control problem that is expressed using the Conditional Value-at-Risk (CVaR) measure. This problem does not satisfy Bellman's Principle, thus our next contribution is to show how risk-sensitive safe sets can be under-approximated by the solution to a CVaR-Markov Decision Process. We adopt an existing value iteration algorithm to find an approximate solution to the reduced problem for a class of linear systems. Then, we develop a realistic numerical example of a stormwater system to show that this approach can be applied to non-linear systems. Finally, we compare the CVaR criterion to the exponential disutility criterion. The latter allocates control effort evenly across the cost distribution to reduce variance, while the CVaR criterion focuses control effort on a given worst-case quantile--where it matters most for safety.

preprint2020arXiv

Risk-Sensitive Sequential Action Control with Multi-Modal Human Trajectory Forecasting for Safe Crowd-Robot Interaction

This paper presents a novel online framework for safe crowd-robot interaction based on risk-sensitive stochastic optimal control, wherein the risk is modeled by the entropic risk measure. The sampling-based model predictive control relies on mode insertion gradient optimization for this risk measure as well as Trajectron++, a state-of-the-art generative model that produces multimodal probabilistic trajectory forecasts for multiple interacting agents. Our modular approach decouples the crowd-robot interaction into learning-based prediction and model-based control, which is advantageous compared to end-to-end policy learning methods in that it allows the robot's desired behavior to be specified at run time. In particular, we show that the robot exhibits diverse interaction behavior by varying the risk sensitivity parameter. A simulation study and a real-world experiment show that the proposed online framework can accomplish safe and efficient navigation while avoiding collisions with more than 50 humans in the scene.

preprint2020arXiv

Shapeshifter: A Multi-Agent, Multi-Modal Robotic Platform for Exploration of Titan

In this paper we present a mission architecture and a robotic platform, the Shapeshifter, that allow multi-domain and redundant mobility on Saturn's moon Titan, and potentially other bodies with atmospheres. The Shapeshifter is a collection of simple and affordable robotic units, called Cobots, comparable to personal palm-size quadcopters. By attaching and detaching with each other, multiple Cobots can shape-shift into novel structures, capable of (a) rolling on the surface, to increase the traverse range, (b) flying in a flight array formation, and (c) swimming on or under liquid. A ground station complements the robotic platform, hosting science instrumentation and providing power to recharge the batteries of the Cobots. In the first part of this paper we experimentally show the flying, docking and rolling capabilities of a Shapeshifter constituted by two Cobots, presenting ad-hoc control algorithms. We additionally evaluate the energy-efficiency of the rolling-based mobility strategy by deriving an analytic model of the power consumption and by integrating it in a high-fidelity simulation environment. In the second part we tailor our mission architecture to the exploration of Titan. We show that the properties of the Shapeshifter allow the exploration of the possible cryovolcano Sotra Patera, Titan's Mare and canyons.

preprint2020arXiv

Soft Tensegrity Systems for Planetary Landing and Exploration

During the last decade, tensegrity systems have been the focus of numerous investigations exploring the possibility of adopting them for planetary landing and exploration applications. Early approaches mainly focused on locomotion aspects related to tensegrity systems, where mobility was achieved by actuating the cable members of the system. Later efforts focused on understanding energy storage mechanisms of tensegrity systems undergoing landing events. More precisely, it was shown that under highly dynamic events, buckling of individual members of a tensegrity structure does not necessarily imply structural failure, suggesting that efficient structural design of planetary landers could be achieved by allowing its compression members to buckle. In this work, we combine both aspects of previous research on tensegrity structures, showing a possible lattice-like structural configuration able to withstand impact events, store pre-impact kinetic energy, and utilize a part of that energy for the locomotion process. Our work shows the feasibility of this proposed approach via both experimental and computational means.

preprint2020arXiv

The Shapeshifter: a Morphing, Multi-Agent,Multi-Modal Robotic Platform for the Exploration of Titan (preprint version)

In this report for the Nasa NIAC Phase I study, we present a mission architecture and a robotic platform, the Shapeshifter, that allow multi-domain and redundant mobility on Saturn's moon Titan, and potentially other bodies with atmospheres. The Shapeshifter is a collection of simple and affordable robotic units, called Cobots, comparable to personal palm-size quadcopters. By attaching and detaching with each other, multiple Cobots can shape-shift into novel structures, capable of (a) rolling on the surface, to increase the traverse range, (b) flying in a flight array formation, and (c) swimming on or under liquid. A ground station complements the robotic platform, hosting science instrumentation and providing power to recharge the batteries of the Cobots. Our Phase I study had the objective of providing an initial assessment of the feasibility of the proposed robotic platform architecture, and in particular (a) to characterize the expected science return of a mission to the Sotra-Patera region on Titan; (b) to verify the mechanical and algorithmic feasibility of building a multi-agent platform capable of flying, docking, rolling and un-docking; (c) to evaluate the increased range and efficiency of rolling on Titan w.r.t to flying; (d) to define a case-study of a mission for the exploration of the cryovolcano Sotra-Patera on Titan, whose expected variety of geological features challenges conventional mobility platforms.

preprint2020arXiv

Towards a Co-Design Framework for Future Mobility Systems

The design of Autonomous Vehicles (AVs) and the design of AVs-enabled mobility systems are closely coupled. Indeed, knowledge about the intended service of AVs would impact their design and deployment process, whilst insights about their technological development could significantly affect transportation management decisions. This calls for tools to study such a coupling and co-design AVs and AVs-enabled mobility systems in terms of different objectives. In this paper, we instantiate a framework to address such co-design problems. In particular, we leverage the recently developed theory of co-design to frame and solve the problem of designing and deploying an intermodal Autonomous Mobility-on-Demand system, whereby AVs service travel demands jointly with public transit, in terms of fleet sizing, vehicle autonomy, and public transit service frequency. Our framework is modular and compositional, allowing to describe the design problem as the interconnection of its individual components and to tackle it from a system-level perspective. Moreover, it only requires very general monotonicity assumptions and it naturally handles multiple objectives, delivering the rational solutions on the Pareto front and thus enabling policy makers to select a solution through political criteria. To showcase our methodology, we present a real-world case study for Washington D.C., USA. Our work suggests that it is possible to create user-friendly optimization tools to systematically assess the costs and benefits of interventions, and that such analytical techniques might gain a momentous role in policy-making in the future.