Source author record

Claire J. Tomlin

Claire J. Tomlin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Systems and Control, eess.SY, math.OC, Robotics, Machine Learning, math.DS, Multiagent Systems, Artificial Intelligence, Computer Science and Game Theory

Catalog footprint

What is connected

35 works

9 topics

4 close collaborators

preprint · 2022 · arXiv

Computation of Regions of Attraction for Hybrid Limit Cycles Using Reachability: An Application to Walking Robots

Contact-rich robotic systems, such as legged robots and manipulators, are often represented as hybrid systems. However, the stability analysis and region-of-attraction computation for these systems are often challenging because of the discontinuous state changes upon contact (also referred to as state resets). In this work, we cast the computation of the region of attraction as a Hamilton-Jacobi (HJ) reachability problem. This enables us to leverage HJ reachability tools that are compatible with general nonlinear system dynamics, and can formally deal with state and input constraints as well as bounded disturbances. Our main contribution is the generalization of the HJ reachability framework to account for the discontinuous state changes originating from state resets, which has remained a challenge until now. We apply our approach to compute regions of attraction for several underactuated walking robots and demonstrate that the proposed approach can (a) recover a bigger region of attraction than state-of-the-art approaches, (b) handle state resets, nonlinear dynamics, external disturbances, and input constraints, and (c) provide a stabilizing controller for the system that can leverage the state resets to enhance system stability.

preprint · 2022 · arXiv

Koopman-Based Neural Lyapunov Functions for General Attractors

Koopman spectral theory has grown in the past decade as a powerful tool for dynamical systems analysis and control. In this paper, we show how recent data-driven techniques for estimating Koopman-Invariant subspaces with neural networks can be leveraged to extract Lyapunov certificates for the underlying system. In our work, we specifically focus on systems with a limit-cycle, beyond just an isolated equilibrium point, and use Koopman eigenfunctions to efficiently parameterize candidate Lyapunov functions to construct forward-invariant sets under some (unknown) attractor dynamics. Additionally, when the dynamics are polynomial and when neural networks are replaced by polynomials as a choice of function approximators in our approach, one can further leverage Sum-of-Squares programs and/or nonlinear programs to yield provably correct Lyapunov certificates. In such a polynomial case, our Koopman-based approach for constructing Lyapunov functions uses significantly fewer decision variables compared to directly formulating and solving a Sum-of-Squares optimization problem.
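To make the role of the eigenfunctions in the abstract above concrete, the following short derivation shows why a weighted sum of squared eigenfunction magnitudes is a natural Lyapunov candidate. The symbols $\varphi_i$, $\lambda_i$, and $w_i$ are illustrative assumptions and are not taken from the paper, whose actual parameterization may differ.

```latex
% Assumption: \varphi_i are Koopman eigenfunctions of the continuous-time flow,
% with eigenvalues \lambda_i satisfying \operatorname{Re}(\lambda_i) < 0.
\frac{d}{dt}\,\varphi_i\bigl(x(t)\bigr) = \lambda_i\,\varphi_i\bigl(x(t)\bigr),
\qquad
V(x) = \sum_{i} w_i\,\lvert \varphi_i(x) \rvert^{2}, \quad w_i > 0 .

% Differentiating along trajectories, using
% \lvert \varphi_i \rvert^2 = \varphi_i \overline{\varphi_i}:
\dot V(x)
  = \sum_{i} w_i \bigl( \lambda_i + \overline{\lambda_i} \bigr) \lvert \varphi_i(x) \rvert^{2}
  = \sum_{i} 2\, w_i \operatorname{Re}(\lambda_i)\, \lvert \varphi_i(x) \rvert^{2}
  \;\le\; 0 .
```

Under these assumptions every sublevel set $\{x : V(x) \le c\}$ is forward invariant, and $V$ vanishes exactly where all the chosen eigenfunctions vanish, a set that can be a limit cycle rather than a single equilibrium.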

preprint · 2022 · arXiv

Risk-sensitive safety analysis using Conditional Value-at-Risk

This paper develops a safety analysis method for stochastic systems that is sensitive to the possibility and severity of rare harmful outcomes. We define risk-sensitive safe sets as sub-level sets of the solution to a non-standard optimal control problem, where a random maximum cost is assessed via Conditional Value-at-Risk (CVaR). The objective function represents the maximum extent of constraint violation of the state trajectory, averaged over a given percentage of worst cases. This problem is well-motivated but difficult to solve tractably because the temporal decomposition for CVaR is history-dependent. Our primary theoretical contribution is to derive computationally tractable under-approximations to risk-sensitive safe sets. Our method provides a novel, theoretically guaranteed, parameter-dependent upper bound to the CVaR of a maximum cost without the need to augment the state space. For a fixed parameter value, the solution to only one Markov decision process problem is required to obtain the under-approximations for any family of risk-sensitivity levels. In addition, we propose a second definition for risk-sensitive safe sets and provide a tractable method for their estimation without using a parameter-dependent upper bound. The second definition is expressed in terms of a new coherent risk functional, which is inspired by CVaR. We demonstrate our primary theoretical contribution via numerical examples.
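As a rough illustration of the objective described in the abstract above (the maximum constraint violation along a trajectory, averaged over a given fraction of worst cases), here is a minimal Python sketch of an empirical CVaR estimate. It is a plain tail-average estimator over sampled trajectories, not the paper's under-approximation method. The function names, the toy constraint `x <= 1`, and the convention that `alpha` is the fraction of worst cases are all assumptions made for illustration.

```python
import math


def max_violation(trajectory, margin):
    """Largest constraint violation along one trajectory.

    `margin` is a hypothetical function that is positive when the
    constraint is violated and non-positive otherwise.
    """
    return max(margin(x) for x in trajectory)


def empirical_cvar(costs, alpha):
    """Average of the worst `alpha` fraction of the sampled costs.

    Convention assumed here: alpha in (0, 1], where alpha = 1 gives the
    plain mean and small alpha approaches the worst sampled case.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(costs, reverse=True)          # worst (largest) first
    k = max(1, math.ceil(alpha * len(ordered)))    # number of tail samples
    return sum(ordered[:k]) / k


# Toy data: scalar states with constraint x <= 1, so the margin is x - 1.
trajectories = [
    [0.0, 0.5, 0.9],
    [0.2, 1.4, 0.8],
    [0.1, 0.3, 2.0],
    [0.0, 0.1, 0.2],
]
costs = [max_violation(t, lambda x: x - 1.0) for t in trajectories]
risk = empirical_cvar(costs, alpha=0.5)  # mean of the two worst trajectories
```

Note that conventions for the CVaR level differ across the literature, so the meaning of `alpha` should be checked against whichever definition is in use.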

preprint · 2022 · arXiv

Stability and Robustness of a Hybrid Control Law for the Half-bridge Inverter

Hybrid systems combine both discrete and continuous state dynamics. Power electronic inverters are inherently hybrid systems: they are controlled via discrete-valued switching inputs which determine the evolution of the continuous-valued current and voltage state dynamics. Hybrid systems analysis could prove increasingly useful as large numbers of renewable energy sources are incorporated into the grid with inverters as their interface. In this work, we explore a hybrid systems approach for the stability analysis of power and power electronic systems. We provide an analytical proof showing that the use of a hybrid model for the half-bridge inverter allows the derivation of a control law that drives the system states to desired sinusoidal voltage and current references. We derive an analytical expression for a global Lyapunov function for the dynamical system in terms of the system parameters, which proves uniform, global, and asymptotic stability of the origin in error coordinates. Moreover, we demonstrate robustness to parameter changes through this Lyapunov function. We validate these results via simulation. Finally, we show empirically how droop control can be incorporated into this hybrid systems approach. In the low-inertia grid community, the juxtaposition of droop control with the hybrid switching control can be considered a grid-forming control strategy using a switched inverter model.

preprint · 2021 · arXiv

FaSTrack: a Modular Framework for Fast and Guaranteed Safe Motion Planning

Fast and safe navigation of dynamical systems through a priori unknown cluttered environments is vital to many applications of autonomous systems. However, trajectory planning for autonomous systems is computationally intensive, often requiring simplified dynamics that sacrifice safety and dynamic feasibility in order to plan efficiently. Conversely, safe trajectories can be computed using more sophisticated dynamic models, but this is typically too slow to be used for real-time planning. We propose a new algorithm FaSTrack: Fast and Safe Tracking for High Dimensional systems. A path or trajectory planner using simplified dynamics to plan quickly can be incorporated into the FaSTrack framework, which provides a safety controller for the vehicle along with a guaranteed tracking error bound. This bound captures all possible deviations due to high dimensional dynamics and external disturbances. Note that FaSTrack is modular and can be used with most current path or trajectory planners. We demonstrate this framework using a 10D nonlinear quadrotor model tracking a 3D path obtained from an RRT planner.

preprint · 2021 · arXiv

Safety and Liveness Guarantees through Reach-Avoid Reinforcement Learning

Reach-avoid optimal control problems, in which the system must reach certain goal conditions while staying clear of unacceptable failure modes, are central to safety and liveness assurance for autonomous robotic systems, but their exact solutions are intractable for complex dynamics and environments. Recent successes in reinforcement learning methods to approximately solve optimal control problems with performance objectives make their application to certification problems attractive; however, the Lagrange-type objective used in reinforcement learning is not suitable to encode temporal logic requirements. Recent work has shown promise in extending the reinforcement learning machinery to safety-type problems, whose objective is not a sum, but a minimum (or maximum) over time. In this work, we generalize the reinforcement learning formulation to handle all optimal control problems in the reach-avoid category. We derive a time-discounted reach-avoid Bellman backup with contraction mapping properties and prove that the resulting reach-avoid Q-learning algorithm converges under analogous conditions to the traditional Lagrange-type problem, yielding an arbitrarily tight conservative approximation to the reach-avoid set. We further demonstrate the use of this formulation with deep reinforcement learning methods, retaining zero-violation guarantees by treating the approximate solutions as untrusted oracles in a model-predictive supervisory control framework. We evaluate our proposed framework on a range of nonlinear systems, validating the results against analytic and numerical solutions, and through Monte Carlo simulation in previously intractable problems. Our results open the door to a range of learning-based methods for safe-and-live autonomous behavior, with applications across robotics and automation. See https://github.com/SafeRoboticsLab/safety_rl for code and supplementary material.
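The time-discounted reach-avoid backup mentioned in the abstract above can be sketched on a toy problem. The code below runs tabular value iteration, not the Q-learning or deep reinforcement learning used in the paper, on a hypothetical eight-state chain. The backup form, the sign conventions (target margin non-positive inside the target, failure margin positive inside the failure set), and all names and constants are assumptions chosen for illustration and should be checked against the paper.

```python
N_STATES = 8        # states 0..7 on a line (hypothetical toy system)
TARGET = 7          # goal state
FAILURE = 3         # failure state that must be avoided
ACTIONS = (-1, +1)  # move left or right
GAMMA = 0.9         # discount factor


def target_margin(x):
    """Non-positive exactly when x is in the target set."""
    return -1.0 if x == TARGET else 1.0


def failure_margin(x):
    """Positive exactly when x is in the failure set."""
    return 1.0 if x == FAILURE else -1.0


def step(x, a):
    """Deterministic dynamics, clipped to the ends of the chain."""
    return min(max(x + a, 0), N_STATES - 1)


def reach_avoid_backup(values):
    """One sweep of the assumed discounted reach-avoid backup."""
    new_values = []
    for x in range(N_STATES):
        best_next = min(values[step(x, a)] for a in ACTIONS)
        # Undiscounted part: avoid failure now, and either be in the
        # target now or continue optimally from the next state.
        undiscounted = max(failure_margin(x), min(target_margin(x), best_next))
        # Discounting mixes in the outcome of stopping immediately.
        stop_now = max(target_margin(x), failure_margin(x))
        new_values.append((1.0 - GAMMA) * stop_now + GAMMA * undiscounted)
    return new_values


def solve(iterations=500):
    values = [0.0] * N_STATES
    for _ in range(iterations):
        values = reach_avoid_backup(values)
    return values


values = solve()
# States from which the target is reachable without passing through failure.
reach_avoid_set = [x for x in range(N_STATES) if values[x] <= 0.0]
```

In this toy chain the failure state separates states 0 to 2 from the target, so only the states to the right of the failure state end up with non-positive value.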

preprint · 2020 · arXiv

A Hamilton-Jacobi Reachability-Based Framework for Predicting and Analyzing Human Motion for Safe Planning

Real-world autonomous systems often employ probabilistic predictive models of human behavior during planning to reason about their future motion. Since accurately modeling human behavior a priori is challenging, such models are often parameterized, enabling the robot to adapt predictions based on observations by maintaining a distribution over the model parameters. Although this enables data and priors to improve the human model, observation models are difficult to specify and priors may be incorrect, leading to erroneous state predictions that can degrade the safety of the robot motion plan. In this work, we seek to design a predictor which is more robust to misspecified models and priors, but can still leverage human behavioral data online to reduce conservatism in a safe way. To do this, we cast human motion prediction as a Hamilton-Jacobi reachability problem in the joint state space of the human and the belief over the model parameters. We construct a new continuous-time dynamical system, where the inputs are the observations of human behavior, and the dynamics include how the belief over the model parameters changes. The results of this reachability computation enable us both to analyze the effect of incorrect priors on future predictions in continuous state and time and to make predictions of the human state in the future. We compare our approach to the worst-case forward reachable set and a stochastic predictor which uses Bayesian inference and produces full future state distributions. Our comparisons in simulation and in hardware demonstrate how our framework can enable robust planning while not being overly conservative, even when the human model is inaccurate.

preprint · 2020 · arXiv

An Iterative Quadratic Method for General-Sum Differential Games with Feedback Linearizable Dynamics

Iterative linear-quadratic (ILQ) methods are widely used in the nonlinear optimal control community. Recent work has applied similar methodology in the setting of multiplayer general-sum differential games. Here, ILQ methods are capable of finding local equilibria in interactive motion planning problems in real-time. As in most iterative procedures, however, this approach can be sensitive to initial conditions and hyperparameter choices, which can result in poor computational performance or even unsafe trajectories. In this paper, we focus our attention on a broad class of dynamical systems which are feedback linearizable, and exploit this structure to improve both algorithmic reliability and runtime. We showcase our new algorithm in three distinct traffic scenarios, and observe that in practice our method converges significantly more often and more quickly than was possible without exploiting the feedback linearizable structure.

preprint · 2020 · arXiv

Dynamically Computing Adversarial Perturbations for Recurrent Neural Networks

Convolutional and recurrent neural networks have been widely employed to achieve state-of-the-art performance on classification tasks. However, it has also been noted that these networks can be manipulated adversarially with relative ease, by carefully crafted additive perturbations to the input. Though several experimentally established prior works exist on crafting and defending against attacks, it is also desirable to have theoretical guarantees on the existence of adversarial examples and robustness margins of the network to such examples. We provide both in this paper. We focus specifically on recurrent architectures and draw inspiration from dynamical systems theory to naturally cast this as a control problem, allowing us to dynamically compute adversarial perturbations at each timestep of the input sequence, thus resembling a feedback controller. Illustrative examples are provided to supplement the theoretical discussions.

preprint · 2020 · arXiv

Efficient Iterative Linear-Quadratic Approximations for Nonlinear Multi-Player General-Sum Differential Games

Many problems in robotics involve multiple decision making agents. To operate efficiently in such settings, a robot must reason about the impact of its decisions on the behavior of other agents. Differential games offer an expressive theoretical framework for formulating these types of multi-agent problems. Unfortunately, most numerical solution techniques scale poorly with state dimension and are rarely used in real-time applications. For this reason, it is common to predict the future decisions of other agents and solve the resulting decoupled, i.e., single-agent, optimal control problem. This decoupling neglects the underlying interactive nature of the problem; however, efficient solution techniques do exist for broad classes of optimal control problems. We take inspiration from one such technique, the iterative linear-quadratic regulator (ILQR), which solves repeated approximations with linear dynamics and quadratic costs. Similarly, our proposed algorithm solves repeated linear-quadratic games. We experimentally benchmark our algorithm in several examples with a variety of initial conditions and show that the resulting strategies exhibit complex interactive behavior. Our results indicate that our algorithm converges reliably and runs in real-time. In a three-player, 14-state simulated intersection problem, our algorithm initially converges in < 0.25s. Receding horizon invocations converge in < 50 ms in a hardware collision-avoidance test.

preprint · 2020 · arXiv

Feedback Linearization for Unknown Systems via Reinforcement Learning

We present a novel approach to control design for nonlinear systems which leverages model-free policy optimization techniques to learn a linearizing controller for a physical plant with unknown dynamics. Feedback linearization is a technique from nonlinear control which renders the input-output dynamics of a nonlinear plant linear under application of an appropriate feedback controller. Once a linearizing controller has been constructed, desired output trajectories for the nonlinear plant can be tracked using a variety of linear control techniques. However, the calculation of a linearizing controller requires a precise dynamics model for the system. As a result, model-based approaches for learning exact linearizing controllers generally require a simple, highly structured model of the system with easily identifiable parameters. In contrast, the model-free approach presented in this paper is able to approximate the linearizing controller for the plant using general function approximation architectures. Specifically, we formulate a continuous-time optimization problem over the parameters of a learned linearizing controller whose optima are the set of parameters which best linearize the plant. We derive conditions under which the learning problem is (strongly) convex and provide guarantees which ensure the true linearizing controller for the plant is recovered. We then discuss how model-free policy optimization algorithms can be used to solve a discrete-time approximation to the problem using data collected from the real-world plant. The utility of the framework is demonstrated in simulation and on a real-world robotic platform.

preprint · 2020 · arXiv

Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning

The main drawbacks of input-output linearizing controllers are the need for precise dynamics models and not being able to account for input constraints. Model uncertainty is common in almost every robotic application and input saturation is present in every real world system. In this paper, we address both challenges for the specific case of bipedal robot control by the use of reinforcement learning techniques. Taking the structure of a standard input-output linearizing controller, we use an additive learned term that compensates for model uncertainty. Moreover, by adding constraints to the learning problem we manage to boost the performance of the final controller when input limits are present. We demonstrate the effectiveness of the designed framework for different levels of uncertainty on the five-link planar walking robot RABBIT.

preprint · 2020 · arXiv

Inference-Based Strategy Alignment for General-Sum Differential Games

In many settings where multiple agents interact, the optimal choices for each agent depend heavily on the choices of the others. These coupled interactions are well-described by a general-sum differential game, in which players have differing objectives, the state evolves in continuous time, and optimal play may be characterized by one of many equilibrium concepts, e.g., a Nash equilibrium. Often, problems admit multiple equilibria. From the perspective of a single agent in such a game, this multiplicity of solutions can introduce uncertainty about how other agents will behave. This paper proposes a general framework for resolving ambiguity between equilibria by reasoning about the equilibrium other agents are aiming for. We demonstrate this framework in simulations of a multi-player human-robot navigation problem that yields two main conclusions: First, by inferring which equilibrium humans are operating at, the robot is able to predict trajectories more accurately, and second, by discovering and aligning itself to this equilibrium the robot is able to reduce the cost for all players.

preprint · 2020 · arXiv

Reinforcement Learning for Safety-Critical Control under Model Uncertainty, using Control Lyapunov Functions and Control Barrier Functions

In this paper, the issue of model uncertainty in safety-critical control is addressed with a data-driven approach. For this purpose, we utilize the structure of an input-output linearization controller based on a nominal model along with a Control Barrier Function and Control Lyapunov Function based Quadratic Program (CBF-CLF-QP). Specifically, we propose a novel reinforcement learning framework which learns the model uncertainty present in the CBF and CLF constraints, as well as other control-affine dynamic constraints in the quadratic program. The trained policy is combined with the nominal model-based CBF-CLF-QP, resulting in the Reinforcement Learning-based CBF-CLF-QP (RL-CBF-CLF-QP), which addresses the problem of model uncertainty in the safety constraints. The performance of the proposed method is validated by testing it on an underactuated nonlinear bipedal robot walking on randomly spaced stepping stones with one step preview, obtaining stable and safe walking under model uncertainty.

preprint · 2020 · arXiv

Risk-sensitive safety specifications for stochastic systems using Conditional Value-at-Risk

This paper proposes a safety analysis method that facilitates a tunable balance between the worst-case and risk-neutral perspectives. First, we define a risk-sensitive safe set to specify the degree of safety attained by a stochastic system. This set is defined as a sublevel set of the solution to an optimal control problem that is expressed using the Conditional Value-at-Risk (CVaR) measure. This problem does not satisfy Bellman's Principle, thus our next contribution is to show how risk-sensitive safe sets can be under-approximated by the solution to a CVaR-Markov Decision Process. We adopt an existing value iteration algorithm to find an approximate solution to the reduced problem for a class of linear systems. Then, we develop a realistic numerical example of a stormwater system to show that this approach can be applied to non-linear systems. Finally, we compare the CVaR criterion to the exponential disutility criterion. The latter allocates control effort evenly across the cost distribution to reduce variance, while the CVaR criterion focuses control effort on a given worst-case quantile--where it matters most for safety.

preprint · 2020 · arXiv

Technical Report: Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning

This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules. The primary advantage of the scheme over standard model-reference adaptive control techniques is that it does not require the learned inverse model to be invertible at all instances of time. This enables the use of general function approximators to approximate the linearizing controller for the system without having to worry about singularities. However, the discrete-time and stochastic nature of these algorithms precludes the direct application of standard machinery from the adaptive control literature to provide deterministic stability proofs for the system. Nevertheless, we leverage these techniques alongside tools from the stochastic approximation literature to demonstrate that with high probability the tracking and parameter errors concentrate near zero when a certain persistence of excitation condition is satisfied. A simulated example of a double pendulum demonstrates the utility of the proposed theory.

preprint · 2019 · arXiv

A Classification-based Approach for Approximate Reachability

Hamilton-Jacobi (HJ) reachability analysis has been developed over the past decades into a widely-applicable tool for determining goal satisfaction and safety verification in nonlinear systems. While HJ reachability can be formulated very generally, computational complexity can be a serious impediment for many systems of practical interest. Much prior work has been devoted to computing approximate solutions to large reachability problems, yet many of these methods may only apply to very restrictive problem classes, do not generate controllers, and/or can be extremely conservative. In this paper, we present a new method for approximating the optimal controller of the HJ reachability problem for control-affine systems. While also a specific problem class, many dynamical systems of interest are, or can be well approximated, by control-affine models. We explicitly avoid storing a representation of the reachability value function, and instead learn a controller as a sequence of simple binary classifiers. We compare our approach to existing grid-based methodologies in HJ reachability and demonstrate its utility on several examples, including a physical quadrotor navigation task.

preprint · 2016 · arXiv

Building Model Identification during Regular Operation - Empirical Results and Challenges

The inter-temporal consumption flexibility of commercial buildings can be harnessed to improve the energy efficiency of buildings, or to provide ancillary service to the power grid. To do so, a predictive model of the building's thermal dynamics is required. In this paper, we identify a physics-based model of a multi-purpose commercial building including its heating, ventilation and air conditioning system during regular operation. We present our empirical results and show that large uncertainties in internal heat gains, due to occupancy and equipment, present several challenges in utilizing the building model for long-term prediction. In addition, we show that by learning these uncertain loads online and dynamically updating the building model, prediction accuracy is improved significantly.

preprint · 2016 · arXiv

Exact and Efficient Hamilton-Jacobi Reachability for Decoupled Systems

Reachability analysis is important for studying optimal control problems and differential games, which are powerful theoretical tools for analyzing and modeling many practical problems in robotics, aircraft control, among other application areas. In reachability analysis, one is interested in computing the reachable set, defined as the set of states from which there exists a control, despite the worst disturbance, that can drive the system into a set of target states. The target states can be used to model either unsafe or desirable configurations, depending on the application. Many Hamilton-Jacobi formulations allow the computation of reachable sets; however, due to the exponential complexity scaling in computation time and space, problems involving approximately 5 dimensions become intractable. A number of methods that compute an approximate solution exist in the literature, but these methods trade off complexity for optimality. In this paper, we eliminate complexity-optimality trade-offs for time-invariant decoupled systems using a decoupled Hamilton-Jacobi formulation that enables the exact reconstruction of high dimensional solutions via low dimensional solutions of the decoupled subsystems. Our formulation is compatible with existing numerical tools, and we show the accuracy, computation benefits, and an application of our novel approach using two numerical examples.

preprint · 2016 · arXiv

Exact and Efficient Hamilton-Jacobi-based Guaranteed Safety Analysis via System Decomposition

Hamilton-Jacobi (HJ) reachability is a method that provides rigorous analyses of the safety properties of dynamical systems. This method has been successfully applied to many low-dimensional dynamical system models such as coarse models of aircraft and quadrotors in order to provide safety guarantees in potentially dangerous scenarios. These guarantees can be provided by the computation of a backward reachable set (BRS), which represents the set of states from which the system may be driven into violating safety properties despite the system's best effort to remain safe. Unfortunately, HJ reachability is not practical for high-dimensional systems because the complexity of the BRS computation scales exponentially with the number of state dimensions. Although numerous approximation techniques are able to tractably provide conservative estimates of the BRS, they often require restrictive assumptions about system dynamics without providing an exact solution. In this paper we propose a general method for decomposing dynamical systems. Even when the resulting subsystems are coupled, relatively high-dimensional BRSs that were previously intractable or expensive to compute can now be quickly and exactly computed in lower-dimensional subspaces. As a result, the curse of dimensionality is alleviated to a large degree without sacrificing optimality. We demonstrate our theoretical results through two numerical examples: a 3D Dubins Car model and a 6D Acrobatic Quadrotor model.

preprint · 2016 · arXiv

Learning Quadrotor Dynamics Using Neural Network for Flight Control

Traditional learning approaches proposed for controlling quadrotors or helicopters have focused on improving performance for specific trajectories by iteratively improving upon a nominal controller, for example learning from demonstrations, iterative learning, and reinforcement learning. In these schemes, however, it is not clear how the information gathered from the training trajectories can be used to synthesize controllers for more general trajectories. Recently, the efficacy of deep learning in inferring helicopter dynamics has been shown. Motivated by the generalization capability of deep learning, this paper investigates whether a neural network based dynamics model can be employed to synthesize control for trajectories different than those used for training. To test this, we learn a quadrotor dynamics model using only translational and only rotational training trajectories, each of which can be controlled independently, and then use it to simultaneously control the yaw and position of a quadrotor, which is non-trivial because of nonlinear couplings between the two motions. We validate our approach in experiments on a quadrotor testbed.

preprint · 2016 · arXiv

Model Comparison of a Data-Driven and a Physical Model for Simulating HVAC Systems

Commercial buildings are responsible for a large fraction of energy consumption in developed countries, and therefore are targets of energy efficiency programs. Motivated by the large inherent thermal inertia of buildings, the power consumption can be flexibly scheduled without compromising occupant comfort. This temporal flexibility offers opportunities for the provision of frequency regulation to support grid stability. To realize energy savings and frequency regulation, it is of prime importance to identify a realistic model for the temperature dynamics of a building. We identify a low-dimensional data-driven model and a high-dimensional physics-based model for different spatial granularities and temporal seasons based on a case study of an entire floor of Sutardja Dai Hall, an office building on the University of California, Berkeley campus. A comparison of these contrasting models shows that, despite the higher forecasting accuracy of the physics-based model, both models perform almost equally well for energy efficient control. We conclude that the data-driven model is more amenable to controller design due to its low complexity, and could serve as a substitution for highly complex physics-based models with an insignificant loss of prediction accuracy for many applications. On the other hand, our physics-based approach is more suitable for modeling buildings with finer spatial granularities.

preprint · 2016 · arXiv

Multiplayer Reach-Avoid Games via Pairwise Outcomes

A multiplayer reach-avoid game is a differential game between an attacking team with N_A attackers and a defending team with N_D defenders playing on a compact domain with obstacles. The attacking team aims to send M of the N_A attackers to some target location, while the defending team aims to prevent that by capturing attackers or indefinitely delaying attackers from reaching the target. Although the analysis of this game plays an important role in many applications, the optimal solution to this game is computationally intractable when N_A > 1 or N_D > 1. In this paper, we present two approaches for the N_A = N_D = 1 case to determine pairwise outcomes, and a graph theoretic maximum matching approach to merge these pairwise outcomes for an N_A, N_D > 1 solution that provides guarantees on the performance of the defending team. We will show that the four-dimensional Hamilton-Jacobi-Isaacs approach allows for real-time updates to the maximum matching, and that the two-dimensional "path defense" approach is considerably more scalable with the number of players while maintaining defender performance guarantees.
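The merging step described in the abstract above, combining one-on-one outcomes through a maximum matching, can be sketched as a bipartite matching between defenders and attackers. The Python below uses a standard augmenting-path algorithm (Kuhn's algorithm); the paper does not specify its matching routine here, so this choice, the input format `can_defend`, and the example data are all assumptions for illustration.

```python
def maximum_matching(can_defend, n_attackers):
    """Bipartite maximum matching via augmenting paths.

    can_defend[d] lists the attackers that defender d is assumed to win
    against in the one-on-one (pairwise) game. Returns the matching size
    and, for each attacker, the defender assigned to it (or None).
    """
    defender_of = [None] * n_attackers

    def try_assign(d, visited):
        # Try to give defender d an attacker, reassigning others if needed.
        for a in can_defend[d]:
            if a in visited:
                continue
            visited.add(a)
            current = defender_of[a]
            if current is None or try_assign(current, visited):
                defender_of[a] = d
                return True
        return False

    size = 0
    for d in range(len(can_defend)):
        if try_assign(d, set()):
            size += 1
    return size, defender_of


# Hypothetical pairwise outcomes for three defenders and three attackers.
pairwise_wins = [[0, 1], [0], [2]]
matched, assignment = maximum_matching(pairwise_wins, n_attackers=3)
```

One way to read the result, under the assumption that each matched defender can stop its assigned attacker, is that at most `n_attackers - matched` attackers can reach the target.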

preprint · 2016 · arXiv

Plug-and-Play Model Predictive Control for Load Shaping and Voltage Control in Smart Grids

This paper presents a predictive controller for handling plug-and-play (P&P) charging requests of flexible loads in a distribution system. We define two types of flexible loads: (i) deferrable loads that have a fixed power profile but can be deferred in time and (ii) shapeable loads that have flexible power profiles but fixed energy requests, such as Plug-in Electric Vehicles (PEVs). The proposed method uses a hierarchical control scheme based on a model predictive control (MPC) formulation for minimizing the global system cost. The first stage computes a reachable reference that trades off deviation from the nominal voltage with the required generation control. The second stage uses a price-based objective to aggregate flexible loads and provide load shaping services, while satisfying system constraints and users' preferences at all times. It is shown that the proposed controller is recursively feasible under specific conditions, i.e. the flexible load demands are satisfied and bus voltages remain within the desired limits. Finally, the proposed scheme is illustrated on a 55 bus radial distribution network.

preprint2016arXiv

Robust Sequential Path Planning Under Disturbances and Adversarial Intruder

Provably safe and scalable multi-vehicle path planning is an important and urgent problem due to the expected increase of automation in civilian airspace in the near future. Although this problem has been studied in the past, there has not been a method that guarantees both goal satisfaction and safety for vehicles with general nonlinear dynamics while taking into account disturbances and potential adversarial agents, to the best of our knowledge. Hamilton-Jacobi (HJ) reachability is the ideal tool for guaranteeing goal satisfaction and safety under such scenarios, and has been successfully applied to many small-scale problems. However, a direct application of HJ reachability in most cases becomes intractable when there are more than two vehicles due to the exponentially scaling computational complexity with respect to system dimension. In this paper, we take advantage of the guarantees HJ reachability provides, and eliminate the computation burden by assigning a strict priority ordering to the vehicles under consideration. Under this sequential path planning (SPP) scheme, vehicles reserve "space-time" portions in the airspace, and the space-time portions guarantee dynamic feasibility, collision avoidance, and optimality of the paths given the priority ordering. With a computation complexity that scales quadratically when accounting for both disturbances and an intruder, and linearly when accounting for only disturbances, SPP can tractably solve the multi-vehicle path planning problem for vehicles with general nonlinear dynamics in a practical setting. We demonstrate our theory in representative simulations.

preprint2016arXiv

Safe Platooning of Unmanned Aerial Vehicles via Reachability

Recently, there has been immense interest in using unmanned aerial vehicles (UAVs) for civilian operations such as package delivery, firefighting, and fast disaster response. As a result, UAV traffic management systems are needed to support potentially thousands of UAVs flying simultaneously in the airspace, in order to ensure their liveness and safety requirements are met. Hamilton-Jacobi (HJ) reachability is a powerful framework for providing conditions under which these requirements can be met, and for synthesizing the optimal controller for meeting them. However, due to the curse of dimensionality, HJ reachability is only tractable for a small number of vehicles if their set of maneuvers is unrestricted. In this paper, we define a platoon to be a group of UAVs in a single-file formation. We model each vehicle as a hybrid system with modes corresponding to its role in the platoon, and specify the set of allowed maneuvers in each mode to make the analysis tractable. We propose several liveness controllers based on HJ reachability, and wrap a safety controller, also based on HJ reachability, around the liveness controllers. For a single altitude range, our approach guarantees safety for one safety breach; in the unlikely event of multiple safety breaches, safety can be guaranteed over multiple altitude ranges. We demonstrate the satisfaction of liveness and safety requirements through simulations of three common scenarios.

preprint2016arXiv

Safe Sequential Path Planning of Multi-Vehicle Systems via Double-Obstacle Hamilton-Jacobi-Isaacs Variational Inequality

We consider the problem of planning trajectories for a group of $N$ vehicles, each aiming to reach its own target set while avoiding danger zones of other vehicles. The analysis of problems like this is extremely important practically, especially given the growing interest in utilizing unmanned aircraft systems for civil purposes. The direct solution of this problem by solving a single-obstacle Hamilton-Jacobi-Isaacs (HJI) variational inequality (VI) is numerically intractable due to the exponential scaling of computation complexity with problem dimensionality. Furthermore, the single-obstacle HJI VI cannot directly handle situations in which vehicles do not have a common scheduled arrival time. Instead, we perform sequential path planning by considering vehicles in order of priority, modeling higher-priority vehicles as time-varying obstacles for lower-priority vehicles. To do this, we solve a double-obstacle HJI VI which allows us to obtain the reach-avoid set, defined as the set of states from which a vehicle can reach its target while staying within a time-varying state constraint set. From the solution of the double-obstacle HJI VI, we can also extract the latest start time and the optimal control for each vehicle. This is a first application of the double-obstacle HJI VI which can handle systems with time-varying dynamics, target sets, and state constraint sets, and results in computation complexity that scales linearly, as opposed to exponentially, with the number of vehicles in consideration.
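
The priority-ordering idea in this abstract can be sketched on a toy grid. The sketch below deliberately swaps the double-obstacle HJI variational inequality for a plain breadth-first search over (cell, time) pairs, so it illustrates only the sequencing: each planned vehicle becomes a time-varying obstacle for the vehicles planned after it. The grid, the unit-step motion model, the parking assumption at the goal, and all function names are hypothetical.

```python
from collections import deque

MOVES = ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))  # wait, or step to a neighbor


def can_park(cell, t, reserved, horizon):
    # The vehicle stays at its goal, so the goal must remain free afterwards.
    return all((cell, s) not in reserved for s in range(t, horizon + 1))


def plan_one(start, goal, reserved, reserved_moves, width, height, horizon):
    """Shortest path in space-time that avoids higher-priority vehicles."""
    queue = deque([(start, 0)])
    parent = {(start, 0): None}
    while queue:
        cell, t = queue.popleft()
        if cell == goal and can_park(cell, t, reserved, horizon):
            path, node = [], (cell, t)
            while node is not None:
                path.append(node[0])
                node = parent[node]
            return path[::-1]
        if t == horizon:
            continue
        for dx, dy in MOVES:
            nxt = (cell[0] + dx, cell[1] + dy)
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            node = (nxt, t + 1)
            if node in reserved or node in parent:
                continue  # cell occupied at that time, or already explored
            if (nxt, cell, t) in reserved_moves:
                continue  # would swap places with a higher-priority vehicle
            parent[node] = (cell, t)
            queue.append(node)
    return None


def sequential_plan(vehicles, width, height, horizon):
    """vehicles: list of (start, goal) pairs, highest priority first."""
    reserved, reserved_moves, paths = set(), set(), []
    for start, goal in vehicles:
        path = plan_one(start, goal, reserved, reserved_moves,
                        width, height, horizon)
        if path is None:
            raise ValueError("no feasible path within the horizon")
        paths.append(path)
        # Reserve this vehicle's space-time footprint for later vehicles.
        for t in range(horizon + 1):
            reserved.add((path[min(t, len(path) - 1)], t))
        for t in range(len(path) - 1):
            reserved_moves.add((path[t], path[t + 1], t))
    return paths


# Two vehicles that must pass each other on a 3x2 grid (assumed scenario).
paths = sequential_plan([((0, 0), (2, 0)), ((2, 0), (0, 0))], 3, 2, horizon=8)
print(paths)
```

Each vehicle is planned once against a fixed set of reservations, so the number of planning problems grows linearly with the number of vehicles, mirroring the scaling argument in the abstract, although the per-vehicle computation here is a grid search rather than a reachability computation.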

preprint2016arXiv

Secure Estimation for Unmanned Aerial Vehicles against Adversarial Cyber Attacks

In the coming years, usage of Unmanned Aerial Vehicles (UAVs) is expected to grow tremendously. Maintaining security of UAVs under cyber attacks is an important yet challenging task, as these attacks are often erratic and difficult to predict. Secure estimation problems study how to estimate the states of a dynamical system from a set of noisy and maliciously corrupted sensor measurements. The fewer assumptions that an estimator makes about the attacker, the larger the set of attacks it can protect the system against. In this paper, we focus on sensor attacks on UAVs and attempt to design a secure estimator for linear time-invariant systems based on as few assumptions about the attackers as possible. We propose a computationally efficient estimator that protects the system against arbitrary and unbounded attacks, where the set of attacked sensors can also change over time. In addition, we propose to combine our secure estimator with a Kalman Filter for improved practical performance and demonstrate its effectiveness through simulations of two scenarios where a UAV is under adversarial cyber attack.

preprint2016arXiv

Secure State Estimation for Nonlinear Power Systems under Cyber Attacks

This paper focuses on securely estimating the state of a nonlinear dynamical system from a set of corrupted measurements. In particular, we consider two broad classes of nonlinear systems, and propose a technique which enables us to perform secure state estimation for such nonlinear systems. We then provide guarantees on the achievable state estimation error against arbitrary corruptions, and analytically characterize the number of errors that can be perfectly corrected by a decoder. To illustrate how the proposed nonlinear estimation approach can be applied to practical systems, we focus on secure estimation for the wide area control of an interconnected power system under cyber-physical attacks and communication failures, and propose a secure estimator for the power system. Finally, we numerically show that the proposed secure estimation algorithm enables us to reconstruct the attack signals accurately.

preprint2015arXiv

Approximation Algorithms for Optimization of Combinatorial Dynamical Systems

This paper considers an optimization problem for a dynamical system whose evolution depends on a collection of binary decision variables. We develop scalable approximation algorithms with provable suboptimality bounds to provide computationally tractable solution methods even when the dimension of the system and the number of the binary variables are large. The proposed method employs a linear approximation of the objective function such that the approximate problem is defined over the feasible space of the binary decision variables, which is a discrete set. To define such a linear approximation, we propose two different variation methods: one uses continuous relaxation of the discrete space and the other uses convex combinations of the vector field and running payoff. The approximate problem is a 0-1 linear program, which can be solved by existing polynomial-time exact or approximation algorithms, and does not require the solution of the dynamical system. Furthermore, we characterize a sufficient condition ensuring the approximate solution has a provable suboptimality bound. We show that this condition can be interpreted as the concavity of the objective function. The performance and utility of the proposed algorithms are demonstrated with the ON/OFF control problems of interdependent refrigeration systems.

preprint2014arXiv

A sampling-based approach to scalable constraint satisfaction in linear sampled-data systems---Part I: Computation

Sampled-data (SD) systems, which are composed of both discrete- and continuous-time components, are arguably one of the most common classes of cyberphysical systems in practice; most modern controllers are implemented on digital platforms while the plant dynamics that are being controlled evolve continuously in time. As with all cyberphysical systems, ensuring hard constraint satisfaction is key in the safe operation of SD systems. A powerful analytical tool for guaranteeing such constraint satisfaction is the viability kernel: the set of all initial conditions for which a safety-preserving control law (that is, a control law that satisfies all input and state constraints) exists. In this paper we present a novel sampling-based algorithm that tightly approximates the viability kernel for high-dimensional sampled-data linear time-invariant (LTI) systems. Unlike prior work in this area, our algorithm formally handles both the discrete and continuous characteristics of SD systems. We prove the correctness and convergence of our approximation technique, provide discussions on heuristic methods to optimally bias the sampling process, and demonstrate the results on a twelve-dimensional flight envelope protection problem.

preprint2014arXiv

Dynamic Contracts with Partial Observations: Application to Indirect Load Control

This paper proposes a method to design an optimal dynamic contract between a principal and an agent, who has the authority to control both the principal's revenue and an engineered system. The key characteristic of our problem setting is that the principal has very limited information: the principal has no capability to monitor the agent's control or the state of the engineered system. The agent has perfect observations. With this asymmetry of information, we show that the principal can induce the agent to control both the revenue and the system processes in a way that maximizes the principal's utility, if the principal offers appropriate real-time and end-time compensation. We reformulate the dynamic contract design problem as a stochastic optimal control of both the engineered system and the agent's future expected payoff, which can be numerically solved using an associated Hamilton-Jacobi-Bellman equation. The performance and usefulness of the proposed contract are demonstrated with an indirect load control problem.

preprint2014arXiv

Reach-Avoid Problems with Time-Varying Dynamics, Targets and Constraints

We consider a reach-avoid differential game, in which one of the players aims to steer the system into a target set without violating a set of state constraints, while the other player tries to prevent the first from succeeding; the system dynamics, target set, and state constraints may all be time-varying. The analysis of this problem plays an important role in collision avoidance, motion planning and aircraft control, among other applications. Previous methods for computing the guaranteed winning initial conditions and strategies for each player have either required augmenting the state vector to include time, or have been limited to problems with either no state constraints or entirely static targets, constraints and dynamics. To incorporate time-varying dynamics, targets and constraints without the need for state augmentation, we propose a modified Hamilton-Jacobi-Isaacs equation in the form of a double-obstacle variational inequality, and prove that the zero sublevel set of its viscosity solution characterizes the capture basin for the target under the state constraints. Through this formulation, our method can compute the capture basin and winning strategies for time-varying games at no additional computational cost with respect to the time-invariant case. We provide an implementation of this method based on well-known numerical schemes and show its convergence through a simple example; we include a second example in which our method substantially outperforms the state augmentation approach.

preprint2014arXiv

Risk-Limiting Dynamic Contracts for Direct Load Control

This paper proposes a novel continuous-time dynamic contract framework that has a risk-limiting capability. If a principal and an agent enter into such a contract, the principal can optimally manage its performance and risk with a guarantee that the agent's risk is less than or equal to a pre-specified level and that the agent's expected payoff is greater than or equal to another pre-specified threshold. We achieve such risk-management capabilities by formulating the contract design problem as mean-variance constrained risk-sensitive control. A dynamic programming-based method is developed to solve the problem. The key idea of our proposed solution method is to reformulate the inequality constraints on the mean and the variance of the agent's payoff as dynamical system constraints by introducing new state and control variables. The reformulations use the martingale representation theorem. The proposed contract method enables us to develop a new direct load control method that provides the load-serving entity with financial risk management solutions in real-time electricity markets. We also propose an approximate decomposition of the optimal contract design problem for multiple customers into multiple low-dimensional contract problems for one customer. This allows the direct load control program to work with a large number of customers without any scalability issues. Furthermore, the contract design procedure can be completely parallelized. The performance and usefulness of the proposed contract method and its application to direct load control are demonstrated using data on the electric energy consumption of customers in Austin, Texas as well as the Electricity Reliability Council of Texas' locational marginal price data.

preprint2013arXiv

Identification of Parameters and Initial Values for Reaction-Diffusion Systems in Protein Networks (Extended Version)

Spatio-temporal biochemical signaling in a large class of protein-protein interaction networks is well modeled by a reaction-diffusion system. The global existence of the solution to the reaction-diffusion system is determined by the reaction kinetics model and the protein network topology. We propose a novel reaction kinetics model that guarantees that the reaction-diffusion system with this model has a nonnegative invariant global classical solution for any network topology. We then present a computational method to identify the unknown parameters and initial values for a reaction-diffusion system with this reaction kinetics model. The identification approach solves an optimization problem that minimizes the cost function defined as the $L^2$-norm of the difference between the data and the solution of the reaction-diffusion system. We utilize an adjoint-based optimal control method to obtain the gradients of the cost function with respect to the parameters and initial values. The regularity of the global classical solutions of the reaction-diffusion system and its corresponding adjoint system avoids situations in which the gradients blow up, and therefore guarantees the success of the identification method for any network structure. Utilizing this gradient information, an efficient algorithm to solve the optimization problem is proposed and applied to estimate the mass diffusivities, rate constants and initial values of a reaction-diffusion system that models protein-protein interactions in a signaling network that regulates the actin cytoskeleton in a malignant breast cell.
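
The identification problem described in this abstract can be written out schematically. The notation below is assumed for illustration and is not taken from the paper: $\theta$ collects the unknown parameters, $u_0$ the unknown initial values, $u^{\mathrm{data}}$ the measurements, and $\Omega \times (0,T)$ the space-time domain over which the $L^2$-norm is taken.

```latex
\min_{\theta,\,u_0}\; J(\theta, u_0)
  = \bigl\| u(\cdot,\cdot\,;\theta,u_0) - u^{\mathrm{data}} \bigr\|_{L^2(\Omega \times (0,T))}
  = \left( \int_0^T \!\! \int_\Omega
      \bigl| u(x,t;\theta,u_0) - u^{\mathrm{data}}(x,t) \bigr|^2 \, dx \, dt \right)^{1/2}
```

Here $u(x,t;\theta,u_0)$ denotes the solution of the reaction-diffusion system for a given choice of parameters and initial values; the adjoint-based method mentioned in the abstract is what supplies the gradients of $J$ with respect to $\theta$ and $u_0$.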

35 published items

Computation of Regions of Attraction for Hybrid Limit Cycles Using Reachability: An Application to Walking Robots

Koopman-Based Neural Lyapunov Functions for General Attractors

Risk-sensitive safety analysis using Conditional Value-at-Risk

Stability and Robustness of a Hybrid Control Law for the Half-bridge Inverter

FaSTrack: a Modular Framework for Fast and Guaranteed Safe Motion Planning

Safety and Liveness Guarantees through Reach-Avoid Reinforcement Learning

A Hamilton-Jacobi Reachability-Based Framework for Predicting and Analyzing Human Motion for Safe Planning

An Iterative Quadratic Method for General-Sum Differential Games with Feedback Linearizable Dynamics

Dynamically Computing Adversarial Perturbations for Recurrent Neural Networks

Efficient Iterative Linear-Quadratic Approximations for Nonlinear Multi-Player General-Sum Differential Games

Feedback Linearization for Unknown Systems via Reinforcement Learning

Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning

Inference-Based Strategy Alignment for General-Sum Differential Games

Reinforcement Learning for Safety-Critical Control under Model Uncertainty, using Control Lyapunov Functions and Control Barrier Functions

Risk-sensitive safety specifications for stochastic systems using Conditional Value-at-Risk

Technical Report: Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning

A Classification-based Approach for Approximate Reachability

Building Model Identification during Regular Operation - Empirical Results and Challenges

Exact and Efficient Hamilton-Jacobi Reachability for Decoupled Systems

Exact and Efficient Hamilton-Jacobi-based Guaranteed Safety Analysis via System Decomposition

Learning Quadrotor Dynamics Using Neural Network for Flight Control

Model Comparison of a Data-Driven and a Physical Model for Simulating HVAC Systems

Multiplayer Reach-Avoid Games via Pairwise Outcomes

Plug-and-Play Model Predictive Control for Load Shaping and Voltage Control in Smart Grids

Robust Sequential Path Planning Under Disturbances and Adversarial Intruder

Safe Platooning of Unmanned Aerial Vehicles via Reachability

Safe Sequential Path Planning of Multi-Vehicle Systems via Double-Obstacle Hamilton-Jacobi-Isaacs Variational Inequality

Secure Estimation for Unmanned Aerial Vehicles against Adversarial Cyber Attacks

Secure State Estimation for Nonlinear Power Systems under Cyber Attacks

Approximation Algorithms for Optimization of Combinatorial Dynamical Systems

A sampling-based approach to scalable constraint satisfaction in linear sampled-data systems---Part I: Computation

Dynamic Contracts with Partial Observations: Application to Indirect Load Control

Reach-Avoid Problems with Time-Varying Dynamics, Targets and Constraints

Risk-Limiting Dynamic Contracts for Direct Load Control

Identification of Parameters and Initial Values for Reaction-Diffusion Systems in Protein Networks (Extended Version)