Source author record

Geir Dullerud

Geir Dullerud appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning math.OC Systems and Control Cryptography and Security eess.SY math.DS Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

8works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Model-Free $μ$ Synthesis via Adversarial Reinforcement Learning

Motivated by the recent empirical success of policy-based reinforcement learning (RL), there has been a research trend studying the performance of policy-based RL methods on standard control benchmark problems. In this paper, we examine the effectiveness of policy-based RL methods on an important robust control problem, namely $μ$ synthesis. We build a connection between robust adversarial RL and $μ$ synthesis, and develop a model-free version of the well-known $DK$-iteration for solving state-feedback $μ$ synthesis with static $D$-scaling. In the proposed algorithm, the $K$ step mimics the classical central path algorithm via incorporating a recently-developed double-loop adversarial RL method as a subroutine, and the $D$ step is based on model-free finite difference approximation. Extensive numerical study is also presented to demonstrate the utility of our proposed model-free algorithm. Our study sheds new light on the connections between adversarial RL and robust control.

preprint2022arXiv

Revisiting PGD Attacks for Stability Analysis of Large-Scale Nonlinear Systems and Perception-Based Control

Many existing region-of-attraction (ROA) analysis tools find difficulty in addressing feedback systems with large-scale neural network (NN) policies and/or high-dimensional sensing modalities such as cameras. In this paper, we tailor the projected gradient descent (PGD) attack method developed in the adversarial learning community as a general-purpose ROA analysis tool for large-scale nonlinear systems and end-to-end perception-based control. We show that the ROA analysis can be approximated as a constrained maximization problem whose goal is to find the worst-case initial condition which shifts the terminal state the most. Then we present two PGD-based iterative methods which can be used to solve the resultant constrained maximization problem. Our analysis is not based on Lyapunov theory, and hence requires minimum information of the problem structures. In the model-based setting, we show that the PGD updates can be efficiently performed using back-propagation. In the model-free setting (which is more relevant to ROA analysis of perception-based control), we propose a finite-difference PGD estimate which is general and only requires a black-box simulator for generating the trajectories of the closed-loop system given any initial state. We demonstrate the scalability and generality of our analysis tool on several numerical examples with large-scale NN policies and high-dimensional image observations. We believe that our proposed analysis serves as a meaningful initial step toward further understanding of closed-loop stability of large-scale nonlinear systems and perception-based control.

preprint2020arXiv

Convergence Guarantees of Policy Optimization Methods for Markovian Jump Linear Systems

Recently, policy optimization for control purposes has received renewed attention due to the increasing interest in reinforcement learning. In this paper, we investigate the convergence of policy optimization for quadratic control of Markovian jump linear systems (MJLS). First, we study the optimization landscape of direct policy optimization for MJLS, and, in particular, show that despite the non-convexity of the resultant problem the unique stationary point is the global optimal solution. Next, we prove that the Gauss-Newton method and the natural policy gradient method converge to the optimal state feedback controller for MJLS at a linear rate if initialized at a controller which stabilizes the closed-loop dynamics in the mean square sense. We propose a novel Lyapunov argument to fix a key stability issue in the convergence proof. Finally, we present a numerical example to support our theory. Our work brings new insights for understanding the performance of policy learning methods on controlling unknown MJLS.

preprint2020arXiv

Policy Learning of MDPs with Mixed Continuous/Discrete Variables: A Case Study on Model-Free Control of Markovian Jump Systems

Markovian jump linear systems (MJLS) are an important class of dynamical systems that arise in many control applications. In this paper, we introduce the problem of controlling unknown (discrete-time) MJLS as a new benchmark for policy-based reinforcement learning of Markov decision processes (MDPs) with mixed continuous/discrete state variables. Compared with the traditional linear quadratic regulator (LQR), our proposed problem leads to a special hybrid MDP (with mixed continuous and discrete variables) and poses significant new challenges due to the appearance of an underlying Markov jump parameter governing the mode of the system dynamics. Specifically, the state of a MJLS does not form a Markov chain and hence one cannot study the MJLS control problem as a MDP with solely continuous state variable. However, one can augment the state and the jump parameter to obtain a MDP with a mixed continuous/discrete state space. We discuss how control theory sheds light on the policy parameterization of such hybrid MDPs. Then we modify the widely used natural policy gradient method to directly learn the optimal state feedback control policy for MJLS without identifying either the system dynamics or the transition probability of the switching parameter. We implement the (data-driven) natural policy gradient method on different MJLS examples. Our simulation results suggest that the natural gradient method can efficiently learn the optimal controller for MJLS with unknown dynamics.

preprint2016arXiv

Extremal storage functions and minimal realizations of discrete-time linear switching systems

We study the $\mathcal{L}_p$ induced gain of discrete-time linear switching systems with graph-constrained switching sequences. We first prove that, for stable systems in a minimal realization, for every $p \geq 1$, the $\mathcal{L}_p$-gain is exactly characterized through switching storage functions. These functions are shown to be the $p$th power of a norm. In order to consider general systems, we provide an algorithm for computing minimal realizations. These realizations are \emph{rectangular systems}, with a state dimension that varies according to the mode of the system. We apply our tools to the study on the of $\mathcal{L}_2$-gain. We provide algorithms for its approximation, and provide a converse result for the existence of quadratic switching storage functions. We finally illustrate the results with a physically motivated example.

preprint2016arXiv

Stability of discrete-time switching systems with constrained switching sequences

We introduce a novel framework for the stability analysis of discrete-time linear switching systems with switching sequences constrained by an automaton. The key element of the framework is the algebraic concept of multinorm, which associates a different norm per node of the automaton, and allows to exactly characterize stability. Building upon this tool, we develop the first arbitrarily accurate approximation schemes for estimating the constrained joint spectral radius r, that is the exponential growth rate of a switching system with constrained switching sequences. More precisely, given a relative accuracy a > 0, the algorithms compute an estimate of r within the range [r; (1 + a)r]. These algorithms amount to solve a well defined convex optimization program with known time-complexity, and whose size depends on the desired relative accuracy a > 0.

preprint2015arXiv

Controller Synthesis for Linear Time-varying Systems with Adversaries

We present a controller synthesis algorithm for a discrete time reach-avoid problem in the presence of adversaries. Our model of the adversary captures typical malicious attacks envisioned on cyber-physical systems such as sensor spoofing, controller corruption, and actuator intrusion. After formulating the problem in a general setting, we present a sound and complete algorithm for the case with linear dynamics and an adversary with a budget on the total L2-norm of its actions. The algorithm relies on a result from linear control theory that enables us to decompose and precisely compute the reachable states of the system in terms of a symbolic simulation of the adversary-free dynamics and the total uncertainty induced by the adversary. With this decomposition, the synthesis problem eliminates the universal quantifier on the adversary's choices and the symbolic controller actions can be effectively solved using an SMT solver. The constraints induced by the adversary are computed by solving second-order cone programmings. The algorithm is later extended to synthesize state-dependent controller and to generate attacks for the adversary. We present preliminary experimental results that show the effectiveness of this approach on several example problems.

preprint2012arXiv

Differentially Private Iterative Synchronous Consensus

The iterative consensus problem requires a set of processes or agents with different initial values, to interact and update their states to eventually converge to a common value. Protocols solving iterative consensus serve as building blocks in a variety of systems where distributed coordination is required for load balancing, data aggregation, sensor fusion, filtering, clock synchronization and platooning of autonomous vehicles. In this paper, we introduce the private iterative consensus problem where agents are required to converge while protecting the privacy of their initial values from honest but curious adversaries. Protecting the initial states, in many applications, suffice to protect all subsequent states of the individual participants. First, we adapt the notion of differential privacy in this setting of iterative computation. Next, we present a server-based and a completely distributed randomized mechanism for solving private iterative consensus with adversaries who can observe the messages as well as the internal states of the server and a subset of the clients. Finally, we establish the tradeoff between privacy and the accuracy of the proposed randomized mechanism.