Source author record

Gautam Goel

Gautam Goel appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, math.OC, eess.SY, math.DS, Systems and Control

Catalog footprint

What is connected

4 works

5 topics

3 close collaborators

Preprint · 2022 · arXiv

Measurement-Feedback Control with Optimal Data-Dependent Regret

Inspired by online learning, data-dependent regret has recently been proposed as a criterion for controller design. In the regret-optimal control paradigm, causal controllers are designed to minimize regret against a hypothetical optimal noncausal controller, which selects the globally cost-minimizing sequence of control actions given noncausal access to the disturbance sequence. We extend regret-optimal control to the more challenging measurement-feedback setting, where the online controller must compete against the optimal noncausal controller without directly observing the state or the driving disturbance. We show that no measurement-feedback controller can have bounded competitive ratio or regret which is bounded by the pathlength of the measurement disturbance. We do derive, however, a controller whose regret has optimal dependence on the joint energy of the driving and measurement disturbances, and another controller whose regret has optimal dependence on the pathlength of the driving disturbance and the energy of the measurement disturbance. The key technique we introduce is a reduction from regret-optimal measurement-feedback control to $H_{\infty}$-optimal measurement-feedback control in a synthetic system. We present numerical simulations which illustrate the efficacy of our proposed control algorithms.
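The abstract measures disturbances by their energy and their pathlength. The sketch below shows how the two quantities differ, assuming energy is the sum of squared norms and pathlength is the sum of squared norms of consecutive differences. That convention, and the helper names `energy` and `pathlength`, are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def energy(w):
    """Sum of squared Euclidean norms of the disturbance vectors."""
    w = np.asarray(w, dtype=float)
    return float(np.sum(w ** 2))

def pathlength(w):
    """Sum of squared norms of consecutive differences (assumed convention)."""
    w = np.asarray(w, dtype=float)
    return float(np.sum(np.diff(w, axis=0) ** 2))

# Rows are time steps, columns are disturbance coordinates.
constant = np.ones((50, 2))                                      # never changes
alternating = np.array([[(-1.0) ** t, 0.0] for t in range(50)])  # flips sign every step

for name, w in [("constant", constant), ("alternating", alternating)]:
    print(name, "energy:", energy(w), "pathlength:", pathlength(w))
```

A constant disturbance has large energy but zero pathlength, while a rapidly alternating one has large pathlength, so a bound in one quantity can be much tighter than a bound in the other depending on the disturbance.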

Preprint · 2021 · arXiv

Regret-optimal control in dynamic environments

We consider control in linear time-varying dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing an online controller which minimizes regret against the best dynamic sequence of control actions selected in hindsight (dynamic regret), instead of the best fixed controller in some specific class of controllers (static regret). This formulation is attractive when the environment changes over time and no single controller achieves good performance over the entire time horizon. We derive the state-space structure of the regret-optimal controller via a novel reduction to $H_{\infty}$ control and present a tight data-dependent bound on its regret in terms of the energy of the disturbance. Our results easily extend to the model-predictive setting where the controller can anticipate future disturbances and to settings where the controller only affects the system dynamics after a fixed delay. We present numerical experiments which show that our regret-optimal controller interpolates between the performance of the $H_2$-optimal and $H_{\infty}$-optimal controllers across stochastic and adversarial environments.
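The distinction between dynamic and static regret can be written out as follows. The notation is an assumption for illustration and is not taken from the paper: $c_t$ is the per-step cost, $(x_t, u_t)$ are the state and action of the online controller, $w$ is the disturbance sequence, $(x'_t, u'_t)$ is the trajectory produced by a comparison action sequence under the same disturbance, and $\Pi$ is a fixed class of controllers.

```latex
\begin{align*}
\text{DynamicRegret}_T(w) &= \sum_{t=0}^{T-1} c_t(x_t, u_t)
  \;-\; \min_{u'_0, \dots, u'_{T-1}} \sum_{t=0}^{T-1} c_t(x'_t, u'_t), \\
\text{StaticRegret}_T(w) &= \sum_{t=0}^{T-1} c_t(x_t, u_t)
  \;-\; \min_{\pi \in \Pi} \sum_{t=0}^{T-1} c_t(x^{\pi}_t, u^{\pi}_t).
\end{align*}
```

Every controller in $\Pi$ generates some action sequence, so the dynamic benchmark is at least as strong as the static one. Dynamic regret is therefore never smaller than static regret.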

Preprint · 2020 · arXiv

Online Optimization with Predictions and Non-convex Losses

We study online optimization in a setting where an online learner seeks to optimize a per-round hitting cost, which may be non-convex, while incurring a movement cost when changing actions between rounds. We ask: under what general conditions is it possible for an online learner to leverage predictions of future cost functions in order to achieve near-optimal costs? Prior work has provided near-optimal online algorithms for specific combinations of assumptions about hitting and switching costs, but no general results are known. In this work, we give two general sufficient conditions that specify a relationship between the hitting and movement costs which guarantees that a new algorithm, Synchronized Fixed Horizon Control (SFHC), provides a $1+O(1/w)$ competitive ratio, where $w$ is the number of predictions available to the learner. Our conditions do not require the cost functions to be convex, and we also derive competitive ratio results for non-convex hitting and movement costs. Our results provide the first constant, dimension-free competitive ratio for online non-convex optimization with movement costs. Further, we give an example of a natural instance, Convex Body Chasing (CBC), where the sufficient conditions are not satisfied and we can prove that no online algorithm can have a competitive ratio that converges to 1.
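As a rough illustration of the cost model and of what a competitive ratio measures, the sketch below compares a prediction-free greedy learner against a brute-force offline optimum on a tiny instance. This is not SFHC. The greedy rule, the absolute-value movement cost, the three-point action set, and all function names are assumptions chosen to keep the example small.

```python
from itertools import product

def total_cost(actions, hitting_costs, start=0.0):
    """Sum of per-round hitting costs plus |x_t - x_{t-1}| movement costs."""
    cost, prev = 0.0, start
    for x, f in zip(actions, hitting_costs):
        cost += f(x) + abs(x - prev)
        prev = x
    return cost

def offline_optimum(hitting_costs, candidates, start=0.0):
    """Brute-force the cheapest action sequence with full hindsight."""
    sequences = product(candidates, repeat=len(hitting_costs))
    return min(total_cost(seq, hitting_costs, start) for seq in sequences)

def greedy(hitting_costs, candidates, start=0.0):
    """Each round, minimise the current hitting plus movement cost (no predictions)."""
    actions, prev = [], start
    for f in hitting_costs:
        prev = min(candidates, key=lambda x: f(x) + abs(x - prev))
        actions.append(prev)
    return actions

# Non-convex hitting costs: a double well in x, tilted towards a target c.
targets = [1.0, -1.0, -1.0, -1.0]
hitting_costs = [lambda x, c=c: (x * x - 1) ** 2 + 0.3 * (x - c) ** 2 for c in targets]
candidates = [-1.0, 0.0, 1.0]

online_cost = total_cost(greedy(hitting_costs, candidates), hitting_costs)
optimal_cost = offline_optimum(hitting_costs, candidates)
ratio = online_cost / optimal_cost
print("greedy cost:", online_cost, "offline optimum:", optimal_cost, "ratio:", ratio)
```

On this instance the greedy learner commits to the first well and then pays hitting costs in every later round, which is the kind of mistake that access to predictions of future cost functions is meant to prevent.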

Preprint · 2020 · arXiv

The Power of Linear Controllers in LQR Control

The Linear Quadratic Regulator (LQR) framework considers the problem of regulating a linear dynamical system perturbed by environmental noise. We compute the policy regret between three distinct control policies: i) the optimal online policy, whose linear structure is given by the Riccati equations; ii) the optimal offline linear policy, which is the best linear state feedback policy given the noise sequence; and iii) the optimal offline policy, which selects the globally optimal control actions given the noise sequence. We fully characterize the optimal offline policy and show that it has a recursive form in terms of the optimal online policy and future disturbances. We also show that the cost of the optimal offline linear policy converges to the cost of the optimal online policy as the time horizon grows large, and consequently the optimal offline linear policy incurs linear regret relative to the optimal offline policy, even in the optimistic setting where the noise is drawn i.i.d. from a known distribution. Although we focus on the setting where the noise is stochastic, our results also imply new lower bounds on the policy regret achievable when the noise is chosen by an adaptive adversary.
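The optimal online policy in item i) is the standard finite-horizon LQR controller, which can be sketched with a backward Riccati recursion. The system matrices, the choice of terminal cost equal to `Q`, and the function names below are assumptions for illustration. The offline policies in items ii) and iii) are not implemented here.

```python
import numpy as np

def lqr_gains(A, B, Q, R, T):
    """Backward Riccati recursion; returns K_0..K_{T-1} for u_t = -K_t x_t."""
    P = Q.copy()  # terminal cost matrix, assumed equal to Q
    gains = []
    for _ in range(T):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]  # computed backwards in time, so reverse

def rollout_cost(A, B, Q, R, gains, noise, x0):
    """Quadratic cost of linear state feedback under x_{t+1} = A x_t + B u_t + w_t."""
    x, cost = x0, 0.0
    for K, w in zip(gains, noise):
        u = -K @ x
        cost += float(x @ Q @ x + u @ R @ u)
        x = A @ x + B @ u + w
    return cost + float(x @ Q @ x)  # terminal cost

# Hypothetical double-integrator example.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
T = 30
x0 = np.array([1.0, 0.0])

rng = np.random.default_rng(0)
noise = 0.1 * rng.normal(size=(T, 2))

gains = lqr_gains(A, B, Q, R, T)
no_control = [np.zeros((1, 2))] * T
print("LQR cost:       ", rollout_cost(A, B, Q, R, gains, noise, x0))
print("no-control cost:", rollout_cost(A, B, Q, R, no_control, noise, x0))
```

The gains depend only on the system and cost matrices, not on the realized noise, which is what makes this policy causal. The offline policies in the abstract additionally use the noise sequence itself.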

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint