Source author record

Claire Tomlin

Claire Tomlin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Systems and Control, Machine Learning, Robotics, math.OC, eess.SY, Artificial Intelligence, Computer Vision, Human-Computer Interaction, math.ST, Multiagent Systems, Statistics Theory, math.DS

Catalog footprint

What is connected

28 works

12 topics

4 close collaborators

preprint, 2025, arXiv

Competency-Aware Planning for Probabilistically Safe Navigation Under Perception Uncertainty

Perception-based navigation systems are useful for unmanned ground vehicle (UGV) navigation in complex terrains, where traditional depth-based navigation schemes are insufficient. However, these data-driven methods are highly dependent on their training data and can fail in surprising and dramatic ways with little warning. To ensure the safety of the vehicle and the surrounding environment, it is imperative that the navigation system is able to recognize the predictive uncertainty of the perception model and respond safely and effectively in the face of uncertainty. In an effort to enable safe navigation under perception uncertainty, we develop a probabilistic and reconstruction-based competency estimation (PaRCE) method to estimate the model's level of familiarity with an input image as a whole and with specific regions in the image. We find that the overall competency score can accurately distinguish among correctly classified, misclassified, and out-of-distribution (OOD) samples. We also confirm that the regional competency maps can accurately distinguish between familiar and unfamiliar regions across images. We then use this competency information to develop a planning and control scheme that enables effective navigation while maintaining a low probability of error. We find that the competency-aware scheme greatly reduces the number of collisions with unfamiliar obstacles, compared to a baseline controller with no competency awareness. Furthermore, the regional competency information is very valuable in enabling efficient navigation.

preprint, 2025, arXiv

Explaining Low Perception Model Competency with High-Competency Counterfactuals

There exist many methods to explain how an image classification model generates its decision, but very little work has explored methods to explain why a classifier might lack confidence in its prediction. As there are various reasons the classifier might lose confidence, it would be valuable for this model to not only indicate its level of uncertainty but also explain why it is uncertain. Counterfactual images have been used to visualize changes that could be made to an image to generate a different classification decision. In this work, we explore the use of counterfactuals to offer an explanation for low model competency--a generalized form of predictive uncertainty that measures confidence. Toward this end, we develop five novel methods to generate high-competency counterfactual images, namely Image Gradient Descent (IGD), Feature Gradient Descent (FGD), Autoencoder Reconstruction (Reco), Latent Gradient Descent (LGD), and Latent Nearest Neighbors (LNN). We evaluate these methods across two unique datasets containing images with six known causes for low model competency and find Reco, LGD, and LNN to be the most promising methods for counterfactual generation. We further evaluate how these three methods can be utilized by pre-trained Multimodal Large Language Models (MLLMs) to generate language explanations for low model competency. We find that the inclusion of a counterfactual image in the language model query greatly increases the ability of the model to generate an accurate explanation for the cause of low model competency, thus demonstrating the utility of counterfactual images in explaining low perception model competency.

preprint, 2025, arXiv

PaRCE: Probabilistic and Reconstruction-based Competency Estimation for CNN-based Image Classification

Convolutional neural networks (CNNs) are extremely popular and effective for image classification tasks but tend to be overly confident in their predictions. Various works have sought to quantify uncertainty associated with these models, detect out-of-distribution (OOD) inputs, or identify anomalous regions in an image, but limited work has sought to develop a holistic approach that can accurately estimate perception model confidence across various sources of uncertainty. We develop a probabilistic and reconstruction-based competency estimation (PaRCE) method and compare it to existing approaches for uncertainty quantification and OOD detection. We find that our method can best distinguish between correctly classified, misclassified, and OOD samples with anomalous regions, as well as between samples with visual image modifications resulting in high, medium, and low prediction accuracy. We describe how to extend our approach for anomaly localization tasks and demonstrate the ability of our approach to distinguish between regions in an image that are familiar to the perception model from those that are unfamiliar. We find that our method generates interpretable scores that most reliably capture a holistic notion of perception model confidence.

preprint, 2023, arXiv

Cost Inference for Feedback Dynamic Games from Noisy Partial State Observations and Incomplete Trajectories

In multi-agent dynamic games, the Nash equilibrium state trajectory of each agent is determined by its cost function and the information pattern of the game. However, the cost and trajectory of each agent may be unavailable to the other agents. Prior work on using partial observations to infer the costs in dynamic games assumes an open-loop information pattern. In this work, we demonstrate that the feedback Nash equilibrium concept is more expressive and encodes more complex behavior. It is desirable to develop specific tools for inferring players' objectives in feedback games. Therefore, we consider the dynamic game cost inference problem under the feedback information pattern, using only partial state observations and incomplete trajectory data. To this end, we first propose an inverse feedback game loss function, whose minimizer yields a feedback Nash equilibrium state trajectory closest to the observation data. We characterize the landscape and differentiability of the loss function. Given the difficulty of obtaining the exact gradient, our main contribution is an efficient gradient approximator, which enables a novel inverse feedback game solver that minimizes the loss using first-order optimization. In thorough empirical evaluations, we demonstrate that our algorithm converges reliably and has better robustness and generalization performance than the open-loop baseline method when the observation data reflects a group of players acting in a feedback Nash game.

preprint, 2022, arXiv

Inducing Structure in Reward Learning by Learning Features

Reward learning enables robots to learn adaptable behaviors from human input. Traditional methods model the reward as a linear function of hand-crafted features, but that requires specifying all the relevant features a priori, which is impossible for real-world tasks. To get around this issue, recent deep Inverse Reinforcement Learning (IRL) methods learn rewards directly from the raw state but this is challenging because the robot has to implicitly learn the features that are important and how to combine them, simultaneously. Instead, we propose a divide and conquer approach: focus human input specifically on learning the features separately, and only then learn how to combine them into a reward. We introduce a novel type of human input for teaching features and an algorithm that utilizes it to learn complex features from the raw state space. The robot can then learn how to combine them into a reward using demonstrations, corrections, or other reward learning frameworks. We demonstrate our method in settings where all features have to be learned from scratch, as well as where some of the features are known. By first focusing human input specifically on the feature(s), our method decreases sample complexity and improves generalization of the learned reward over a deepIRL baseline. We show this in experiments with a physical 7DOF robot manipulator, as well as in a user study conducted in a simulated environment.

preprint, 2022, arXiv

Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control

Learned models and policies can generalize effectively when evaluated within the distribution of the training data, but can produce unpredictable and erroneous outputs on out-of-distribution inputs. In order to avoid distribution shift when deploying learning-based control algorithms, we seek a mechanism to constrain the agent to states and actions that resemble those that it was trained on. In control theory, Lyapunov stability and control-invariant sets allow us to make guarantees about controllers that stabilize the system around specific states, while in machine learning, density models allow us to estimate the training data distribution. Can we combine these two concepts, producing learning-based control algorithms that constrain the system to in-distribution states using only in-distribution actions? In this work, we propose to do this by combining concepts from Lyapunov stability and density estimation, introducing Lyapunov density models: a generalization of control Lyapunov functions and density models that provides guarantees on an agent's ability to stay in-distribution over its entire trajectory.

preprint, 2021, arXiv

Feature Expansive Reward Learning: Rethinking Human Input

When a person is not satisfied with how a robot performs a task, they can intervene to correct it. Reward learning methods enable the robot to adapt its reward function online based on such human input, but they rely on handcrafted features. When the correction cannot be explained by these features, recent work in deep Inverse Reinforcement Learning (IRL) suggests that the robot could ask for task demonstrations and recover a reward defined over the raw state space. Our insight is that rather than implicitly learning about the missing feature(s) from demonstrations, the robot should instead ask for data that explicitly teaches it about what it is missing. We introduce a new type of human input in which the person guides the robot from states where the feature being taught is highly expressed to states where it is not. We propose an algorithm for learning the feature from the raw state space and integrating it into the reward function. By focusing the human input on the missing feature, our method decreases sample complexity and improves generalization of the learned reward over the above deep IRL baseline. We show this in experiments with a physical 7DOF robot manipulator, as well as in a user study conducted in a simulated environment.

preprint, 2021, arXiv

Visual Navigation Among Humans with Optimal Control as a Supervisor

Real-world visual navigation requires robots to operate in unfamiliar, human-occupied dynamic environments. Navigation around humans is especially difficult because it requires anticipating their future motion, which can be quite challenging. We propose an approach that combines learning-based perception with model-based optimal control to navigate among humans based only on monocular, first-person RGB images. Our approach is enabled by our novel data-generation tool, HumANav, which allows for photorealistic renderings of indoor environment scenes with humans in them, which are then used to train the perception module entirely in simulation. Through simulations and experiments on a mobile robot, we demonstrate that the learned navigation policies can anticipate and react to humans without explicitly predicting future human motion, generalize to previously unseen environments and human behaviors, and transfer directly from simulation to reality. Videos describing our approach and experiments, as well as a demo of HumANav, are available on the project website.

preprint, 2020, arXiv

A Successive-Elimination Approach to Adaptive Robotic Sensing

We study an adaptive source seeking problem, in which a mobile robot must identify the strongest emitter(s) of a signal in an environment with background emissions. Background signals may be highly heterogeneous and can mislead algorithms that are based on receding horizon control. We propose AdaSearch, a general algorithm for adaptive source seeking in the face of heterogeneous background noise. AdaSearch combines global trajectory planning with principled confidence intervals in order to concentrate measurements in promising regions while guaranteeing sufficient coverage of the entire area. Theoretical analysis shows that AdaSearch confers gains over a uniform sampling strategy when the distribution of background signals is highly variable. Simulation experiments demonstrate that when applied to the problem of radioactive source seeking, AdaSearch outperforms both uniform sampling and a receding time horizon information-maximization approach based on the current literature. We also demonstrate AdaSearch in hardware, providing further evidence of its potential for real-time implementation.

preprint, 2020, arXiv

Customized Local Differential Privacy for Multi-Agent Distributed Optimization

Real-time data-driven optimization and control problems over networks may require sensitive information of participating users to calculate solutions and decision variables, such as in traffic or energy systems. Adversaries with access to coordination signals may potentially decode information on individual users and put user privacy at risk. We develop local differential privacy, which is a strong notion that guarantees user privacy regardless of any auxiliary information an adversary may have, for a larger family of convex distributed optimization problems. The mechanism allows agents to customize their own privacy level based on local needs and parameter sensitivities. We propose a general sampling-based approach for determining sensitivity and derive analytical bounds for specific quadratic problems. We analyze inherent trade-offs between privacy and suboptimality and propose allocation schemes to divide the maximum allowable noise, a privacy budget, among all participating agents. Our algorithm is implemented to enable privacy in distributed optimal power flow for electric grids.
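The idea of each agent choosing its own privacy level can be illustrated with a minimal sketch. The abstract does not say which noise mechanism the paper uses, so this example assumes the textbook Laplace mechanism as a stand-in: an agent perturbs its own report with noise of scale sensitivity/epsilon before sharing it. The names `laplace_noise` and `privatize` are hypothetical and are not taken from the paper.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(value, sensitivity, epsilon, rng):
    """Perturb an agent's local value before it is shared (assumed mechanism).

    `sensitivity` bounds how much the value can change between neighboring
    local datasets; `epsilon` is the agent's own privacy level. A smaller
    epsilon means more noise and stronger privacy.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical agents with different privacy needs.
    print(privatize(3.0, sensitivity=1.0, epsilon=0.5, rng=rng))
    print(privatize(3.0, sensitivity=1.0, epsilon=5.0, rng=rng))
```

The agent with the smaller epsilon adds noise with a larger scale, which is the privacy versus suboptimality trade-off the abstract refers to.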

preprint, 2020, arXiv

Eyes-Closed Safety Kernels: Safety for Autonomous Systems Under Loss of Observability

A framework is presented for handling a potential loss of observability of a dynamical system in a provably-safe way. Inspired by the fragility of data-driven perception systems used by autonomous vehicles, we formulate the problem that arises when a sensing modality fails or is found to be untrustworthy during autonomous operation. We cast this problem as a differential game played between the dynamical system being controlled and the external system factor(s) for which observations are lost. The game is a zero-sum Stackelberg game in which the controlled system (leader) is trying to find a trajectory which maximizes a function representing the safety of the system, and the unobserved factor (follower) is trying to minimize the same function. The set of winning initial configurations of this game for the controlled system represents the set of all states in which safety can be maintained with respect to the external factor, even if observability of that factor is lost. This is the set we refer to as the Eyes-Closed Safety Kernel. In practical use, the policy defined by the winning strategy of the controlled system needs to be executed only when observability of the external system is lost or the system deviates from the Eyes-Closed Safety Kernel due to other, non-safety oriented control schemes. We present a means for solving this game offline, such that the resulting winning strategy can be used for computationally efficient, provably-safe, online control when needed. The solution approach presented is based on representing the game using the solutions of two Hamilton-Jacobi partial differential equations. We illustrate the applicability of our framework by working through a realistic example in which an autonomous car must avoid a dynamic obstacle despite potentially losing observability.

preprint, 2020, arXiv

Generating Robust Supervision for Learning-Based Visual Navigation Using Hamilton-Jacobi Reachability

In Bansal et al. (2019), a novel visual navigation framework that combines learning-based and model-based approaches was proposed. Specifically, a Convolutional Neural Network (CNN) predicts a waypoint that is used by the dynamics model for planning and tracking a trajectory to the waypoint. However, the CNN inevitably makes prediction errors, which often lead to collisions in cluttered and tight spaces. In this paper, we present a novel Hamilton-Jacobi (HJ) reachability-based method to generate supervision for the CNN for waypoint prediction in an unseen environment. By modeling CNN prediction error as "disturbances" in the robot's dynamics, our generated waypoints are robust to these disturbances, and consequently to the prediction errors. Moreover, using globally optimal HJ reachability analysis leads to predicting waypoints that are time-efficient and avoid greedy behavior. Through simulations and hardware experiments, we demonstrate the advantages of the proposed approach on navigating through cluttered, narrow indoor environments.

preprint, 2019, arXiv

An Insect-scale Self-sufficient Rolling Microrobot

We design an insect-sized rolling microrobot driven by continuously rotating wheels. It measures 18mm$\times$8mm$\times$8mm. There are 2 versions of the robot: a 96mg laser-powered one and a 130mg supercapacitor-powered one. The robot can move at 27mm/s (1.5 body lengths per second) with wheels rotating at 300$^\circ$/s, while consuming an average power of 2.5mW. Neither version has any electrical wires coming out of it, and the supercapacitor-powered robot is also self-sufficient, able to roll freely for 8 seconds after a single charge. Low-voltage electromagnetic actuators (1V-3V) along with a novel double-ratcheting mechanism enable the operation of this device. It is, to the best of our knowledge, the lightest and fastest self-sufficient rolling microrobot reported yet.

preprint, 2019, arXiv

Design of the First Insect-scale Spinning-wing Robot

Here we present the design of an insect-scale microrobot that generates lift by spinning its wings. This is in contrast to most other microrobot designs at this size scale which rely on flapping wings to produce lift. The robot has a wing span of 4 centimeters and weighs 133 milligrams. It spins its wings at 47 revolutions/second generating $>$ 138 milligrams of lift while consuming approximately 60 milliwatts of total power and operating at a low voltage ($<$ 3 V). Of the total power consumed 8.8 milliwatts is mechanical power generated, part of which goes towards spinning the wings, and 51 milliwatts is wasted in resistive Joule heating. With a lift-to-power ratio of 2.3 grams/W, its performance is at par with the best reported flapping wing devices at the insect-scale.
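The quoted lift-to-power ratio and power split can be checked directly from the figures in the abstract. The lift is reported as a lower bound, so the ratio below uses 138 milligrams and is itself a lower bound:

```latex
\[
\frac{138\ \text{mg}}{60\ \text{mW}}
  = \frac{0.138\ \text{g}}{0.060\ \text{W}}
  = 2.3\ \text{g/W},
\qquad
8.8\ \text{mW} + 51\ \text{mW} = 59.8\ \text{mW} \approx 60\ \text{mW}.
\]
```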

preprint, 2019, arXiv

Design of the first sub-milligram flapping wing aerial vehicle

Here we report the first sub-milligram flapping wing vehicle which is able to mimic insect wing kinematics. A wing stroke amplitude of 90$^\circ$ and a wing pitch amplitude of 80$^\circ$ are demonstrated. This is also the smallest wing-span (single wing length of 3.5mm) device reported yet and is at the same mass-scale as a fruit fly. Assembly has been made simple and requires gluing together 5 components, in contrast to the higher part count and intensive assembly of other milligram-scale microrobots. This increases the fabrication speed and success rate of the fully fabricated device. Low operational voltages (70mV) make testing easier and will enable eventual deployment of autonomous sub-milligram aerial vehicles.

preprint, 2017, arXiv

On Identification of Distribution Grids

Large-scale integration of distributed energy resources into residential distribution feeders necessitates careful control of their operation through power flow analysis. While the knowledge of the distribution system model is crucial for this type of analysis, it is often unavailable or outdated. The recent introduction of synchrophasor technology in low-voltage distribution grids has created an unprecedented opportunity to learn this model from high-precision, time-synchronized measurements of voltage and current phasors at various locations. This paper focuses on joint estimation of model parameters (admittance values) and operational structure of a poly-phase distribution network from the available telemetry data via the lasso, a method for regression shrinkage and selection. We propose tractable convex programs capable of tackling the low rank structure of the distribution system and develop an online algorithm for early detection and localization of critical events that induce a change in the admittance matrix. The efficacy of these techniques is corroborated through power flow studies on four three-phase radial distribution systems serving real household demands.

preprint, 2016, arXiv

A Bayesian Perspective on Residential Demand Response Using Smart Meter Data

The widespread deployment of Advanced Metering Infrastructure has made granular data of residential electricity consumption available on a large scale. Smart meters enable a two way communication between residential customers and utilities. One field of research that relies on such granular consumption data is Residential Demand Response, where individual users are incentivized to temporarily reduce their consumption during periods of high marginal cost of electricity. To quantify the economic potential of Residential Demand Response, it is important to estimate the reductions during Demand Response hours, taking into account the heterogeneity of electricity users. In this paper, we incorporate latent variables representing behavioral archetypes of electricity users into the process of short term load forecasting with Machine Learning methods, thereby differentiating between varying levels of energy consumption. The latent variables are constructed by fitting Conditional Mixture Models of Linear Regressions and Hidden Markov Models on smart meter readings of a Residential Demand Response program in the western United States. We observe a notable increase in the accuracy of short term load forecasts compared to the case without latent variables. We then estimate the reductions during Demand Response events conditional on the latent variables, and discover a higher DR reduction among users with automated smart home devices compared to those without.

preprint, 2016, arXiv

Event Detection and Localization in Distribution Grids with Phasor Measurement Units

The recent introduction of synchrophasor technology into power distribution systems has given impetus to various monitoring, diagnostic, and control applications, such as system identification and event detection, which are crucial for restoring service, preventing outages, and managing equipment health. Drawing on the existing framework for inferring topology and admittances of a power network from voltage and current phasor measurements, this paper proposes an online algorithm for event detection and localization in unbalanced three-phase distribution systems. Using a convex relaxation and a matrix partitioning technique, the proposed algorithm is capable of identifying topology changes and attributing them to specific categories of events. The performance of this algorithm is evaluated on a standard test distribution feeder with synthesized loads, and it is shown that a tripped line can be detected and localized in an accurate and timely fashion, highlighting its potential for real-world applications.

preprint, 2016, arXiv

Minimizing Regret on Reflexive Banach Spaces and Learning Nash Equilibria in Continuous Zero-Sum Games

We study a general version of the adversarial online learning problem. We are given a decision set $\mathcal{X}$ in a reflexive Banach space $X$ and a sequence of reward vectors in the dual space of $X$. At each iteration, we choose an action from $\mathcal{X}$, based on the observed sequence of previous rewards. Our goal is to minimize regret, defined as the gap between the realized reward and the reward of the best fixed action in hindsight. Using results from infinite dimensional convex analysis, we generalize the method of Dual Averaging (or Follow the Regularized Leader) to our setting and obtain general upper bounds on the worst-case regret that subsume a wide range of results from the literature. Under the assumption of uniformly continuous rewards, we obtain explicit anytime regret bounds in a setting where the decision set is the set of probability distributions on a compact metric space $S$ whose Radon-Nikodym derivatives are elements of $L^p(S)$ for some $p > 1$. Importantly, we make no convexity assumptions on either the set $S$ or the reward functions. We also prove a general lower bound on the worst-case regret for any online algorithm. We then apply these results to the problem of learning in repeated continuous two-player zero-sum games, in which players' strategy sets are compact metric spaces. In doing so, we first prove that if both players play a Hannan-consistent strategy, then with probability 1 the empirical distributions of play weakly converge to the set of Nash equilibria of the game. We then show that, under mild assumptions, Dual Averaging on the (infinite-dimensional) space of probability distributions indeed achieves Hannan-consistency. Finally, we illustrate our results through numerical examples.
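As a finite-dimensional illustration of the regret notion and of Dual Averaging, the sketch below runs exponential weights (Hedge) over a finite action set. This is Dual Averaging with an entropic regularizer on the probability simplex, not the infinite-dimensional setting the paper treats. The function name `hedge`, the reward sequence, and the step size are assumptions made for this example, and regret is computed against the expected reward of the mixed strategy rather than a sampled realization.

```python
import math


def hedge(reward_rounds, eta):
    """Exponential weights on a finite action set (illustrative only).

    reward_rounds: list of reward vectors, one entry per action, each in [0, 1].
    eta: learning rate.
    Returns (expected cumulative reward, regret against the best fixed action).
    """
    n = len(reward_rounds[0])
    cumulative = [0.0] * n  # dual variable: running sum of observed rewards
    expected_total = 0.0
    for rewards in reward_rounds:
        # Map the dual variable to a distribution via a numerically stable softmax.
        shift = max(cumulative)
        weights = [math.exp(eta * (c - shift)) for c in cumulative]
        total_weight = sum(weights)
        probs = [w / total_weight for w in weights]
        # The play is chosen before the current reward vector is revealed.
        expected_total += sum(p * r for p, r in zip(probs, rewards))
        cumulative = [c + r for c, r in zip(cumulative, rewards)]
    best_fixed = max(cumulative)  # best single action in hindsight
    return expected_total, best_fixed - expected_total


if __name__ == "__main__":
    horizon, n_actions = 200, 3
    # Hypothetical deterministic reward sequence, for demonstration only.
    rounds = [[1.0 if t % (i + 2) == 0 else 0.0 for i in range(n_actions)]
              for t in range(horizon)]
    eta = math.sqrt(8 * math.log(n_actions) / horizon)
    total, regret = hedge(rounds, eta)
    print("regret:", regret)
    print("bound: ", math.sqrt(horizon * math.log(n_actions) / 2))
```

With rewards in [0, 1] and this choice of `eta`, the standard analysis of exponential weights bounds the regret by sqrt(T ln n / 2), which grows sublinearly in the horizon T.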

preprint, 2016, arXiv

Residential Demand Response Targeting Using Machine Learning with Observational Data

The large scale deployment of Advanced Metering Infrastructure among residential energy customers has served as a boon for energy systems research relying on granular consumption data. Residential Demand Response aims to utilize the flexibility of consumers to reduce their energy usage during times when the grid is strained. Suitable incentive mechanisms to encourage customers to deviate from their usual behavior have to be implemented to correctly control the bids into the wholesale electricity market as a Demand Response provider. In this paper, we present a framework for short term load forecasting on an individual user level, and relate nonexperimental estimates of Demand Response efficacy, i.e. the estimated reduction of consumption during Demand Response events, to the variability of user consumption. We apply our framework on a data set from a residential Demand Response program in the Western United States. Our results suggest that users with more variable consumption patterns are more likely to reduce their consumption compared to users with a more regular consumption behavior.

preprint, 2014, arXiv

Compressed Sensing for Network Reconstruction

The problem of identifying sparse solutions for the link structure and dynamics of an unknown linear, time-invariant network is posed as finding sparse solutions x to Ax=b. If the sensing matrix A satisfies a rank condition, this problem has a unique, sparse solution. Here each row of A comprises one experiment consisting of input/output measurements and cannot be freely chosen. We show that if experiments are poorly designed, the rank condition may never be satisfied, resulting in multiple solutions. We discuss experimental strategies for designing experiments such that the sensing matrix has the desired properties and the problem is therefore well posed. This formulation allows prior knowledge to be taken into account in the form of known nonzero entries of x, requiring fewer experiments to be performed. A number of simulated examples are given to illustrate the approach, which provides a useful strategy commensurate with the type of experiments and measurements available to biologists. We also confirm suggested limitations on the use of convex relaxations for the efficient solution of this problem.

preprint, 2014, arXiv

Practical Comparison of Optimization Algorithms for Learning-Based MPC with Linear Models

Learning-based control methods are an attractive approach for addressing performance and efficiency challenges in robotics and automation systems. One such technique that has found application in these domains is learning-based model predictive control (LBMPC). An important novelty of LBMPC lies in the fact that its robustness and stability properties are independent of the type of online learning used. This allows the use of advanced statistical or machine learning methods to provide the adaptation for the controller. This paper is concerned with providing practical comparisons of different optimization algorithms for implementing the LBMPC method, for the special case where the dynamic model of the system is linear and the online learning provides linear updates to the dynamic model. For comparison purposes, we have implemented a primal-dual infeasible start interior point method that exploits the sparsity structure of LBMPC. Our open source implementation (called LBmpcIPM) is available through a BSD license and is provided freely to enable the rapid implementation of LBMPC on other platforms. This solver is compared to the dense active set solvers LSSOL and qpOASES using a quadrotor helicopter platform. Two scenarios are considered: The first is a simulation comparing hovering control for the quadrotor, and the second is on-board control experiments of dynamic quadrotor flight. Though the LBmpcIPM method has better asymptotic computational complexity than LSSOL and qpOASES, we find that for certain integrated systems (like our quadrotor testbed) these methods can outperform LBmpcIPM. This suggests that actual benchmarks should be used when choosing which algorithm is used to implement LBMPC on practical systems.

preprint, 2012, arXiv

Energy-Efficient Building HVAC Control Using Hybrid System LBMPC

Improving the energy-efficiency of heating, ventilation, and air-conditioning (HVAC) systems has the potential to realize large economic and societal benefits. This paper concerns the system identification of a hybrid system model of a building-wide HVAC system and its subsequent control using a hybrid system formulation of learning-based model predictive control (LBMPC). Here, the learning refers to model updates to the hybrid system model that incorporate the heating effects due to occupancy, solar effects, outside air temperature (OAT), and equipment, in addition to integrator dynamics inherently present in low-level control. Though we make significant modeling simplifications, our corresponding controller that uses this model is able to experimentally achieve a large reduction in energy usage without any degradations in occupant comfort. It is in this way that we justify the modeling simplifications that we have made. We conclude by presenting results from experiments on our building HVAC testbed, which show an average of 1.5MWh of energy savings per day (p = 0.002) with a 95% confidence interval of 1.0MWh to 2.1MWh of energy savings.

preprint, 2012, arXiv

Incentive Design for Efficient Building Quality of Service

Buildings are a large consumer of energy, and reducing their energy usage may provide financial and societal benefits. One challenge in achieving efficient building operation is the fact that few financial motivations exist for encouraging low energy configuration and operation of buildings. As a result, incentive schemes for managers of large buildings are being proposed for the purpose of saving energy. This paper focuses on incentive design for the configuration and operation of building-wide heating, ventilation, and air-conditioning (HVAC) systems, because these systems constitute the largest portion of energy usage in most buildings. We begin with an empirical model of a building-wide HVAC system, which describes the tradeoffs between energy consumption, quality of service (as defined by occupant satisfaction), and the amount of work required for maintenance and configuration. The model has significant non-convexities, and so we derive some results regarding qualitative properties of non-convex optimization problems with certain partial-ordering features. These results are used to show that "baselining" incentive schemes suffer from moral hazard problems, and they also encourage energy reductions at the expense of also decreasing occupant satisfaction. We propose an alternative incentive scheme that has the interpretation of a performance-based bonus. A theoretical analysis shows that this encourages energy and monetary savings and modest gains in occupant satisfaction and quality of service, which is confirmed by our numerical simulations.

preprint2012arXiv

Provably Safe and Robust Learning-Based Model Predictive Control

Controller design faces a trade-off between robustness and performance, and the reliability of linear controllers has caused many practitioners to focus on the former. However, there is renewed interest in improving system performance to deal with growing energy constraints. This paper describes a learning-based model predictive control (LBMPC) scheme that provides deterministic guarantees on robustness, while statistical identification tools are used to identify richer models of the system in order to improve performance; the benefits of this framework are that it handles state and input constraints, optimizes system performance with respect to a cost function, and can be designed to use a wide variety of parametric or nonparametric statistical tools. The main insight of LBMPC is that safety and performance can be decoupled under reasonable conditions in an optimization framework by maintaining two models of the system. The first is an approximate model with bounds on its uncertainty, and the second model is updated by statistical methods. LBMPC improves performance by choosing inputs that minimize a cost subject to the learned dynamics, and it ensures safety and robustness by checking whether these same inputs keep the approximate model stable when it is subject to uncertainty. Furthermore, we show that if the system is sufficiently excited, then the LBMPC control action probabilistically converges to that of an MPC computed using the true dynamics.
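
As a rough illustration of the two-model idea in this abstract, the Python sketch below picks an input that minimizes cost under a learned model while checking safety against a nominal model with bounded uncertainty. It is a minimal sketch under stated assumptions, not the paper's formulation: the scalar dynamics, the one-step horizon, the grid search, and every name and number (`lbmpc_step`, `a`, `b`, `d_max`, `x_max`, `learned_offset`, `r`) are hypothetical.

```python
def lbmpc_step(x, target, a=0.9, b=0.5, d_max=0.2, x_max=1.0,
               learned_offset=0.15, r=0.01):
    """One-step toy version of the two-model idea (all values are assumptions).

    Nominal model: x_next = a*x + b*u + d, with |d| <= d_max (used for safety).
    Learned model: x_next = a*x + b*u + learned_offset (used for performance).
    """
    # Hypothetical candidate inputs on a coarse grid from -2.00 to 2.00.
    candidates = [i / 100.0 for i in range(-200, 201)]
    best_u, best_cost = None, float("inf")
    for u in candidates:
        nominal_next = a * x + b * u
        # Safety: the nominal model must satisfy |x| <= x_max even under
        # the worst-case disturbance of magnitude d_max.
        if abs(nominal_next) + d_max > x_max:
            continue
        # Performance: the cost is evaluated on the learned model only.
        learned_next = nominal_next + learned_offset
        cost = (learned_next - target) ** 2 + r * u ** 2
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u  # None if no candidate input passes the safety check


if __name__ == "__main__":
    # The target lies outside the safe set, so the safety check limits the input.
    print(lbmpc_step(0.0, 2.0))
```

In this toy setup the learned offset changes only which safe input looks best; it never changes which inputs are considered safe, which is the decoupling the abstract describes.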

preprint2012arXiv

Quantitative Methods for Comparing Different HVAC Control Schemes

Experimentally comparing the energy usage and comfort characteristics of different controllers in heating, ventilation, and air-conditioning (HVAC) systems is difficult because variations in weather and occupancy conditions make it impossible to establish equivalent experimental conditions over time scales of hours, days, and weeks. This paper defines quantitative metrics of energy usage and occupant comfort that can be computed and compared rigorously, making it possible to determine whether differences between controllers are statistically significant in the presence of such environmental fluctuations. Experimental case studies are presented that compare two alternative controllers (a schedule controller and a hybrid system learning-based model predictive controller) to the default controller in a building-wide HVAC system. Lastly, we discuss how our proposed methodology may also be able to quantify the efficiency of other building automation systems.

preprint2012arXiv

Statistical Results on Filtering and Epi-convergence for Learning-Based Model Predictive Control

Learning-based model predictive control (LBMPC) is a technique that provides deterministic guarantees on robustness, while statistical identification tools are used to identify richer models of the system in order to improve performance. This technical note provides proofs that elucidate the reasons for our choice of measurement model, as well as proofs concerning the stochastic convergence of LBMPC. The first part of this note discusses simultaneous state estimation and statistical identification (or learning) of unmodeled dynamics, for dynamical systems that can be described by ordinary differential equations (ODEs). The second part provides proofs concerning the epi-convergence of different statistical estimators that can be used with LBMPC. In particular, we prove results on the statistical properties of a nonparametric estimator that we have designed to have the correct deterministic and stochastic properties for numerical implementation when used in conjunction with LBMPC.

preprint2011arXiv

Regression on manifolds: Estimation of the exterior derivative

Collinearity and near-collinearity of predictors cause difficulties in regression. In these cases, variable selection becomes untenable because of mathematical issues concerning the existence and numerical stability of the regression coefficients, and interpretation of the coefficients is ambiguous because gradients are not defined. Using a differential geometric interpretation, in which the regression coefficients are interpreted as estimates of the exterior derivative of a function, we develop a new method for regression in the presence of collinearities. Our regularization scheme can improve estimation error, and it can be easily modified to include lasso-type regularization. These estimators also have simple extensions to the "large $p$, small $n$" context.
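
The numerical-stability issue mentioned in this abstract can be seen in a small Python sketch. It computes the condition number of the 2x2 Gram matrix for two nearly collinear predictors, with and without a ridge penalty. The ridge term here is a generic stand-in chosen for illustration; it is not the exterior-derivative estimator the paper proposes, and the data and the function name `gram_condition_number` are assumptions.

```python
import math


def gram_condition_number(x1, x2, ridge=0.0):
    """Condition number of the 2x2 matrix X^T X + ridge * I.

    x1 and x2 are the two predictor columns of X (hypothetical data).
    """
    a = sum(v * v for v in x1) + ridge        # entry (1, 1)
    d = sum(v * v for v in x2) + ridge        # entry (2, 2)
    b = sum(u * v for u, v in zip(x1, x2))    # off-diagonal entry
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    lam_max = mean + radius
    # Use det / lam_max rather than mean - radius to limit cancellation.
    lam_min = (a * d - b * b) / lam_max
    return lam_max / lam_min


if __name__ == "__main__":
    x1 = [1.0, 2.0, 3.0, 4.0]
    x2 = [1.01, 1.99, 3.02, 3.98]  # nearly a copy of x1
    print(gram_condition_number(x1, x2))             # very large
    print(gram_condition_number(x1, x2, ridge=1.0))  # far smaller
```

A large condition number means that small changes in the data produce large changes in the least-squares coefficients, which is why some form of regularization is needed when predictors are nearly collinear.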

28 published item(s)

Competency-Aware Planning for Probabilistically Safe Navigation Under Perception Uncertainty

Explaining Low Perception Model Competency with High-Competency Counterfactuals

PaRCE: Probabilistic and Reconstruction-based Competency Estimation for CNN-based Image Classification

Cost Inference for Feedback Dynamic Games from Noisy Partial State Observations and Incomplete Trajectories

Inducing Structure in Reward Learning by Learning Features

Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control

Feature Expansive Reward Learning: Rethinking Human Input

Visual Navigation Among Humans with Optimal Control as a Supervisor

A Successive-Elimination Approach to Adaptive Robotic Sensing

Customized Local Differential Privacy for Multi-Agent Distributed Optimization

Eyes-Closed Safety Kernels: Safety for Autonomous Systems Under Loss of Observability

Generating Robust Supervision for Learning-Based Visual Navigation Using Hamilton-Jacobi Reachability

An Insect-scale Self-sufficient Rolling Microrobot

Design of the First Insect-scale Spinning-wing Robot

Design of the first sub-milligram flapping wing aerial vehicle

On Identification of Distribution Grids

A Bayesian Perspective on Residential Demand Response Using Smart Meter Data

Event Detection and Localization in Distribution Grids with Phasor Measurement Units

Minimizing Regret on Reflexive Banach Spaces and Learning Nash Equilibria in Continuous Zero-Sum Games

Residential Demand Response Targeting Using Machine Learning with Observational Data

Compressed Sensing for Network Reconstruction

Practical Comparison of Optimization Algorithms for Learning-Based MPC with Linear Models

Energy-Efficient Building HVAC Control Using Hybrid System LBMPC

Incentive Design for Efficient Building Quality of Service

Provably Safe and Robust Learning-Based Model Predictive Control

Quantitative Methods for Comparing Different HVAC Control Schemes

Statistical Results on Filtering and Epi-convergence for Learning-Based Model Predictive Control

Regression on manifolds: Estimation of the exterior derivative