Source author record

S. Shankar Sastry

S. Shankar Sastry appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Catalog footprint

What is connected

47 works

24 topics

4 close collaborators

preprint, 2022, arXiv

DEC-LOS-RRT: Decentralized Path Planning for Multi-robot Systems with Line-of-sight Constrained Communication

Decentralized planning for multi-agent systems, such as fleets of robots in a search-and-rescue operation, is often constrained by limitations on how agents can communicate with each other. One such limitation is the case when agents can communicate with each other only when they are in line-of-sight (LOS). Developing decentralized planning methods that guarantee safety is difficult in this case, as agents that are occluded from each other might not be able to communicate until it's too late to avoid a safety violation. In this paper, we develop a decentralized planning method that explicitly avoids situations where lack of visibility of other agents would lead to an unsafe situation. Building on top of an existing Rapidly-exploring Random Tree (RRT)-based approach, our method guarantees safety at each iteration. Simulation studies show the effectiveness of our method and compare the degradation in performance with respect to a clairvoyant decentralized planning algorithm where agents can communicate despite not being in LOS of each other.

preprint, 2022, arXiv

Who Leads and Who Follows in Strategic Classification?

As predictive models are deployed into the real world, they must increasingly contend with strategic behavior. A growing body of work on strategic classification treats this problem as a Stackelberg game: the decision-maker "leads" in the game by deploying a model, and the strategic agents "follow" by playing their best response to the deployed model. Importantly, in this framing, the burden of learning is placed solely on the decision-maker, while the agents' best responses are implicitly treated as instantaneous. In this work, we argue that the order of play in strategic classification is fundamentally determined by the relative frequencies at which the decision-maker and the agents adapt to each other's actions. In particular, by generalizing the standard model to allow both players to learn over time, we show that a decision-maker that makes updates faster than the agents can reverse the order of play, meaning that the agents lead and the decision-maker follows. We observe in standard learning settings that such a role reversal can be desirable for both the decision-maker and the strategic agents. Finally, we show that a decision-maker with the freedom to choose their update frequency can induce learning dynamics that converge to Stackelberg equilibria with either order of play.

preprint, 2021, arXiv

Maximum Likelihood Constraint Inference from Stochastic Demonstrations

When an expert operates a perilous dynamic system, ideal constraint information is tacitly contained in their demonstrated trajectories and controls. The likelihood of these demonstrations can be computed, given the system dynamics and task objective, and the maximum likelihood constraints can be identified. Prior constraint inference work has focused mainly on deterministic models. Stochastic models, however, can capture the uncertainty and risk tolerance that are often present in real systems of interest. This paper extends maximum likelihood constraint inference to stochastic applications by using maximum causal entropy likelihoods. Furthermore, we propose an efficient algorithm that computes constraint likelihood and risk tolerance in a unified Bellman backup, allowing us to generalize to stochastic systems without increasing computational complexity.

preprint, 2020, arXiv

Exponentially Stable First Order Control on Matrix Lie Groups

We present a novel first order controller for systems evolving on matrix Lie groups, a major use case of which is Cartesian velocity control on robot manipulators. This controller achieves global exponential trajectory tracking on a number of commonly used Lie groups including the Special Orthogonal Group SO(n), the Special Euclidean Group SE(n), and the General Linear Group over complex numbers GL(n, C). Additionally, this controller achieves local exponential trajectory tracking on all matrix Lie groups. We demonstrate the effectiveness of this controller in simulation on a number of different Lie groups as well as on hardware with a 7-DOF Sawyer robot arm.

preprint, 2020, arXiv

Feedback Linearization for Unknown Systems via Reinforcement Learning

We present a novel approach to control design for nonlinear systems which leverages model-free policy optimization techniques to learn a linearizing controller for a physical plant with unknown dynamics. Feedback linearization is a technique from nonlinear control which renders the input-output dynamics of a nonlinear plant linear under application of an appropriate feedback controller. Once a linearizing controller has been constructed, desired output trajectories for the nonlinear plant can be tracked using a variety of linear control techniques. However, the calculation of a linearizing controller requires a precise dynamics model for the system. As a result, model-based approaches for learning exact linearizing controllers generally require a simple, highly structured model of the system with easily identifiable parameters. In contrast, the model-free approach presented in this paper is able to approximate the linearizing controller for the plant using general function approximation architectures. Specifically, we formulate a continuous-time optimization problem over the parameters of a learned linearizing controller whose optima are the set of parameters which best linearize the plant. We derive conditions under which the learning problem is (strongly) convex and provide guarantees which ensure the true linearizing controller for the plant is recovered. We then discuss how model-free policy optimization algorithms can be used to solve a discrete-time approximation to the problem using data collected from the real-world plant. The utility of the framework is demonstrated in simulation and on a real-world robotic platform.
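To make the notion of a linearizing controller concrete, the following is a minimal sketch for a pendulum whose model is known. It is not the paper's method: the paper learns such a controller when the dynamics are unknown. The pendulum parameters, gains and all function names here are assumptions chosen for illustration.

```python
import math

# Assumed pendulum parameters (hypothetical, for illustration only).
G, L, M = 9.81, 1.0, 1.0


def drift(theta):
    """Drift term f(x) in theta'' = f(x) + g(x) * u."""
    return -(G / L) * math.sin(theta)


def input_gain():
    """Input gain g(x); constant for this pendulum."""
    return 1.0 / (M * L ** 2)


def linearizing_control(theta, v):
    """Choose u so that theta'' equals the virtual input v."""
    return (v - drift(theta)) / input_gain()


def simulate(theta0, omega0, steps=5000, dt=0.001, k1=4.0, k2=4.0):
    """Forward-Euler simulation of the closed loop with a linear outer law."""
    theta, omega = theta0, omega0
    for _ in range(steps):
        v = -k1 * theta - k2 * omega           # linear control on the linearized plant
        u = linearizing_control(theta, v)      # cancels the nonlinearity
        alpha = drift(theta) + input_gain() * u  # true acceleration, equal to v
        theta += dt * omega
        omega += dt * alpha
    return theta, omega
```

With the nonlinearity cancelled, the closed loop behaves like the linear system theta'' = -k1 * theta - k2 * theta', so the state decays toward the origin for positive gains.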

preprint, 2020, arXiv

Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning

The main drawbacks of input-output linearizing controllers are the need for precise dynamics models and not being able to account for input constraints. Model uncertainty is common in almost every robotic application and input saturation is present in every real world system. In this paper, we address both challenges for the specific case of bipedal robot control by the use of reinforcement learning techniques. Taking the structure of a standard input-output linearizing controller, we use an additive learned term that compensates for model uncertainty. Moreover, by adding constraints to the learning problem we manage to boost the performance of the final controller when input limits are present. We demonstrate the effectiveness of the designed framework for different levels of uncertainty on the five-link planar walking robot RABBIT.

preprint, 2020, arXiv

LESS is More: Rethinking Probabilistic Models of Human Behavior

Robots need models of human behavior for both inferring human goals and preferences, and predicting what people will do. A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories in proportion to their exponentiated reward. While this model has been successful in a variety of robotics domains, its roots lie in econometrics, and in modeling decisions among different discrete options, each with its own utility or reward. In contrast, human trajectories lie in a continuous space, with continuous-valued features that influence the reward function. We propose that it is time to rethink the Boltzmann model, and design it from the ground up to operate over such trajectory spaces. We introduce a model that explicitly accounts for distances between trajectories, rather than only their rewards. Rather than each trajectory affecting the decision independently, similar trajectories now affect the decision together. We start by showing that our model better explains human behavior in a user study. We then analyze the implications this has for robot inference, first in toy environments where we have ground truth and find more accurate inference, and finally for a 7DOF robot arm learning from user demonstrations.
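As an illustration of the baseline the abstract describes, here is a minimal sketch of the Boltzmann noisily-rational choice rule over a finite set of options. The function name and the `beta` rationality coefficient are assumptions for this sketch, and the paper's proposed distance-aware model is not implemented here.

```python
import math


def boltzmann_probs(rewards, beta=1.0):
    """Probability of each option, proportional to exp(beta * reward)."""
    scaled = [beta * r for r in rewards]
    shift = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Two distinct options with equal reward split the probability evenly.
two_options = boltzmann_probs([1.0, 1.0])

# If the first option is listed twice (two near-identical trajectories),
# each copy counts independently, so that behavior gains probability mass.
with_duplicate = boltzmann_probs([1.0, 1.0, 1.0])
mass_on_first_behavior = with_duplicate[0] + with_duplicate[1]
```

In this sketch the duplicated behavior ends up with two thirds of the probability instead of one half, which is the kind of effect that motivates treating similar trajectories together rather than independently.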

preprint, 2020, arXiv

Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning

While most approaches to the problem of Inverse Reinforcement Learning (IRL) focus on estimating a reward function that best explains an expert agent's policy or demonstrated behavior on a control task, it is often the case that such behavior is more succinctly represented by a simple reward combined with a set of hard constraints. In this setting, the agent is attempting to maximize cumulative rewards subject to these given constraints on their behavior. We reformulate the problem of IRL on Markov Decision Processes (MDPs) such that, given a nominal model of the environment and a nominal reward function, we seek to estimate state, action, and feature constraints in the environment that motivate an agent's behavior. Our approach is based on the Maximum Entropy IRL framework, which allows us to reason about the likelihood of an expert agent's demonstrations given our knowledge of an MDP. Using our method, we can infer which constraints can be added to the MDP to most increase the likelihood of observing these demonstrations. We present an algorithm which iteratively infers the Maximum Likelihood Constraint to best explain observed behavior, and we evaluate its efficacy using both simulated behavior and recorded data of humans navigating around an obstacle.

preprint, 2020, arXiv

On Gradient-Based Learning in Continuous Games

We formulate a general framework for competitive gradient-based learning that encompasses a wide breadth of multi-agent learning algorithms, and analyze the limiting behavior of competitive gradient-based learning algorithms using dynamical systems theory. For both general-sum and potential games, we characterize a non-negligible subset of the local Nash equilibria that will be avoided if each agent employs a gradient-based learning algorithm. We also shed light on the issue of convergence to non-Nash strategies in general- and zero-sum games, which may have no relevance to the underlying game, and arise solely due to the choice of algorithm. The existence and frequency of such strategies may explain some of the difficulties encountered when using gradient descent in zero-sum games as, e.g., in the training of generative adversarial networks. To reinforce the theoretical contributions, we provide empirical results that highlight the frequency of linear quadratic dynamic games (a benchmark for multi-agent reinforcement learning) that admit global Nash equilibria that are almost surely avoided by policy gradient.
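The difficulty of gradient descent in zero-sum games can be seen in a standard textbook example, sketched below under stated assumptions. The game f(x, y) = x * y, the step size and the function name are illustrative choices and are not taken from the paper.

```python
def simultaneous_gradient_play(x, y, lr=0.1, steps=100):
    """Simultaneous gradient updates on the zero-sum game f(x, y) = x * y.

    Player 1 minimizes f over x and player 2 maximizes f over y.
    The unique Nash equilibrium of this game is (0, 0).
    """
    for _ in range(steps):
        grad_x = y  # df/dx
        grad_y = x  # df/dy
        # Both players update at once, each using the other's previous action.
        x, y = x - lr * grad_x, y + lr * grad_y
    return x, y


x_final, y_final = simultaneous_gradient_play(1.0, 0.0)
squared_distance = x_final ** 2 + y_final ** 2
```

Each update multiplies the squared distance from the equilibrium by (1 + lr^2), so the iterates spiral away from (0, 0) instead of converging to it.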

preprint, 2020, arXiv

Technical Report: Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning

This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules. The primary advantage of the scheme over standard model-reference adaptive control techniques is that it does not require the learned inverse model to be invertible at all instances of time. This enables the use of general function approximators to approximate the linearizing controller for the system without having to worry about singularities. However, the discrete-time and stochastic nature of these algorithms precludes the direct application of standard machinery from the adaptive control literature to provide deterministic stability proofs for the system. Nevertheless, we leverage these techniques alongside tools from the stochastic approximation literature to demonstrate that with high probability the tracking and parameter errors concentrate near zero when a certain persistence of excitation condition is satisfied. A simulated example of a double pendulum demonstrates the utility of the proposed theory.

preprint, 2020, arXiv

Towards Verified Artificial Intelligence

Verified artificial intelligence (AI) is the goal of designing AI-based systems that have strong, ideally provable, assurances of correctness with respect to mathematically-specified requirements. This paper considers Verified AI from a formal methods perspective. We describe five challenges for achieving Verified AI, and five corresponding principles for addressing these challenges.

preprint, 2016, arXiv

Diagnosis and Repair for Synthesis from Signal Temporal Logic Specifications

We address the problem of diagnosing and repairing specifications for hybrid systems formalized in signal temporal logic (STL). Our focus is on the setting of automatic synthesis of controllers in a model predictive control (MPC) framework. We build on recent approaches that reduce the controller synthesis problem to solving one or more mixed integer linear programs (MILPs), where infeasibility of a MILP usually indicates unrealizability of the controller synthesis problem. Given an infeasible STL synthesis problem, we present algorithms that provide feedback on the reasons for unrealizability, and suggestions for making it realizable. Our algorithms are sound and complete, i.e., they provide a correct diagnosis, and always terminate with a non-trivial specification that is feasible using the chosen synthesis method, when such a solution exists. We demonstrate the effectiveness of our approach on the synthesis of controllers for various cyber-physical systems, including an autonomous driving application and an aircraft electric power system.

preprint, 2016, arXiv

Differential Privacy of Populations in Routing Games

As our ground transportation infrastructure modernizes, the large amount of data being measured, transmitted, and stored motivates an analysis of the privacy aspect of these emerging cyber-physical technologies. In this paper, we consider privacy in the routing game, where the origins and destinations of drivers are considered private. This is motivated by the fact that this spatiotemporal information can easily be used as the basis for inferences about a person's activities. More specifically, we consider the differential privacy of the mapping from the amount of flow for each origin-destination pair to the traffic flow measurements on each link of a traffic network. We use a stochastic online learning framework for the population dynamics, which is known to converge to the Nash equilibrium of the routing game. We analyze the sensitivity of this process and provide theoretical guarantees on the convergence rates as well as differential privacy values for these models. We confirm these with simulations on a small example.

preprint, 2016, arXiv

Dissimilarity-based Sparse Subset Selection

Finding an informative subset of a large collection of data points or models is at the center of many problems in computer vision, recommender systems, bio/health informatics as well as image and natural language processing. Given pairwise dissimilarities between the elements of a 'source set' and a 'target set,' we consider the problem of finding a subset of the source set, called representatives or exemplars, that can efficiently describe the target set. We formulate the problem as a row-sparsity regularized trace minimization problem. Since the proposed formulation is, in general, NP-hard, we consider a convex relaxation. The solution of our optimization finds representatives and the assignment of each element of the target set to each representative, hence, obtaining a clustering. We analyze the solution of our proposed optimization as a function of the regularization parameter. We show that when the two sets jointly partition into multiple groups, our algorithm finds representatives from all groups and reveals clustering of the sets. In addition, we show that the proposed framework can effectively deal with outliers. Our algorithm works with arbitrary dissimilarities, which can be asymmetric or violate the triangle inequality. To efficiently implement our algorithm, we consider an Alternating Direction Method of Multipliers (ADMM) framework, which results in quadratic complexity in the problem size. We show that the ADMM implementation allows the algorithm to be parallelized, hence further reducing the computational time. Finally, by experiments on real-world datasets, we show that our proposed algorithm improves the state of the art on the two problems of scene categorization using representative images and time-series modeling and segmentation using representative models.

preprint, 2016, arXiv

Privacy-Enhanced Architecture for Occupancy-based HVAC Control

Large-scale sensing and actuation infrastructures have allowed buildings to achieve significant energy savings; at the same time, these technologies introduce significant privacy risks that must be addressed. In this paper, we present a framework for modeling the trade-off between improved control performance and increased privacy risks due to occupancy sensing. More specifically, we consider occupancy-based HVAC control as the control objective and the location traces of individual occupants as the private variables. Previous studies have shown that individual location information can be inferred from occupancy measurements. To ensure privacy, we design an architecture that distorts the occupancy data in order to hide individual occupant location information while maintaining HVAC performance. Using mutual information between the individual's location trace and the reported occupancy measurement as a privacy metric, we are able to optimally design a scheme to minimize privacy risk subject to a control performance guarantee. We evaluate our framework using real-world occupancy data: first, we verify that our privacy metric accurately assesses the adversary's ability to infer private variables from the distorted sensor measurements; then, we show that control performance is maintained through simulations of building operations using these distorted occupancy readings.
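The privacy metric named in the abstract, mutual information between a private variable and a reported measurement, can be sketched for small discrete distributions as follows. The function name and the toy joint distributions are assumptions for illustration and do not come from the paper's data or architecture.

```python
import math


def mutual_information(joint):
    """Mutual information in bits of a discrete joint distribution.

    joint[i][j] is the probability that the private variable takes value i
    and the reported measurement takes value j.
    """
    p_private = [sum(row) for row in joint]
    p_reported = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (p_private[i] * p_reported[j]))
    return mi


# The report reveals the private variable exactly: 1 bit leaked.
fully_revealing = mutual_information([[0.5, 0.0], [0.0, 0.5]])

# The report is independent of the private variable: nothing leaked.
fully_distorted = mutual_information([[0.25, 0.25], [0.25, 0.25]])
```

A distortion scheme in this spirit would aim to push the value toward the independent case while keeping the reported measurement useful for control.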

preprint, 2015, arXiv

Approximation Algorithms for Optimization of Combinatorial Dynamical Systems

This paper considers an optimization problem for a dynamical system whose evolution depends on a collection of binary decision variables. We develop scalable approximation algorithms with provable suboptimality bounds to provide computationally tractable solution methods even when the dimension of the system and the number of the binary variables are large. The proposed method employs a linear approximation of the objective function such that the approximate problem is defined over the feasible space of the binary decision variables, which is a discrete set. To define such a linear approximation, we propose two different variation methods: one uses continuous relaxation of the discrete space and the other uses convex combinations of the vector field and running payoff. The approximate problem is a 0-1 linear program, which can be solved by existing polynomial-time exact or approximation algorithms, and does not require the solution of the dynamical system. Furthermore, we characterize a sufficient condition ensuring the approximate solution has a provable suboptimality bound. We show that this condition can be interpreted as the concavity of the objective function. The performance and utility of the proposed algorithms are demonstrated with the ON/OFF control problems of interdependent refrigeration systems.

preprint, 2015, arXiv

Event-Selected Vector Field Discontinuities Yield Piecewise-Differentiable Flows

We study a class of discontinuous vector fields brought to our attention by multi-legged animal locomotion. Such vector fields arise not only in biomechanics, but also in robotics, neuroscience, and electrical engineering, to name a few domains of application. Under the conditions that (i) the vector field's discontinuities are locally confined to a finite number of smooth submanifolds and (ii) the vector field is transverse to these surfaces in an appropriate sense, we show that the vector field yields a well-defined flow that is Lipschitz continuous and piecewise-differentiable. This implies that although the flow is not classically differentiable, nevertheless it admits a first-order approximation (known as a Bouligand derivative) that is piecewise-linear and continuous at every point. We exploit this first-order approximation to infer existence of piecewise-differentiable impact maps (including Poincaré maps for periodic orbits), show the flow is locally conjugate (via a piecewise-differentiable homeomorphism) to a flowbox, and assess the effect of perturbations (both infinitesimal and non-infinitesimal) on the flow. We use these results to give a sufficient condition for the exponential stability of a periodic orbit passing through a point of multiply intersecting events, and apply the theory in illustrative examples to demonstrate synchronization in abstract first- and second-order phase oscillator models.

preprint, 2015, arXiv

Model Reduction Near Periodic Orbits of Hybrid Dynamical Systems

We show that, near periodic orbits, a class of hybrid models can be reduced to or approximated by smooth continuous-time dynamical systems. Specifically, near an exponentially stable periodic orbit undergoing isolated transitions in a hybrid dynamical system, nearby executions generically contract superexponentially to a constant-dimensional subsystem. Under a non-degeneracy condition on the rank deficiency of the associated Poincaré map, the contraction occurs in finite time regardless of the stability properties of the orbit. Hybrid transitions may be removed from the resulting subsystem via a topological quotient that admits a smooth structure to yield an equivalent smooth dynamical system. We demonstrate reduction of a high-dimensional underactuated mechanical model for terrestrial locomotion, assess structural stability of deadbeat controllers for rhythmic locomotion and manipulation, and derive a normal form for the stability basin of a hybrid oscillator. These applications illustrate the utility of our theoretical results for synthesis and analysis of feedback control laws for rhythmic hybrid behavior.

preprint, 2015, arXiv

Quantifying the Utility-Privacy Tradeoff in the Smart Grid

The modernization of the electrical grid and the installation of smart meters come with many advantages to control and monitoring. However, in the wrong hands, the data might pose a privacy threat. In this paper, we consider the tradeoff between smart grid operations and the privacy of consumers. We analyze the tradeoff between smart grid operations and how often data is collected by considering a realistic direct-load control example using thermostatically controlled loads, and we give simulation results to show how its performance degrades as the sampling frequency decreases. Additionally, we introduce a new privacy metric, which we call inferential privacy. This privacy metric assumes a strong adversary model, and provides an upper bound on the adversary's ability to infer a private parameter, independent of the algorithm he uses. Combining these two results allows us to directly consider the tradeoff between better load control and consumer privacy.

preprint, 2014, arXiv

A Learning Based Approach to Control Synthesis of Markov Decision Processes for Linear Temporal Logic Specifications

We propose to synthesize a control policy for a Markov decision process (MDP) such that the resulting traces of the MDP satisfy a linear temporal logic (LTL) property. We construct a product MDP that incorporates a deterministic Rabin automaton generated from the desired LTL property. The reward function of the product MDP is defined from the acceptance condition of the Rabin automaton. This construction allows us to apply techniques from learning theory to the problem of synthesis for LTL specifications even when the transition probabilities are not known a priori. We prove that our method is guaranteed to find a controller that satisfies the LTL property with probability one if such a policy exists, and we suggest empirically with a case study in traffic control that our method produces reasonable control strategies even when the LTL property cannot be satisfied with probability one.

preprint, 2014, arXiv

Effects of Risk on Privacy Contracts for Demand-Side Management

As smart meters continue to be deployed around the world collecting unprecedented levels of fine-grained data about consumers, we need to find mechanisms that are fair to both (1) the electric utility, which needs the data to improve its operations, and (2) the consumer, who has a valuation of privacy but at the same time benefits from sharing consumption data. In this paper we address this problem by proposing privacy contracts between electric utilities and consumers with the goal of maximizing the social welfare of both. Our mathematical model designs an optimization problem between a population of users that have different valuations on privacy and the costs of operation by the utility. We then show how contracts can change depending on the probability of a privacy breach. This line of research can help inform not only current but also future smart meter collection practices.

preprint, 2014, arXiv

Experimental Design for Human-in-the-Loop Driving Simulations

This report describes a new experimental setup for human-in-the-loop simulations. A force feedback simulator with four-axis motion has been set up for real-time driving experiments. The simulator will move to simulate the forces a driver feels while driving, which allows for a realistic experience for the driver. This setup allows for flexibility and control for the researcher in a realistic simulation environment. Experiments concerning driver distraction can also be carried out safely in this test bed, in addition to multi-agent experiments. All necessary code to run the simulator, the additional sensors, and the basic processing is available for use.

preprint, 2014, arXiv

Incentive Design and Utility Learning via Energy Disaggregation

The utility company has many motivations for modifying energy consumption patterns of consumers such as revenue decoupling and demand response programs. We model the utility company–consumer interaction as a principal–agent problem. We present an iterative algorithm for designing incentives while estimating the consumer's utility function. Incentives are designed using the aggregated as well as the disaggregated (device level) consumption data. We simulate the iterative control (incentive design) and estimation (utility learning and disaggregation) process for examples including the design of incentives based on the aggregate consumption data as well as the disaggregated consumption data.

preprint, 2014, arXiv

Metrization and Simulation of Controlled Hybrid Systems

The study of controlled hybrid systems requires practical tools for approximation and comparison of system behaviors. Existing approaches to these problems impose undue restrictions on the system's continuous and discrete dynamics. Metrization and simulation of controlled hybrid systems is considered here in a unified framework by constructing a state space metric. The metric is applied to develop a numerical simulation algorithm that converges uniformly, with a known rate of convergence, to orbitally stable executions of controlled hybrid systems, up to and including Zeno events. Benchmark hybrid phenomena illustrate the utility of the proposed tools.

preprint, 2014, arXiv

On the Characterization of Local Nash Equilibria in Continuous Games

We present a unified framework for characterizing local Nash equilibria in continuous games on either infinite-dimensional or finite-dimensional non-convex strategy spaces. We provide intrinsic necessary and sufficient first- and second-order conditions ensuring strategies constitute local Nash equilibria. We term points satisfying the sufficient conditions differential Nash equilibria. Further, we provide a sufficient condition (non-degeneracy) guaranteeing differential Nash equilibria are isolated and show that such equilibria are structurally stable. We present tutorial examples to illustrate our results and highlight degeneracies that can arise in continuous games.

preprint, 2014, arXiv

Privacy and Customer Segmentation in the Smart Grid

In the electricity grid, networked sensors which record and transmit increasingly high-granularity data are being deployed. In such a setting, privacy concerns are a natural consideration. We present an attack model for privacy breaches, and, using results from estimation theory, derive theoretical results ensuring that an adversary will fail to infer private information with a certain probability, independent of the algorithm used. We show utility companies would benefit from less noisy, higher frequency data, as it would improve various smart grid operations such as load prediction. We provide a method to quantify how smart grid operations improve as a function of higher frequency data. In order to obtain the consumer's valuation of privacy, we design a screening mechanism consisting of a menu of contracts to the energy consumer with varying guarantees of privacy. The screening process is a means to segment customers. Finally, we design insurance contracts using the probability of a privacy breach to be offered by third-party insurance companies.

preprint, 2014, arXiv

Rapid Integration and Calibration of New Sensors Using the Berkeley Aachen Robotics Toolkit (BART)

After the three DARPA Grand Challenge contests many groups around the world have continued to actively research and work toward an autonomous vehicle capable of accomplishing a mission in a given context (e.g. desert, city) while following a set of prescribed rules, but none has been completely successful in uncontrolled environments, a task that many people trivially fulfill every day. We believe that, together with improving the sensors used in cars and the artificial intelligence algorithms used to process the information, the community should focus on the systems engineering aspects of the problem, i.e. the limitations of the car (in terms of space, power, or heat dissipation) and the limitations of the software development cycle. This paper explores these issues and our experiences overcoming them.

preprint, 2014, arXiv

Reach-Avoid Problems with Time-Varying Dynamics, Targets and Constraints

We consider a reach-avoid differential game, in which one of the players aims to steer the system into a target set without violating a set of state constraints, while the other player tries to prevent the first from succeeding; the system dynamics, target set, and state constraints may all be time-varying. The analysis of this problem plays an important role in collision avoidance, motion planning and aircraft control, among other applications. Previous methods for computing the guaranteed winning initial conditions and strategies for each player have either required augmenting the state vector to include time, or have been limited to problems with either no state constraints or entirely static targets, constraints and dynamics. To incorporate time-varying dynamics, targets and constraints without the need for state augmentation, we propose a modified Hamilton-Jacobi-Isaacs equation in the form of a double-obstacle variational inequality, and prove that the zero sublevel set of its viscosity solution characterizes the capture basin for the target under the state constraints. Through this formulation, our method can compute the capture basin and winning strategies for time-varying games at no additional computational cost with respect to the time-invariant case. We provide an implementation of this method based on well-known numerical schemes and show its convergence through a simple example; we include a second example in which our method substantially outperforms the state augmentation approach.

preprint2014arXiv

Social Game for Building Energy Efficiency: Utility Learning, Simulation, and Analysis

We describe a social game that we designed for encouraging energy efficient behavior amongst building occupants with the aim of reducing overall energy consumption in the building. Occupants vote for their desired lighting level and win points which are used in a lottery based on how far their vote is from the maximum setting. We assume that the occupants are utility maximizers and that their utility functions capture the tradeoff between winning points and their comfort level. We model the occupants as non-cooperative agents in a continuous game and we characterize their play using the Nash equilibrium concept. Using occupant voting data, we parameterize their utility functions and use a convex optimization problem to estimate the parameters. We simulate the game defined by the estimated utility functions and show that the estimated model for occupant behavior is a good predictor of their actual behavior. In addition, we show that due to the social game, there is a significant reduction in energy consumption.

preprint2014arXiv

Sparse Illumination Learning and Transfer for Single-Sample Face Recognition with Image Corruption and Misalignment

Single-sample face recognition is one of the most challenging problems in face recognition. We propose a novel algorithm to address this problem based on a sparse-representation-based classification (SRC) framework. The new algorithm is robust to image misalignment and pixel corruption, and is able to reduce required gallery images to one sample per class. To compensate for the missing illumination information traditionally provided by multiple gallery images, a sparse illumination learning and transfer (SILT) technique is introduced. The illumination in SILT is learned by fitting illumination examples of auxiliary face images from one or more additional subjects with a sparsely-used illumination dictionary. By enforcing a sparse representation of the query image in the illumination dictionary, SILT can effectively recover and transfer the illumination and pose information from the alignment stage to the recognition stage. Our extensive experiments have demonstrated that the new algorithms significantly outperform the state of the art in the single-sample regime and with fewer restrictions. In particular, the single-sample face alignment accuracy is comparable to that of the well-known Deformable SRC algorithm using multiple gallery images per class. Furthermore, the face recognition accuracy exceeds those of the SRC and Extended SRC algorithms using hand-labeled alignment initialization.

preprint2013arXiv

Blind Identification of ARX Models with Piecewise Constant Inputs

Blind system identification is known to be a hard, ill-posed problem; without further assumptions, no unique solution is at hand. In this contribution, we are concerned with the task of identifying an ARX model from only output measurements. Driven by the task of identifying systems that are turned on and off at unknown times, we seek a piecewise constant input and a corresponding ARX model which approximates the measured outputs. We phrase this as a rank minimization problem and present a relaxed convex formulation to approximate its solution. The proposed method was developed to model power consumption of electrical appliances and is now a part of a bigger energy disaggregation framework. Code will be made available online.

preprint2013arXiv

Compressive Shift Retrieval

The classical shift retrieval problem considers two signals in vector form that are related by a shift. The problem is of great importance in many applications and is typically solved by maximizing the cross-correlation between the two signals. Inspired by compressive sensing, in this paper, we seek to estimate the shift directly from compressed signals. We show that under certain conditions, the shift can be recovered using fewer samples and less computation compared to the classical setup. Of particular interest is shift estimation from Fourier coefficients. We show that under rather mild conditions only one Fourier coefficient suffices to recover the true shift.
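The single-coefficient claim above can be illustrated with a minimal sketch, assuming a noise-free circular shift and a nonzero first Fourier coefficient; these assumptions are ours and are not necessarily the conditions stated in the paper. For y[n] = x[(n - s) mod N], the first DFT coefficients satisfy Y[1] = X[1] exp(-2πi s / N), so the phase of Y[1] / X[1] reveals s. The function names `fourier_coefficient` and `recover_shift` are hypothetical.

```python
import cmath
import math


def fourier_coefficient(signal, k):
    """Return the k-th DFT coefficient of a real or complex sequence."""
    n = len(signal)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n)
               for i, v in enumerate(signal))


def recover_shift(x, y):
    """Estimate s such that y[i] == x[(i - s) % n], using only coefficient k = 1.

    Assumes (for illustration) an exact circular shift without noise.
    """
    n = len(x)
    x1 = fourier_coefficient(x, 1)
    y1 = fourier_coefficient(y, 1)
    if abs(x1) < 1e-12:
        raise ValueError("first Fourier coefficient of x is (numerically) zero")
    # Y[1] / X[1] = exp(-2*pi*i*s/n), so the phase equals -2*pi*s/n modulo 2*pi.
    phase = cmath.phase(y1 / x1)
    return round(-phase * n / (2 * math.pi)) % n


# Example: shift an arbitrary signal circularly by 3 samples and recover the shift.
x = [0.3, -1.2, 2.5, 0.7, -0.4, 1.9, 0.1, -2.2]
s_true = 3
y = [x[(i - s_true) % len(x)] for i in range(len(x))]
s_est = recover_shift(x, y)
```

By comparison, maximizing the cross-correlation would require evaluating all N candidate shifts; the sketch uses two inner products of length N.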

preprint2013arXiv

Energy Disaggregation via Adaptive Filtering

The energy disaggregation problem is that of recovering device-level power consumption signals from the aggregate power consumption signal for a building. We show in this paper how the disaggregation problem can be reformulated as an adaptive filtering problem. This gives both a novel disaggregation algorithm and a better theoretical understanding of disaggregation. In particular, we show how the disaggregation problem can be solved online using a filter bank and discuss its optimality.

preprint2013arXiv

Fundamental Limits of Nonintrusive Load Monitoring

Provided an arbitrary nonintrusive load monitoring (NILM) algorithm, we seek bounds on the probability of distinguishing between scenarios, given an aggregate power consumption signal. We introduce a framework for studying a general NILM algorithm, and analyze the theory in the general case. Then, we specialize to the case where the error is Gaussian. In both cases, we are able to derive upper bounds on the probability of distinguishing scenarios. Finally, we apply the results to real data to derive bounds on the probability of distinguishing between scenarios as a function of the measurement noise, the sampling rate, and the device usage.

preprint2013arXiv

Incentive Mechanisms for Internet Congestion Management: Fixed-Budget Rebate versus Time-of-Day Pricing

Mobile data traffic has been steadily rising in the past years. This has generated a significant interest in the deployment of incentive mechanisms to reduce peak-time congestion. Typically, the design of these mechanisms requires information about user demand and sensitivity to prices. Such information is naturally imperfect. In this paper, we propose a fixed-budget rebate mechanism that gives each user a reward proportional to his percentage contribution to the aggregate reduction in peak-time demand. For comparison, we also study a time-of-day pricing mechanism that gives each user a fixed reward per unit reduction of his peak-time demand. To evaluate the two mechanisms, we introduce a game-theoretic model that captures the public good nature of decongestion. For each mechanism, we demonstrate that the socially optimal level of decongestion is achievable for a specific choice of the mechanism's parameter. We then investigate how imperfect information about user demand affects the mechanisms' effectiveness. From our results, the fixed-budget rebate pricing is more robust when the users' sensitivity to congestion is "sufficiently" convex. This feature of the fixed-budget rebate mechanism is attractive for many situations of interest and is driven by its closed-loop property, i.e., the unit reward decreases as the peak-time demand decreases.
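The two reward rules described in this abstract can be sketched as follows. This is only an illustration of the payment arithmetic, not the paper's game-theoretic model: the function names, the budget and unit-reward values, and the assumption that all reductions are nonnegative are ours.

```python
def fixed_budget_rebates(reductions, budget):
    """Split a fixed budget in proportion to each user's share of the total reduction.

    Assumes nonnegative reductions (an illustrative simplification).
    """
    total = sum(reductions)
    if total <= 0:
        return [0.0 for _ in reductions]
    return [budget * r / total for r in reductions]


def time_of_day_rewards(reductions, unit_reward):
    """Pay a fixed reward per unit of peak-time demand reduction."""
    return [unit_reward * r for r in reductions]


# Three users reduce their peak-time demand by 2, 1 and 1 units.
small = [2.0, 1.0, 1.0]
# The same users reduce twice as much, so peak-time demand is lower.
large = [4.0, 2.0, 2.0]

rebates_small = fixed_budget_rebates(small, budget=8.0)
rebates_large = fixed_budget_rebates(large, budget=8.0)

# Implied reward per unit of reduction under the fixed budget.
unit_small = 8.0 / sum(small)
unit_large = 8.0 / sum(large)

tod_small = time_of_day_rewards(small, unit_reward=2.0)
tod_large = time_of_day_rewards(large, unit_reward=2.0)
```

Under the fixed budget, total payments stay at the budget and the implied unit reward falls as aggregate reduction grows, which matches the closed-loop property the abstract describes; under time-of-day pricing, total payments scale with the reduction.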

preprint2013arXiv

Nonlinear Basis Pursuit

In compressive sensing, the basis pursuit algorithm aims to find the sparsest solution to an underdetermined linear equation system. In this paper, we generalize basis pursuit to finding the sparsest solution to higher order nonlinear systems of equations, called nonlinear basis pursuit. In contrast to the existing nonlinear compressive sensing methods, the new algorithm that solves the nonlinear basis pursuit problem is convex and not greedy. The novel algorithm enables the compressive sensing approach to be used for a broader range of applications where there are nonlinear relationships between the measurements and the unknowns.

preprint2013arXiv

Nonlinear Compressive Particle Filtering

Many systems for which compressive sensing is used today are dynamical. The common approach is to neglect the dynamics and see the problem as a sequence of independent problems. This approach has two disadvantages. Firstly, the temporal dependency in the state could be used to improve the accuracy of the state estimates. Secondly, having an estimate for the state and its support could be used to reduce the computational load of the subsequent step. In the linear Gaussian setting, compressive sensing was recently combined with the Kalman filter to mitigate the above disadvantages. In the nonlinear dynamical case, compressive sensing cannot be used and, if the state dimension is high, the particle filter would perform poorly. In this paper we combine one of the most novel developments in compressive sensing, nonlinear compressive sensing, with the particle filter. We show that the marriage of the two is essential and that neither the particle filter nor nonlinear compressive sensing alone gives a satisfying solution.

preprint2013arXiv

Quadratic Basis Pursuit

In many compressive sensing problems today, the relationship between the measurements and the unknowns could be nonlinear. The traditional treatment of such nonlinear relationships has been to approximate the nonlinearity via a linear model and to treat the resulting unmodeled dynamics as noise. The ability to more accurately characterize nonlinear models has the potential to improve the results in both existing compressive sensing applications and those where a linear approximation does not suffice, e.g., phase retrieval. In this paper, we extend the classical compressive sensing framework to a second-order Taylor expansion of the nonlinearity. Using a lifting technique and a method we call quadratic basis pursuit, we show that the sparse signal can be recovered exactly when the sampling rate is sufficiently high. We further present efficient numerical algorithms to recover sparse signals in second-order nonlinear systems, which are considerably more difficult to solve than their linear counterparts in sparse optimization.

preprint2013arXiv

Scalable Anomaly Detection in Large Homogenous Populations

Anomaly detection in large populations is a challenging but highly relevant problem. The problem is essentially a multi-hypothesis problem, with a hypothesis for every division of the systems into normal and anomalous systems. The number of hypotheses grows rapidly with the number of systems, and approximate solutions become a necessity for any problem of practical interest. In the current paper we take an optimization approach to this multi-hypothesis problem. We first observe that the problem is equivalent to a non-convex combinatorial optimization problem. We then relax the problem to a convex problem that can be solved distributively on the systems and that stays computationally tractable as the number of systems increases. An interesting property of the proposed method is that it can, under certain conditions, be shown to give exactly the same result as the combinatorial multi-hypothesis problem, and the relaxation is hence tight.

preprint2012arXiv

Compressive Phase Retrieval From Squared Output Measurements Via Semidefinite Programming

Given a linear system in a real or complex domain, linear regression aims to recover the model parameters from a set of observations. Recent studies in compressive sensing have successfully shown that under certain conditions, a linear program, namely, l1-minimization, guarantees recovery of sparse parameter signals even when the system is underdetermined. In this paper, we consider a more challenging problem: when the phase of the output measurements from a linear system is omitted. Using a lifting technique, we show that even though the phase information is missing, the sparse signal can be recovered exactly by solving a simple semidefinite program when the sampling rate is sufficiently high, even though the exact solutions to both sparse signal recovery and phase retrieval are combinatorial. The results extend the applications of compressive sensing to those where only output magnitudes can be observed. We demonstrate the accuracy of the algorithms through theoretical analysis, extensive simulations and a practical experiment.

preprint2012arXiv

Consistent Approximations for the Optimal Control of Constrained Switched Systems

Though switched dynamical systems have shown great utility in modeling a variety of physical phenomena, the construction of an optimal control of such systems has proven difficult since it demands some type of optimal mode scheduling. In this paper, we devise an algorithm for the computation of an optimal control of constrained nonlinear switched dynamical systems. The control parameters for such systems include a continuous-valued input and a discrete-valued input, where the latter corresponds to the mode of the switched system that is active at a particular instant in time. Our approach, which we prove converges to local minimizers of the constrained optimal control problem, first relaxes the discrete-valued input, then performs traditional optimal control, and then projects the constructed relaxed discrete-valued input back to a pure discrete-valued input by employing an extension to the classical Chattering Lemma that we prove. We extend this algorithm by formulating a computationally implementable algorithm which works by discretizing the time interval over which the switched dynamical system is defined. Importantly, we prove that this implementable algorithm constructs, by recursive application, a sequence of points that converges to the local minimizers of the original constrained optimal control problem. Four simulation experiments are included to validate the theoretical developments.

preprint2012arXiv

Fast L1-Minimization Algorithms For Robust Face Recognition

L1-minimization refers to finding the minimum L1-norm solution to an underdetermined linear system b=Ax. Under certain conditions as described in compressive sensing theory, the minimum L1-norm solution is also the sparsest solution. In this paper, our study addresses the speed and scalability of its algorithms. In particular, we focus on the numerical implementation of a sparsity-based classification framework in robust face recognition, where sparse representation is sought to recover human identities from very high-dimensional facial images that may be corrupted by illumination, facial disguise, and pose variation. Although the underlying numerical problem is a linear program, traditional algorithms are known to suffer poor scalability for large-scale applications. We investigate a new solution based on a classical convex optimization framework, known as Augmented Lagrangian Methods (ALM). The new convex solvers provide a viable solution to real-world, time-critical applications such as face recognition. We conduct extensive experiments to validate and compare the performance of the ALM algorithms against several popular L1-minimization solvers, including interior-point method, Homotopy, FISTA, SESOP-PCD, approximate message passing (AMP) and TFOCS. To aid peer evaluation, the code for all the algorithms has been made publicly available.

preprint2012arXiv

On the Lagrangian Biduality of Sparsity Minimization Problems

Recent results in Compressive Sensing have shown that, under certain conditions, the solution to an underdetermined system of linear equations with sparsity-based regularization can be accurately recovered by solving convex relaxations of the original problem. In this work, we present a novel primal-dual analysis on a class of sparsity minimization problems. We show that the Lagrangian bidual (i.e., the Lagrangian dual of the Lagrangian dual) of the sparsity minimization problems can be used to derive interesting convex relaxations: the bidual of the $\ell_0$-minimization problem is the $\ell_1$-minimization problem; and the bidual of the $\ell_{0,1}$-minimization problem for enforcing group sparsity on structured data is the $\ell_{1,\infty}$-minimization problem. The analysis provides a means to compute per-instance non-trivial lower bounds on the (group) sparsity of the desired solutions. In a real-world application, the bidual relaxation improves the performance of a sparsity-based classification framework applied to robust face recognition.

preprint2012arXiv

Provably Safe and Robust Learning-Based Model Predictive Control

Controller design faces a trade-off between robustness and performance, and the reliability of linear controllers has caused many practitioners to focus on the former. However, there is renewed interest in improving system performance to deal with growing energy constraints. This paper describes a learning-based model predictive control (LBMPC) scheme that provides deterministic guarantees on robustness, while statistical identification tools are used to identify richer models of the system in order to improve performance; the benefits of this framework are that it handles state and input constraints, optimizes system performance with respect to a cost function, and can be designed to use a wide variety of parametric or nonparametric statistical tools. The main insight of LBMPC is that safety and performance can be decoupled under reasonable conditions in an optimization framework by maintaining two models of the system. The first is an approximate model with bounds on its uncertainty, and the second model is updated by statistical methods. LBMPC improves performance by choosing inputs that minimize a cost subject to the learned dynamics, and it ensures safety and robustness by checking whether these same inputs keep the approximate model stable when it is subject to uncertainty. Furthermore, we show that if the system is sufficiently excited, then the LBMPC control action probabilistically converges to that of an MPC computed using the true dynamics.

preprint2012arXiv

Statistical Results on Filtering and Epi-convergence for Learning-Based Model Predictive Control

Learning-based model predictive control (LBMPC) is a technique that provides deterministic guarantees on robustness, while statistical identification tools are used to identify richer models of the system in order to improve performance. This technical note provides proofs that elucidate the reasons for our choice of measurement model, as well as proofs concerning the stochastic convergence of LBMPC. The first part of this note discusses simultaneous state estimation and statistical identification (or learning) of unmodeled dynamics, for dynamical systems that can be described by ordinary differential equations (ODEs). The second part provides proofs concerning the epi-convergence of different statistical estimators that can be used with the LBMPC technique. In particular, we prove results on the statistical properties of a nonparametric estimator that we have designed to have the correct deterministic and stochastic properties for numerical implementation when used in conjunction with LBMPC.

preprint2011arXiv

Dimension Reduction Near Periodic Orbits of Hybrid Systems

When the Poincaré map associated with a periodic orbit of a hybrid dynamical system has constant-rank iterates, we demonstrate the existence of a constant-dimensional invariant subsystem near the orbit which attracts all nearby trajectories in finite time. This result shows that the long-term behavior of a hybrid model with a large number of degrees of freedom may be governed by a low-dimensional smooth dynamical system. The appearance of such simplified models enables the translation of analytical tools from smooth systems, such as Floquet theory, to the hybrid setting and provides a bridge between the efforts of biologists and engineers studying legged locomotion.

preprint2010arXiv

Mean-square boundedness of stochastic networked control systems with bounded control inputs

We consider the problem of controlling marginally stable linear systems using bounded control inputs for networked control settings in which the communication channel between the remote controller and the system is unreliable. We assume that the states are perfectly observed, but the control inputs are transmitted over a noisy communication channel. Under mild hypotheses on the noise introduced by the control communication channel and large enough control authority, we construct a control policy that renders the state of the closed-loop system mean-square bounded.

47 published item(s)

DEC-LOS-RRT: Decentralized Path Planning for Multi-robot Systems with Line-of-sight Constrained Communication

Who Leads and Who Follows in Strategic Classification?

Maximum Likelihood Constraint Inference from Stochastic Demonstrations

Exponentially Stable First Order Control on Matrix Lie Groups

Feedback Linearization for Unknown Systems via Reinforcement Learning

Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning

LESS is More: Rethinking Probabilistic Models of Human Behavior

Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning

On Gradient-Based Learning in Continuous Games

Technical Report: Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning

Towards Verified Artificial Intelligence

Diagnosis and Repair for Synthesis from Signal Temporal Logic Specifications

Differential Privacy of Populations in Routing Games

Dissimilarity-based Sparse Subset Selection

Privacy-Enhanced Architecture for Occupancy-based HVAC Control

Approximation Algorithms for Optimization of Combinatorial Dynamical Systems

Event-Selected Vector Field Discontinuities Yield Piecewise-Differentiable Flows

Model Reduction Near Periodic Orbits of Hybrid Dynamical Systems

Quantifying the Utility-Privacy Tradeoff in the Smart Grid

A Learning Based Approach to Control Synthesis of Markov Decision Processes for Linear Temporal Logic Specifications

Effects of Risk on Privacy Contracts for Demand-Side Management

Experimental Design for Human-in-the-Loop Driving Simulations

Incentive Design and Utility Learning via Energy Disaggregation

Metrization and Simulation of Controlled Hybrid Systems

On the Characterization of Local Nash Equilibria in Continuous Games

Privacy and Customer Segmentation in the Smart Grid

Rapid Integration and Calibration of New Sensors Using the Berkeley Aachen Robotics Toolkit (BART)

Reach-Avoid Problems with Time-Varying Dynamics, Targets and Constraints

Social Game for Building Energy Efficiency: Utility Learning, Simulation, and Analysis

Sparse Illumination Learning and Transfer for Single-Sample Face Recognition with Image Corruption and Misalignment

Blind Identification of ARX Models with Piecewise Constant Inputs

Compressive Shift Retrieval

Energy Disaggregation via Adaptive Filtering

Fundamental Limits of Nonintrusive Load Monitoring

Incentive Mechanisms for Internet Congestion Management: Fixed-Budget Rebate versus Time-of-Day Pricing

Nonlinear Basis Pursuit

Nonlinear Compressive Particle Filtering

Quadratic Basis Pursuit

Scalable Anomaly Detection in Large Homogenous Populations

Compressive Phase Retrieval From Squared Output Measurements Via Semidefinite Programming

Consistent Approximations for the Optimal Control of Constrained Switched Systems

Fast L1-Minimization Algorithms For Robust Face Recognition

On the Lagrangian Biduality of Sparsity Minimization Problems

Provably Safe and Robust Learning-Based Model Predictive Control

Statistical Results on Filtering and Epi-convergence for Learning-Based Model Predictive Control

Dimension Reduction Near Periodic Orbits of Hybrid Systems

Mean-square boundedness of stochastic networked control systems with bounded control inputs