Researcher profile

Aranya Chakrabortty

Aranya Chakrabortty contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
11 works
0 followers
8 topics
4 close collaborators

Published work

11 published item(s)

preprint, 2022, arXiv

A Robust Stackelberg Game for Cyber-Security Investment in Networked Control Systems

We present a resource-planning game for cyber-security of networked control systems (NCS). The NCS is assumed to be operating in closed-loop using a linear state-feedback $\mathcal{H}_2$ controller. A zero-sum, two-player Stackelberg game (SG) is developed between an attacker and a defender for this NCS. The attacker aims to disable communication of selected nodes and thereby render the feedback gain matrix to be sparse, leading to degradation of closed-loop performance, while the defender aims to prevent this loss by investing in the protection of targeted nodes. Both players trade their $\mathcal{H}_2$-performance objectives for the costs of their actions. The standard backward induction method is modified to determine a cost-based Stackelberg equilibrium (CBSE) that saves the players' costs without degrading the control performance. We analyze the dependency of a CBSE on the relative budgets of the players as well as on the node "importance" order. Moreover, a robust-defense method is developed for the realistic case when the defender is not informed about the attacker's resources. The proposed algorithms are validated using examples from wide-area control of electric power systems. It is demonstrated that reliable and robust defense is feasible unless the defender's resources are severely limited relative to the attacker's resources. We also show that the proposed methods are robust to time-varying model uncertainties and thus are suitable for long-term security investment in realistic NCSs. Finally, we employ computationally efficient genetic algorithms (GA) to compute the optimal strategies of the attacker and the defender in realistic large power systems.
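The backward induction step mentioned in this abstract can be illustrated on a small finite game. The following is a minimal sketch under stated assumptions, not the paper's modified method: the defender is assumed to be the leader and the attacker the follower (the abstract does not say which player moves first), the game is zero-sum with a single loss table, and the function name and the 3x3 table are hypothetical. Action costs and budgets, which the paper's cost-based equilibrium accounts for, are left out.

```python
def stackelberg_backward_induction(loss):
    """Return (leader_action, follower_action, value) for a zero-sum game.

    loss[d][a] is the defender's loss (equal to the attacker's gain) when
    the defender (assumed leader) plays d and the attacker (assumed
    follower) plays a.
    """
    best = None
    for d, row in enumerate(loss):
        # Follower step: the attacker observes d and maximizes the loss.
        # Ties are broken in favor of the lowest action index.
        a = max(range(len(row)), key=lambda j: row[j])
        # Leader step: keep the commitment with the smallest worst-case loss.
        if best is None or row[a] < best[2]:
            best = (d, a, row[a])
    return best


# Hypothetical loss table, purely illustrative.
example_loss = [
    [3, 5, 4],
    [2, 6, 1],
    [4, 4, 3],
]
result = stackelberg_backward_induction(example_loss)
print(result)
```

In this toy table the leader commits to the third row because its worst case (4) is smaller than the worst cases of the other rows (5 and 6).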

preprint, 2022, arXiv

Distributed Cooperative Multi-Agent Reinforcement Learning with Directed Coordination Graph

Existing distributed cooperative multi-agent reinforcement learning (MARL) frameworks usually assume undirected coordination graphs and communication graphs while estimating a global reward via consensus algorithms for policy evaluation. Such a framework may induce expensive communication costs and exhibit poor scalability due to the requirement of global consensus. In this work, we study MARL problems with directed coordination graphs, and propose a distributed RL algorithm where the local policy evaluations are based on local value functions. The local value function of each agent is obtained by local communication with its neighbors through a directed learning-induced communication graph, without using any consensus algorithm. A zeroth-order optimization (ZOO) approach based on parameter perturbation is employed to achieve gradient estimation. By comparing with existing ZOO-based RL algorithms, we show that our proposed distributed RL algorithm guarantees high scalability. A distributed resource allocation example is shown to illustrate the effectiveness of our algorithm.
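Gradient estimation by parameter perturbation, as named in this abstract, can be sketched in its generic centralized form. This is an illustrative two-point zeroth-order estimator with Gaussian perturbations, not the distributed algorithm of the paper; the function names, the quadratic test cost, the step size, and the perturbation radius are all assumptions chosen for the example.

```python
import numpy as np


def zo_gradient(f, theta, rng, delta=1e-2):
    """Two-point zeroth-order gradient estimate from one random perturbation."""
    u = rng.standard_normal(theta.shape)
    # Only function values are queried; no analytic gradient is needed.
    slope = (f(theta + delta * u) - f(theta - delta * u)) / (2.0 * delta)
    return slope * u


def zo_descent(f, theta0, steps=500, step_size=0.05, seed=0):
    """Plain descent driven by the zeroth-order estimate above."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - step_size * zo_gradient(f, theta, rng)
    return theta


# Hypothetical cost: squared distance to a fixed target.
target = np.array([1.0, -2.0, 0.5])


def cost(theta):
    return float(np.sum((theta - target) ** 2))


theta_hat = zo_descent(cost, np.zeros(3))
```

For a Gaussian perturbation the estimate equals the true gradient in expectation on a quadratic cost, which is why the iterate approaches the target even though each single estimate is noisy.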

preprint, 2022, arXiv

Hierarchical Frequency and Voltage Control using Prioritized Utilization of Inverter Based Resources

We propose a novel hierarchical frequency and voltage control design for a multi-area power system integrated with inverter-based resources (IBRs). The design is based on the idea of prioritizing the use of IBRs over conventional generator-based control in compensating for sudden and unpredicted changes in loads and generations, thereby mitigating any undesired dynamics in the frequency or the voltage by exploiting their fast actuation time constants. A new sequential optimization problem, referred to as Area Prioritized Power Flow (APPF), is formulated to model this prioritization. It is shown that, compared to conventional power flow, APPF not only leads to a fairer balance between the dispatch of active and reactive power from the IBRs and the synchronous generators, but also limits the impact of any contingency from spreading out beyond its respective control area, thereby guaranteeing a better collective dynamic performance of the grid. This improvement, however, comes at the cost of adding an extra layer of communication needed for executing APPF in a hierarchical way. Results are validated using simulations of a 9-machine, 6-IBR, 33-bus, 3-area power system model, illustrating how APPF can mitigate a disturbance faster and more efficiently by prioritizing the use of local area-resources.

preprint, 2021, arXiv

Decomposability and Parallel Computation of Multi-Agent LQR

Individual agents in a multi-agent system (MAS) may have decoupled open-loop dynamics, but a cooperative control objective usually results in coupled closed-loop dynamics, thereby making the control design computationally expensive. The computation time becomes even higher when a learning strategy such as reinforcement learning (RL) needs to be applied to deal with the situation when the agents' dynamics are not known. To resolve this problem, we propose a parallel RL scheme for a linear quadratic regulator (LQR) design in a continuous-time linear MAS. The idea is to exploit the structural properties of two graphs embedded in the $Q$ and $R$ weighting matrices in the LQR objective to define an orthogonal transformation that can convert the original LQR design to multiple decoupled smaller-sized LQR designs. We show that if the MAS is homogeneous then this decomposition retains closed-loop optimality. Conditions for decomposability, an algorithm for constructing the transformation matrix, a parallel RL algorithm, and robustness analysis when the design is applied to non-homogeneous MAS are presented. Simulations show that the proposed approach can guarantee significant speed-up in learning without any loss in the cumulative value of the LQR cost.

preprint, 2021, arXiv

Learning Distributed Stabilizing Controllers for Multi-Agent Systems

We address the problem of model-free distributed stabilization of heterogeneous multi-agent systems using reinforcement learning (RL). Two algorithms are developed. The first algorithm solves a centralized linear quadratic regulator (LQR) problem without knowing any initial stabilizing gain in advance. The second algorithm builds upon the results of the first algorithm, and extends it to distributed stabilization of multi-agent systems with predefined interaction graphs. Rigorous proofs are provided to show that the proposed algorithms achieve guaranteed convergence if specific conditions hold. A simulation example is presented to demonstrate the theoretical results.
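For contrast with the claim about not needing an initial stabilizing gain, the classical model-based policy iteration for continuous-time LQR (Kleinman's iteration) does require one. The sketch below is that classical baseline, not either algorithm of the paper: it assumes the matrices A and B are known, and the system matrices, weights, and helper names are hypothetical. The open-loop matrix is chosen stable so that the zero gain is a valid starting point.

```python
import numpy as np


def lyapunov_solve(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for P via Kronecker vectorization."""
    n = Ac.shape[0]
    eye = np.eye(n)
    # Column-major vec: vec(Ac.T P) = (I kron Ac.T) vec(P),
    #                   vec(P Ac)   = (Ac.T kron I) vec(P).
    lhs = np.kron(eye, Ac.T) + np.kron(Ac.T, eye)
    p = np.linalg.solve(lhs, -M.flatten(order="F"))
    return p.reshape((n, n), order="F")


def kleinman_iteration(A, B, Q, R, K0, iters=20):
    """Policy iteration for LQR; K0 must make A - B @ K0 stable."""
    K = K0
    for _ in range(iters):
        Ac = A - B @ K
        P = lyapunov_solve(Ac, Q + K.T @ R @ K)  # policy evaluation
        K = np.linalg.solve(R, B.T @ P)          # policy improvement
    return K, P


# Hypothetical second-order system; A is stable, so K0 = 0 is admissible.
A = np.array([[0.0, 1.0], [-1.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K, P = kleinman_iteration(A, B, Q, R, np.zeros((1, 2)))
# Residual of the algebraic Riccati equation at the final iterate.
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
```

If K0 were not stabilizing, the Lyapunov equation in the evaluation step would not yield a valid cost matrix, which is the restriction the abstract says its first algorithm removes.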

preprint, 2021, arXiv

Optimal Co-Designs of Communication and Control in Bandwidth-Constrained Cyber-Physical Systems

We address the problem of sparsity-promoting optimal control of cyber-physical systems (CPSs) in the presence of communication delays. The delays are categorized into two types - namely, an inter-layer delay for passing state and control information between the physical layer and the cyber layer, and an intra-layer delay that operates between the computing agents, referred to here as control nodes (CNs), within the cyber-layer. Our objective is to minimize the closed-loop H2-norm of the physical system by co-designing an optimal combination of these two delays and a sparse state-feedback controller while respecting a given bandwidth cost constraint. We propose a two-loop optimization algorithm for this. Based on the alternating direction method of multipliers (ADMM), the inner loop handles the conflicting directions between the decreasing H2-norm and the increasing sparsity level of the controller. The outer loop comprises a semidefinite program (SDP)-based relaxation of non-convex inequalities necessary for closed-loop stability. Moreover, for CPSs where the state and control information assigned to the CNs are not private, we derive an additional algorithm that further sparsifies the communication topology by modifying the row and column structures of the obtained controller, resulting in reassigning the communication map between the cyber and physical layers, and determining which physical agent should send its state information to which CN. Proofs for closed-loop stability and optimality are provided for both algorithms, followed by numerical simulations.
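In ADMM-based sparse controller design, the step that pushes gain entries to zero is often an elementwise soft-thresholding operation, which is the proximal operator of the l1 norm. The abstract does not state which sparsity penalty the inner loop uses, so the l1 choice here is an assumption, and the matrix and threshold are hypothetical values for illustration only.

```python
import numpy as np


def soft_threshold(V, kappa):
    """Elementwise proximal operator of kappa * ||.||_1.

    Entries with magnitude at most kappa become zero; the rest shrink
    toward zero by kappa.
    """
    return np.sign(V) * np.maximum(np.abs(V) - kappa, 0.0)


# Hypothetical intermediate gain matrix inside an ADMM iteration.
V = np.array([[1.5, -0.2], [0.05, -3.0]])
G = soft_threshold(V, 0.5)
```

Raising the threshold zeroes more entries (fewer communication links) at the price of moving the gain further from the unpenalized optimum, which is the trade-off between H2 performance and sparsity that the abstract describes.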

preprint, 2020, arXiv

A Stackelberg Security Investment Game for Voltage Stability of Power Systems

We formulate a Stackelberg game between an attacker and a defender of a power system. The attacker attempts to alter the load setpoints of the power system covertly and intelligently, so that the voltage stability margin of the grid is reduced, driving the entire system towards a voltage collapse. The defender, or the system operator, aims to compensate for this reduction by retuning the reactive power injection to the grid by switching on control devices, such as a bank of shunt capacitors. A modified Backward Induction method is proposed to find a cost-based Stackelberg equilibrium (CBSE) of the game, which saves the players' costs while providing the optimal allocation of both players' investment resources under budget and covertness constraints. We analyze the proposed game extensively for the IEEE 9-bus power system model and present an example of its performance for the IEEE 39-bus power system model. It is demonstrated that the defender is able to maintain system stability unless its security budget is much lower than the attacker's budget.

preprint, 2020, arXiv

Co-Design of Delays and Sparse Controllers for Bandwidth-Constrained Cyber-Physical Systems

We address the problem of sparsity-promoting optimal control of cyber-physical systems with feedback delays. The delays are categorized into two classes - namely, intra-layer delay, and inter-layer delay between the cyber and the physical layers. Our objective is to minimize the H2-norm of the closed-loop system by designing an optimal combination of these two delays along with a sparse state-feedback controller, while respecting a given bandwidth constraint. We propose a two-loop optimization algorithm for this. The inner loop, based on the alternating direction method of multipliers (ADMM), handles the conflicting directions of decreasing H2-norm and increasing sparsity of the controller. The outer loop comprises semidefinite program (SDP)-based relaxations of non-convex inequalities necessary for stable co-design of the delays with the controller. We illustrate this algorithm using simulations that highlight various aspects of how delays and sparsity impact the stability and $\mathcal{H}_2$ performance of an LTI system.

preprint, 2020, arXiv

Fast Online Reinforcement Learning Control using State-Space Dimensionality Reduction

In this paper, we propose a fast reinforcement learning (RL) control algorithm that enables online control of large-scale networked dynamic systems. RL is an effective way of designing model-free linear quadratic regulator (LQR) controllers for linear time-invariant (LTI) networks with unknown state-space models. However, when the network size is large, conventional RL can result in unacceptably long learning times. The proposed approach is to construct a compressed state vector by projecting the measured state through a projective matrix. This matrix is constructed from online measurements of the states so that it captures the dominant controllable subspace of the open-loop network model. Next, an RL controller is learned using the reduced-dimensional state instead of the original state such that the resultant cost is close to the optimal LQR cost. Numerical benefits as well as the cyber-physical implementation benefits of the approach are verified using illustrative examples including an example of wide-area control of the IEEE 68-bus benchmark power system.
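The idea of compressing the state through a projection built from measurements can be sketched with a singular value decomposition of state snapshots. This is a generic stand-in, not the paper's construction: the abstract does not specify how its projective matrix is computed, and the snapshot data, dimensions, and function name below are hypothetical.

```python
import numpy as np


def dominant_projection(snapshots, r):
    """Return an n x r orthonormal basis for the dominant subspace of the data.

    snapshots is an n x T matrix whose columns are measured states.
    """
    U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :r]


rng = np.random.default_rng(0)
n, r, T = 6, 2, 50
# Hypothetical data: states confined to a 2-dimensional subspace of R^6.
basis = rng.standard_normal((n, r))
X = basis @ rng.standard_normal((r, T))

proj = dominant_projection(X, r)
x = X[:, 0]
xi = proj.T @ x      # compressed state of length r
x_rec = proj @ xi    # lifted back to the original coordinates
```

A controller learned on the compressed state works with r variables instead of n, which is where the reduction in learning time would come from; the reconstruction is exact here only because the synthetic data lie exactly in an r-dimensional subspace.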

preprint, 2020, arXiv

Hierarchical Control of Multi-Agent Systems using Online Reinforcement Learning

We propose a new reinforcement learning based approach to designing hierarchical linear quadratic regulator (LQR) controllers for heterogeneous linear multi-agent systems with unknown state-space models and separated control objectives. The separation arises from grouping the agents into multiple non-overlapping groups, and defining the control goal as two distinct objectives. The first objective aims to minimize a group-wise block-decentralized LQR function that models the group-level mission. The second objective, on the other hand, tries to minimize an LQR function between the average states (centroids) of the groups. Exploiting this separation, we redefine the weighting matrices of the LQR functions in a way that they allow us to decouple their respective algebraic Riccati equations. Thereafter, we develop a reinforcement learning strategy that uses online measurements of the agent states and the average states to learn the respective controllers based on the approximate Riccati equations. Since the first controller is block-decentralized and, therefore, can be learned in parallel, while the second controller is reduced-dimensional due to averaging, the overall design enjoys a significantly reduced learning time compared to centralized reinforcement learning.

preprint, 2020, arXiv

Reduced-Dimensional Reinforcement Learning Control using Singular Perturbation Approximations

We present a set of model-free, reduced-dimensional reinforcement learning (RL) based optimal control designs for linear time-invariant singularly perturbed (SP) systems. We first present a state-feedback and output-feedback based RL control design for a generic SP system with unknown state and input matrices. We take advantage of the underlying time-scale separation property of the plant to learn a linear quadratic regulator (LQR) for only its slow dynamics, thereby saving a significant amount of learning time compared to the conventional full-dimensional RL controller. We analyze the sub-optimality of the design using SP approximation theorems and provide sufficient conditions for closed-loop stability. Thereafter, we extend both designs to clustered multi-agent consensus networks, where the SP property reflects through clustering. We develop both centralized and cluster-wise block-decentralized RL controllers for such networks, in reduced dimensions. We demonstrate the details of the implementation of these controllers using simulations of relevant numerical examples and compare them with conventional RL designs to show the computational benefits of our approach.
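The time-scale separation that this abstract relies on can be made concrete with the standard quasi-steady-state reduction of a singularly perturbed linear system. The sketch below is model-based and only shows the reduced slow model; the paper's designs are model-free and learn the controller from data. It assumes the standard form dx/dt = A11 x + A12 z + B1 u, eps dz/dt = A21 x + A22 z + B2 u with A22 invertible, and all numerical values are hypothetical.

```python
import numpy as np


def slow_subsystem(A11, A12, A21, A22, B1, B2):
    """Quasi-steady-state reduction: set eps * dz/dt = 0 and eliminate z."""
    A22_inv = np.linalg.inv(A22)
    A_s = A11 - A12 @ A22_inv @ A21
    B_s = B1 - A12 @ A22_inv @ B2
    return A_s, B_s


# Hypothetical scalar blocks with a small perturbation parameter.
A11 = np.array([[-1.0]])
A12 = np.array([[1.0]])
A21 = np.array([[1.0]])
A22 = np.array([[-2.0]])
B1 = np.array([[1.0]])
B2 = np.array([[1.0]])
eps = 1e-3

A_s, B_s = slow_subsystem(A11, A12, A21, A22, B1, B2)

# Full-order state matrix; its eigenvalue closest to zero is the slow mode.
full = np.block([[A11, A12], [A21 / eps, A22 / eps]])
slow_eig = max(np.linalg.eigvals(full).real)
```

The slow eigenvalue of the full model approaches the eigenvalue of the reduced matrix as eps shrinks, so a controller designed for the reduced model only has to handle a state of the slow dimension.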