Source author record

Sayak Mukherjee

Sayak Mukherjee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


cond-mat.stat-mech, eess.SY, Machine Learning, Quantitative Methods, Systems and Control, Molecular Networks, Cell Behavior, Populations and Evolution

Catalog footprint

What is connected

11 works

8 topics

4 close collaborators


preprint, 2022, arXiv

Learning Stochastic Parametric Differentiable Predictive Control Policies

The problem of synthesizing stochastic explicit model predictive control policies is known to quickly become intractable even for systems of modest complexity when using classical control-theoretic methods. To address this challenge, we present a scalable alternative called stochastic parametric differentiable predictive control (SP-DPC) for unsupervised learning of neural control policies governing stochastic linear systems subject to nonlinear chance constraints. SP-DPC is formulated as a deterministic approximation to the stochastic parametric constrained optimal control problem. This formulation allows us to directly compute the policy gradients via automatic differentiation of the problem's value function, evaluated over sampled parameters and uncertainties. In particular, the computed expectation of the SP-DPC problem's value function is backpropagated through the closed-loop system rollouts parametrized by a known nominal system dynamics model and a neural control policy, which allows for direct model-based policy optimization. We provide theoretical probabilistic guarantees for policies learned via the SP-DPC method on closed-loop stability and chance-constraint satisfaction. Furthermore, we demonstrate the computational efficiency and scalability of the proposed policy optimization algorithm in three numerical examples, including systems with a large number of states or subject to nonlinear constraints.

preprint, 2021, arXiv

Imposing Robust Structured Control Constraint on Reinforcement Learning of Linear Quadratic Regulator

This paper discusses learning a structured feedback control to obtain sufficient robustness to exogenous inputs for linear dynamic systems with an unknown state matrix. The structural constraint on the controller is necessary for many cyber-physical systems, and our approach presents a design for any generic structure, paving the way for distributed learning control. Ideas from reinforcement learning (RL), in conjunction with control-theoretic sufficient stability and performance guarantees, are used to develop the methodology. First, a model-based framework is formulated using dynamic programming to embed the structural constraint in the linear quadratic regulator (LQR) setting along with sufficient robustness conditions. Thereafter, we translate these conditions to a data-driven learning-based framework, robust structured reinforcement learning (RSRL), that enjoys the control-theoretic guarantees on stability and convergence. We validate our theoretical results with a simulation on a multi-agent network with 6 agents.

preprint, 2021, arXiv

Scalable Voltage Control using Structure-Driven Hierarchical Deep Reinforcement Learning

This paper presents a novel hierarchical deep reinforcement learning (DRL) based design for the voltage control of power grids. DRL agents are trained for fast and adaptive selection of control actions such that the voltage recovery criterion can be met following disturbances. Existing voltage control techniques suffer from issues of speed of operation, optimal coordination between different locations, and scalability. We exploit the area-wise division structure of the power system to propose a hierarchical DRL design that can be scaled to larger grid models. We employ an enhanced augmented random search algorithm that is tailored for the voltage control problem in a two-level architecture. We train area-wise decentralized RL agents to compute lower-level policies for the individual areas, and concurrently train a higher-level DRL agent that uses the updates of the lower-level policies to efficiently coordinate the control actions taken by the lower-level agents. Numerical experiments on the IEEE benchmark 39-bus model with 3 areas demonstrate the advantages and various intricacies of the proposed hierarchical approach.

preprint, 2020, arXiv

Reduced-Dimensional Reinforcement Learning Control using Singular Perturbation Approximations

We present a set of model-free, reduced-dimensional reinforcement learning (RL) based optimal control designs for linear time-invariant singularly perturbed (SP) systems. We first present a state-feedback and output-feedback based RL control design for a generic SP system with unknown state and input matrices. We take advantage of the underlying time-scale separation property of the plant to learn a linear quadratic regulator (LQR) for only its slow dynamics, thereby saving a significant amount of learning time compared to the conventional full-dimensional RL controller. We analyze the sub-optimality of the design using SP approximation theorems and provide sufficient conditions for closed-loop stability. Thereafter, we extend both designs to clustered multi-agent consensus networks, where the SP property reflects through clustering. We develop both centralized and cluster-wise block-decentralized RL controllers for such networks, in reduced dimensions. We demonstrate the details of the implementation of these controllers using simulations of relevant numerical examples and compare them with conventional RL designs to show the computational benefits of our approach.
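The time-scale separation that the abstract relies on can be illustrated with a minimal sketch. It assumes a known two-state system in standard singular perturbation form, dx/dt = a11*x + a12*z and eps*dz/dt = a21*x + a22*z with a22 < 0. The function names and the numbers are hypothetical and chosen only for illustration; the paper itself treats the model-free case with unknown matrices, which this sketch does not attempt.

```python
from math import sqrt


def slow_pole(a11, a12, a21, a22):
    """Pole of the reduced slow model obtained by setting eps = 0.

    With eps = 0 the fast state satisfies z = -(a21 / a22) * x, so the slow
    dynamics become dx/dt = (a11 - a12 * a21 / a22) * x.
    """
    return a11 - a12 * a21 / a22


def full_poles(a11, a12, a21, a22, eps):
    """Eigenvalues of [[a11, a12], [a21/eps, a22/eps]], smallest magnitude first.

    Only the real-eigenvalue case is handled in this sketch.
    """
    trace = a11 + a22 / eps
    det = (a11 * a22 - a12 * a21) / eps
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = sqrt(disc)
    return sorted([(trace - root) / 2.0, (trace + root) / 2.0], key=abs)


# Hypothetical example values (not from the paper).
a11, a12, a21, a22 = -1.0, 1.0, 1.0, -2.0
reduced = slow_pole(a11, a12, a21, a22)
slow, fast = full_poles(a11, a12, a21, a22, eps=0.01)
print("reduced slow pole:", reduced)
print("full-model poles :", slow, fast)
```

For small eps, the slow eigenvalue of the full model stays close to the reduced pole while the fast eigenvalue scales like a22/eps, which is why a controller designed only for the slow dynamics can be a reasonable approximation.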

preprint, 2015, arXiv

Maximum Entropy estimation of probability distribution of variables in higher dimensions from lower dimensional data

A common statistical situation concerns inferring an unknown distribution Q(x) from a known distribution P(y), where X (dimension n) and Y (dimension m) have a known functional relationship. Most commonly, n<m, and the task is relatively straightforward. For example, if Y1 and Y2 are independent random variables, each uniform on [0, 1], one can determine the distribution of X = Y1 + Y2; here m=2 and n=1. However, biological and physical situations can arise where n>m. In general, in the absence of additional information, there is no unique solution for Q in those cases. Nevertheless, one may still want to draw some inferences about Q. To this end, we propose a novel maximum entropy (MaxEnt) approach that estimates Q(x) based only on the available data, namely, P(y). The method has the additional advantage that one does not need to explicitly calculate the Lagrange multipliers. In this paper we develop the approach, for both discrete and continuous probability distributions, and demonstrate its validity. We give an intuitive justification as well, and we illustrate with examples.
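For the discrete case with n>m, a minimal sketch can show why no Lagrange multipliers need to be computed. It assumes the only constraint is that Q, pushed through the known map y = f(x), reproduces P(y). Under that assumption the constraints decouple across the preimage sets of f, and entropy is maximized by spreading each P(y) uniformly over the x values that map to y. The function name `maxent_lift`, the toy map and the toy P(y) are hypothetical and not taken from the paper.

```python
from collections import defaultdict
from itertools import product
from math import log


def maxent_lift(xs, f, p_y):
    """Return Q(x) of maximum entropy subject to sum_{x: f(x)=y} Q(x) = P(y).

    Assumes every y with P(y) > 0 has at least one preimage in `xs`.
    """
    fibers = defaultdict(list)
    for x in xs:
        fibers[f(x)].append(x)
    q = {}
    for y, members in fibers.items():
        share = p_y.get(y, 0.0) / len(members)  # uniform within each preimage set
        for x in members:
            q[x] = share
    return q


def entropy(q):
    """Shannon entropy in nats, skipping zero-probability states."""
    return -sum(v * log(v) for v in q.values() if v > 0)


# Toy example: x = (x1, x2) with each component in {0, 1, 2} (n = 2),
# observed through y = x1 + x2 (m = 1).
xs = list(product(range(3), repeat=2))
p_y = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.2, 4: 0.1}
q = maxent_lift(xs, lambda x: x[0] + x[1], p_y)
print("Q(1, 1) =", q[(1, 1)])
print("entropy of Q =", entropy(q))
```

Any other Q that satisfies the same constraints moves probability within a preimage set away from uniform, which can only lower the entropy.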

preprint, 2014, arXiv

Host-to-host variation of ecological interactions in polymicrobial infections

Host-to-host variability with respect to interactions between microorganisms and multicellular hosts is commonly observed in infection and in homeostasis. However, the majority of mechanistic models used in analyzing host-microorganism relationships, as well as most of the ecological theories proposed to explain co-evolution of host and microbes, are based on averages across a host population. By assuming that observed variations are random and independent, these models overlook the role of inter-host differences. Here we analyze mechanisms underlying host-to-host variations, using the well-characterized experimental infection model of polymicrobial otitis media (OM) in chinchillas, in combination with population dynamic models and a Maximum Entropy (MaxEnt) based inference scheme. We find that the nature of the interactions among bacterial species critically regulates host-to-host variations of these interactions. Surprisingly, seemingly unrelated phenomena, such as the efficiency of individual bacterial species in utilizing nutrients for growth and the microbe-specific host immune response, can become interdependent in a host population. The latter finding suggests a potential mechanism that could lead to selection of specific strains of bacterial species during the coevolution of the host immune response and the bacterial species.

preprint, 2013, arXiv

Cell responses only partially shape cell-to-cell variations in protein abundances in Escherichia coli chemotaxis

Cell-to-cell variations in protein abundance in clonal cell populations are ubiquitous in living systems. Since protein composition determines responses in individual cells, it stands to reason that the variations themselves are subject to selective pressures. But the functional role of these cell-to-cell differences is not well understood. One way to tackle questions regarding relationships between form and function is to perturb the form (e.g., change the protein abundances) and observe the resulting changes in some function. Here we take on the form-function relationship from the inverse perspective, asking instead what specific constraints on cell-to-cell variations in protein abundance are imposed by a given functional phenotype. We develop a maximum entropy (MaxEnt) based approach to posing questions of this type, and illustrate the method by application to the well characterized chemotactic response in Escherichia coli (E. coli). We find that full determination of observed cell-to-cell variations in protein abundances is not inherent in chemotaxis itself, but in fact appears to be jointly imposed by the chemotaxis program in conjunction with other factors, e.g., the protein synthesis machinery and/or additional non-chemotactic cell functions such as cell metabolism. These results illustrate the power of MaxEnt as a tool for the investigation of relationships between biological form and function.

preprint, 2013, arXiv

Data-driven quantification of robustness and sensitivity of cell signaling networks

Robustness and sensitivity of responses generated by cell signaling networks have been associated with survival and evolvability of organisms. However, existing methods analyzing robustness and sensitivity of signaling networks ignore the experimentally observed cell-to-cell variations of protein abundances and cell functions or contain ad hoc assumptions. We propose and apply a data-driven Maximum Entropy (MaxEnt) based method to quantify robustness and sensitivity of the Escherichia coli (E. coli) chemotaxis signaling network. Our analysis correctly rank-orders different models of E. coli chemotaxis based on their robustness and suggests that parameters regulating cell signaling are evolutionarily selected to vary in individual cells according to their abilities to perturb cell functions. Furthermore, predictions from our approach regarding the distribution of protein abundances and properties of chemotactic responses in individual cells, based on cell-population-averaged data, are in excellent agreement with their experimental counterparts. Our approach is general and can be used to evaluate robustness as well as generate predictions of single-cell properties based on population-averaged experimental data in a wide range of cell signaling systems.

preprint, 2013, arXiv

In silico Modeling of Itk Activation Kinetics in Thymocytes Suggests Competing Positive and Negative IP4 Mediated Feedbacks Increase Robustness

The inositol-phosphate messenger inositol(1,3,4,5)tetrakisphosphate (IP4) is essential for thymocyte positive selection by regulating plasma-membrane association of the protein tyrosine kinase Itk downstream of the T cell receptor (TCR). IP4 can act as a soluble analog of the phosphoinositide 3-kinase (PI3K) membrane lipid product phosphatidylinositol(3,4,5)trisphosphate (PIP3). PIP3 recruits signaling proteins such as Itk to cellular membranes by binding to PH and other domains. In thymocytes, low-dose IP4 binding to the Itk PH domain surprisingly promoted and high-dose IP4 inhibited PIP3 binding of Itk PH domains. However, the mechanisms that underlie the regulation of membrane recruitment of Itk by IP4 and PIP3 remain unclear. The distinct Itk PH domain ability to oligomerize is consistent with a cooperative-allosteric mode of IP4 action. However, other possibilities cannot be ruled out due to difficulties in quantitatively measuring the interactions between Itk, IP4 and PIP3, and in generating non-oligomerizing Itk PH domain mutants. This has hindered a full mechanistic understanding of how IP4 controls Itk function. By combining experimentally measured kinetics of PLCγ1 phosphorylation by Itk with in silico modeling of multiple Itk signaling circuits and a maximum entropy (MaxEnt) based computational approach, we show that those in silico models which are most robust against variations of protein and lipid expression levels and kinetic rates at the single cell level share a cooperative-allosteric mode of Itk regulation by IP4 involving oligomeric Itk PH domains at the plasma membrane. This identifies MaxEnt as an excellent tool for quantifying robustness for complex TCR signaling circuits and provides testable predictions to further elucidate a controversial mechanism of PIP3 signaling.

preprint, 2012, arXiv

Dramatic reduction of dimensionality in large biochemical networks due to strong pair correlations

Large multidimensionality of high-throughput datasets pertaining to cell signaling and gene regulation renders it difficult to extract mechanisms underlying the complex kinetics involving various biochemical compounds (e.g., proteins, lipids). Data-driven models often circumvent this difficulty by using pair correlations of the protein expression levels to produce a small number (<10) of principal components, each a linear combination of the concentrations, to successfully model how cells respond to different stimuli. However, it is not understood if this reduction is specific to a particular biological system or to the nature of the stimuli used in these experiments. We study temporal changes in pair correlations described by the covariance matrix between different molecular species that evolve following deterministic mass action kinetics in large biologically relevant reaction networks and show that this dramatic reduction of dimensions (from hundreds to <5) arises from the strong correlations between different species at any time and is insensitive to the form of the nonlinear interactions, network architecture and values of rate constants and concentrations over a wide range. We relate temporal changes in the eigenvalue spectrum of the covariance matrix to low-dimensional, local changes in directions of the trajectory embedded in much larger dimensions using elementary differential geometry. We illustrate how to extract biologically relevant insights such as identifying significant time scales and groups of correlated chemical species from our analysis. Our work provides for the first time a theoretical underpinning for the successful experimental analysis and points to a way to extract mechanisms from large-scale high-throughput data sets.

preprint, 2010, arXiv

Dynamical phase transition of a 1D transport process including death

Motivated by biological aspects related to fungus growth, we consider the competition of growth and corrosion. We study a modification of the totally asymmetric exclusion process, including the probabilities of injection $\alpha$ and death of the last particle $\delta$. The system presents a phase transition at $\delta_c(\alpha)$, where the average position of the last particle $\langle L \rangle$ grows as $\sqrt{t}$. For $\delta>\delta_c$, a nonequilibrium stationary state exists, while for $\delta<\delta_c$ the asymptotic state presents low-density and maximal-current phases. We discuss the scaling of the density and current profiles for parallel and sequential updates.
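A rough simulation sketch of such a process with parallel updates is given here. The abstract does not spell out the microscopic rules, so the following are assumptions made for illustration only: the lattice is semi-infinite with sites 0, 1, 2, ...; a particle is injected at site 0 with probability alpha when that site is empty; bulk particles hop one site to the right whenever the next site is empty; and the rightmost ("last") particle dies with probability delta and otherwise advances. The function names are hypothetical, and the paper's actual rules may differ.

```python
import random


def step(occupied, alpha, delta, rng):
    """One parallel update of the assumed rules; `occupied` is a set of site indices."""
    if not occupied:
        return {0} if rng.random() < alpha else set()
    last = max(occupied)
    new = set()
    for site in occupied:
        if site == last:
            if rng.random() < delta:
                continue           # the last particle dies
            new.add(site + 1)      # nothing ahead of it, so it advances
        elif site + 1 in occupied:
            new.add(site)          # blocked by the particle in front (exclusion)
        else:
            new.add(site + 1)      # free to hop right
    # Injection uses the old configuration, so it never collides with a hop.
    if 0 not in occupied and rng.random() < alpha:
        new.add(0)
    return new


def mean_last_position(alpha, delta, steps, seed=0):
    """Time-averaged position of the last particle (counted as 0 when the lattice is empty)."""
    rng = random.Random(seed)
    occupied = set()
    total = 0
    for _ in range(steps):
        occupied = step(occupied, alpha, delta, rng)
        total += max(occupied) if occupied else 0
    return total / steps


if __name__ == "__main__":
    for delta in (0.1, 0.5, 0.9):
        print(delta, mean_last_position(alpha=0.5, delta=delta, steps=2000))
```

Scanning delta at fixed alpha and watching how the average last-particle position grows with the number of steps is one way to look for the crossover between a growing and a stationary regime under these assumed rules.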
