Source author record

Jemin George

Jemin George appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

math.OC; eess.SY; Machine Learning; Systems and Control; Artificial Intelligence; Distributed, Parallel, and Cluster Computing; Multiagent Systems; Computer Science and Game Theory; math.DS; math.ST; Statistics Theory

Catalog footprint

What is connected

10 works

11 topics

4 close collaborators

preprint, 2024, arXiv

Asynchronous Local Computations in Distributed Bayesian Learning

Due to the expanding scope of machine learning (ML) to the fields of sensor networking, cooperative robotics and many other multi-agent systems, distributed deployment of inference algorithms has received a lot of attention. These algorithms involve collaboratively learning unknown parameters from dispersed data collected by multiple agents. There are two competing aspects in such algorithms, namely, intra-agent computation and inter-agent communication. Traditionally, algorithms are designed to perform both synchronously. However, certain circumstances call for frugal use of communication channels, as they are either unreliable, time-consuming, or resource-expensive. In this paper, we propose gossip-based asynchronous communication to leverage fast computations and reduce communication overhead simultaneously. We analyze the effects of multiple (local) intra-agent computations by the active agents between successive inter-agent communications. For local computations, Bayesian sampling via the unadjusted Langevin algorithm (ULA) MCMC is utilized. The communication is assumed to be over a connected graph (e.g., as in decentralized learning); however, the results can be extended to coordinated communication where there is a central server (e.g., federated learning). We theoretically quantify the convergence rates in the process. To demonstrate the efficacy of the proposed algorithm, we present simulations on a toy problem as well as on real-world data sets to train ML models to perform classification tasks. We observe faster initial convergence and improved performance accuracy, especially in the low data range. We achieve on average 78% and over 90% classification accuracy, respectively, on the Gamma Telescope and mHealth data sets from the UCI ML repository.
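The local computation named in this abstract, the unadjusted Langevin algorithm, can be illustrated with a minimal single-agent sketch. This is not the paper's gossip-based algorithm: the Gaussian target, the step size, the burn-in length and the names `ula_sample` and `grad_log_gaussian` are all assumptions chosen for illustration.

```python
import math
import random


def ula_sample(grad_log_density, x0, step, n_steps, rng):
    """Run ULA: x <- x + step * grad log pi(x) + sqrt(2 * step) * N(0, 1)."""
    x = x0
    chain = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * grad_log_density(x) + math.sqrt(2.0 * step) * noise
        chain.append(x)
    return chain


# Hypothetical target (an assumption, not from the paper): N(MU, SIGMA^2).
MU, SIGMA = 2.0, 0.5


def grad_log_gaussian(x):
    """Gradient of the log-density of N(MU, SIGMA^2)."""
    return -(x - MU) / SIGMA ** 2


rng = random.Random(0)
chain = ula_sample(grad_log_gaussian, x0=0.0, step=0.01, n_steps=50000, rng=rng)
samples = chain[5000:]  # discard an assumed burn-in period
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

Because ULA omits the Metropolis correction, its stationary variance is slightly biased away from the target variance; shrinking `step` reduces the bias at the cost of slower mixing.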

preprint, 2022, arXiv

Distributed Cooperative Multi-Agent Reinforcement Learning with Directed Coordination Graph

Existing distributed cooperative multi-agent reinforcement learning (MARL) frameworks usually assume undirected coordination graphs and communication graphs while estimating a global reward via consensus algorithms for policy evaluation. Such a framework may induce expensive communication costs and exhibit poor scalability due to the requirement of global consensus. In this work, we study MARL with directed coordination graphs, and propose a distributed RL algorithm where the local policy evaluations are based on local value functions. The local value function of each agent is obtained by local communication with its neighbors through a directed learning-induced communication graph, without using any consensus algorithm. A zeroth-order optimization (ZOO) approach based on parameter perturbation is employed to achieve gradient estimation. By comparing with existing ZOO-based RL algorithms, we show that our proposed distributed RL algorithm guarantees high scalability. A distributed resource allocation example is shown to illustrate the effectiveness of our algorithm.
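Gradient estimation by parameter perturbation, as mentioned in this abstract, can be sketched generically with a two-point Gaussian-smoothing estimator. This is a standard zeroth-order estimator and not necessarily the one used in the paper; the quadratic `cost`, the perturbation `radius` and the function name `zo_gradient` are assumptions for illustration.

```python
import random


def zo_gradient(f, theta, radius, n_samples, rng):
    """Estimate grad f(theta) from function values only.

    Each sample perturbs theta along a random Gaussian direction u and uses
    the symmetric difference (f(theta + r*u) - f(theta - r*u)) / (2*r) * u.
    """
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(n_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = f([t + radius * ui for t, ui in zip(theta, u)])
        minus = f([t - radius * ui for t, ui in zip(theta, u)])
        scale = (plus - minus) / (2.0 * radius)
        for i in range(dim):
            grad[i] += scale * u[i] / n_samples
    return grad


def cost(theta):
    """Hypothetical smooth cost (an assumption): squared Euclidean norm."""
    return sum(t * t for t in theta)


rng = random.Random(1)
estimate = zo_gradient(cost, [1.0, -2.0], radius=0.05, n_samples=20000, rng=rng)
print(estimate)  # should be close to the true gradient [2.0, -4.0]
```

The estimate is noisy for any finite `n_samples`; in a reinforcement learning setting each function evaluation would correspond to a policy rollout, so the sample count is the main cost driver.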

preprint, 2021, arXiv

A Decentralized Approach to Bayesian Learning

Motivated by decentralized approaches to machine learning, we propose a collaborative Bayesian learning algorithm taking the form of decentralized Langevin dynamics in a non-convex setting. Our analysis shows that the initial KL-divergence between the Markov chain and the target posterior distribution is exponentially decreasing, while the error contributions to the overall KL-divergence from the additive noise are decreasing in polynomial time. We further show that the polynomial term experiences speed-up with the number of agents and provide sufficient conditions on the time-varying step-sizes to guarantee convergence to the desired distribution. The performance of the proposed algorithm is evaluated on a wide variety of machine learning tasks. The empirical results show that the performance of individual agents with locally available data is on par with the centralized setting with considerable improvement in the convergence rate.

preprint, 2021, arXiv

Decomposability and Parallel Computation of Multi-Agent LQR

Individual agents in a multi-agent system (MAS) may have decoupled open-loop dynamics, but a cooperative control objective usually results in coupled closed-loop dynamics, thereby making the control design computationally expensive. The computation time becomes even higher when a learning strategy such as reinforcement learning (RL) needs to be applied to deal with the situation when the agents' dynamics are not known. To resolve this problem, we propose a parallel RL scheme for a linear quadratic regulator (LQR) design in a continuous-time linear MAS. The idea is to exploit the structural properties of two graphs embedded in the $Q$ and $R$ weighting matrices in the LQR objective to define an orthogonal transformation that can convert the original LQR design to multiple decoupled smaller-sized LQR designs. We show that if the MAS is homogeneous then this decomposition retains closed-loop optimality. Conditions for decomposability, an algorithm for constructing the transformation matrix, a parallel RL algorithm, and robustness analysis when the design is applied to non-homogeneous MAS are presented. Simulations show that the proposed approach can guarantee significant speed-up in learning without any loss in the cumulative value of the LQR cost.

preprint, 2021, arXiv

Learning Distributed Stabilizing Controllers for Multi-Agent Systems

We address the problem of model-free distributed stabilization of heterogeneous multi-agent systems using reinforcement learning (RL). Two algorithms are developed. The first algorithm solves a centralized linear quadratic regulator (LQR) problem without knowing any initial stabilizing gain in advance. The second algorithm builds upon the results of the first algorithm, and extends it to distributed stabilization of multi-agent systems with predefined interaction graphs. Rigorous proofs are provided to show that the proposed algorithms achieve guaranteed convergence if specific conditions hold. A simulation example is presented to demonstrate the theoretical results.

preprint, 2020, arXiv

A Finite-Time Algorithm for the Distributed Tracking of Maneuvering Target

This paper presents a novel distributed algorithm for tracking a maneuvering target using bearing or direction of arrival measurements collected by a networked sensor array. The proposed approach is built on the dynamic average-consensus algorithm, which allows a networked group of agents (nodes) to reach consensus on the global average of a set of local time-varying signals in a distributed fashion. Since the average-consensus error corresponding to the presented dynamic average-consensus algorithm converges to zero in finite time, the proposed distributed algorithm guarantees that the tracking error converges to zero in finite time. Numerical simulations are provided to illustrate the effectiveness of the proposed algorithm.

preprint, 2020, arXiv

Hierarchical Control of Multi-Agent Systems using Online Reinforcement Learning

We propose a new reinforcement learning based approach to designing hierarchical linear quadratic regulator (LQR) controllers for heterogeneous linear multi-agent systems with unknown state-space models and separated control objectives. The separation arises from grouping the agents into multiple non-overlapping groups, and defining the control goal as two distinct objectives. The first objective aims to minimize a group-wise block-decentralized LQR function that models the group-level mission. The second objective, on the other hand, tries to minimize an LQR function between the average states (centroids) of the groups. Exploiting this separation, we redefine the weighting matrices of the LQR functions in a way that allows us to decouple their respective algebraic Riccati equations. Thereafter, we develop a reinforcement learning strategy that uses online measurements of the agent states and the average states to learn the respective controllers based on the approximate Riccati equations. Since the first controller is block-decentralized and can therefore be learned in parallel, while the second controller is reduced-dimensional due to averaging, the overall design enjoys a significantly reduced learning time compared to centralized reinforcement learning.

preprint, 2020, arXiv

High-Resolution Modeling of the Fastest First-Order Optimization Method for Strongly Convex Functions

Motivated by the fact that gradient-based optimization algorithms can be studied from the perspective of limiting ordinary differential equations (ODEs), here we derive an ODE representation of the accelerated triple momentum (TM) algorithm. For unconstrained optimization problems with strongly convex cost, the TM algorithm has a proven faster convergence rate than Nesterov's accelerated gradient (NAG) method but with the same computational complexity. We show that, as with the NAG method, accurately capturing the characteristics of the TM method requires high-resolution modeling to obtain the ODE representation of the TM algorithm. We use a Lyapunov analysis to investigate the stability and convergence behavior of the proposed high-resolution ODE representation of the TM algorithm. We show through this analysis that this ODE model is robust to deviation from the parameters of the TM algorithm. We compare the rate of the ODE representation of the TM method with that of the NAG method to confirm its faster convergence. Our study also leads to a tighter bound on the worst rate of convergence for the ODE model of the NAG method. Lastly, we discuss the use of the integral quadratic constraint (IQC) method to establish an estimate on the rate of convergence of the TM algorithm. A numerical example demonstrates our results.

preprint, 2020, arXiv

SPARQ-SGD: Event-Triggered and Compressed Communication in Decentralized Stochastic Optimization

In this paper, we propose and analyze SPARQ-SGD, which is an event-triggered and compressed algorithm for decentralized training of large-scale machine learning models. Each node can locally compute a condition (event) which triggers a communication where quantized and sparsified local model parameters are sent. In SPARQ-SGD, each node takes at least a fixed number ($H$) of local gradient steps and then checks if the model parameters have significantly changed compared to its last update; it communicates further compressed model parameters only when there is a significant change, as specified by a (design) criterion. We prove that SPARQ-SGD converges as $O(\frac{1}{nT})$ and $O(\frac{1}{\sqrt{nT}})$ in the strongly-convex and non-convex settings, respectively, demonstrating that such aggressive compression, including event-triggered communication, model sparsification and quantization, does not affect the overall convergence rate as compared to uncompressed decentralized training, thereby theoretically yielding communication efficiency for "free". We evaluate SPARQ-SGD over real datasets to demonstrate significant savings in communication over the state-of-the-art.
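The event-triggered, compressed communication step described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the SPARQ-SGD implementation: the squared-norm trigger, the top-k sparsifier, the sign-based quantizer and the names `top_k`, `sign_quantize` and `maybe_communicate` are all hypothetical choices.

```python
def top_k(vec, k):
    """Sparsify: keep the k largest-magnitude entries and zero the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def sign_quantize(vec):
    """Quantize: replace each nonzero entry by +/- the mean nonzero magnitude."""
    magnitudes = [abs(v) for v in vec if v != 0.0]
    if not magnitudes:
        return list(vec)
    scale = sum(magnitudes) / len(magnitudes)
    return [0.0 if v == 0.0 else (scale if v > 0 else -scale) for v in vec]


def maybe_communicate(params, last_sent, threshold, k):
    """Return a compressed update if the trigger fires, otherwise None.

    Assumed trigger: squared distance between the current parameters and the
    last transmitted copy exceeds `threshold`. In the scheme described above,
    this check would run only after the node's local gradient steps.
    """
    diff = [p - s for p, s in zip(params, last_sent)]
    if sum(d * d for d in diff) <= threshold:
        return None  # change too small: stay silent and save bandwidth
    return sign_quantize(top_k(diff, k))


last_sent = [0.0, 0.0, 0.0, 0.0]
message = maybe_communicate([1.0, 0.1, -3.0, 0.2], last_sent, threshold=0.5, k=2)
quiet = maybe_communicate([0.1, 0.1, 0.0, 0.0], last_sent, threshold=0.5, k=2)
print(message, quiet)
```

In the first call the change is large, so the two dominant coordinates are kept and quantized to a common magnitude; in the second call the trigger does not fire and nothing is sent.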

preprint, 2016, arXiv

Strategic Seeding of Rival Opinions

We present a network influence game that models players strategically seeding the opinions of nodes embedded in a social network. A social learning dynamic, whereby nodes repeatedly update their opinions to resemble those of their neighbors, spreads the seeded opinions through the network. After a fixed period of time, the dynamic halts and each player's utility is determined by the relative strength of the opinions held by each node in the network vis-a-vis the other players. We show that the existence of a pure Nash equilibrium cannot be guaranteed in general. However, if the dynamics are allowed to progress for a sufficient amount of time so that a consensus among all of the nodes is obtained, then the existence of a pure Nash equilibrium can be guaranteed. Finding a pure-strategy best response is shown to be NP-complete, but the best response can be efficiently approximated to within a (1 - 1/e) factor of optimal by a simple greedy algorithm.
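The greedy approximation mentioned in this abstract follows the usual pattern of repeatedly adding the seed with the largest marginal gain. The sketch below shows that pattern on a stand-in coverage objective; the `REACH` table, the `coverage` function and the name `greedy_select` are assumptions for illustration and do not reproduce the paper's utility function or opinion dynamics.

```python
def greedy_select(candidates, budget, value):
    """Pick up to `budget` items, each time adding the best marginal gain."""
    chosen = []
    for _ in range(budget):
        best, best_gain = None, 0.0
        base = value(chosen)
        for c in candidates:
            if c in chosen:
                continue
            gain = value(chosen + [c]) - base
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:  # no candidate improves the objective
            break
        chosen.append(best)
    return chosen


# Hypothetical data (an assumption): nodes each candidate seed would reach.
REACH = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}


def coverage(seeds):
    """Stand-in monotone submodular objective: number of distinct nodes reached."""
    covered = set()
    for s in seeds:
        covered |= REACH[s]
    return len(covered)


seeds = greedy_select(list(REACH), budget=2, value=coverage)
print(seeds, coverage(seeds))  # ['c', 'a'] 7
```

For monotone submodular objectives under a cardinality budget, this greedy rule is the classical way to obtain a (1 - 1/e) guarantee, which is why a coverage function is used here as the stand-in.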
