Source author record

Ali Jadbabaie

Ali Jadbabaie appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

79 works

27 topics

4 close collaborators

preprint · 2025 · arXiv

Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs

Residual connections and normalization layers have become standard design choices for graph neural networks (GNNs), and were proposed as solutions to mitigate the oversmoothing problem in GNNs. However, how exactly these methods help alleviate the oversmoothing problem from a theoretical perspective is not well understood. In this work, we provide a formal and precise characterization of (linearized) GNNs with residual connections and normalization layers. We establish that (a) for residual connections, the incorporation of the initial features at each layer can prevent the signal from becoming too smooth, and determines the subspace of possible node representations; (b) batch normalization prevents a complete collapse of the output embedding space to a one-dimensional subspace through the individual rescaling of each column of the feature matrix. This results in the convergence of node representations to the top-$k$ eigenspace of the message-passing operator; (c) moreover, we show that the centering step of a normalization layer, which can be understood as a projection, alters the graph signal in message-passing in such a way that relevant information can become harder to extract. We therefore introduce a novel, principled normalization layer called GraphNormv2 in which the centering step is learned such that it does not distort the original graph signal in an undesirable way. Experimental results confirm the effectiveness of our method.

preprint · 2024 · arXiv

Federated Optimization of Smooth Loss Functions

In this work, we study empirical risk minimization (ERM) within a federated learning framework, where a central server minimizes an ERM objective function using training data that is stored across $m$ clients. In this setting, the Federated Averaging (FedAve) algorithm is the staple for determining $ε$-approximate solutions to the ERM problem. Similar to standard optimization algorithms, the convergence analysis of FedAve only relies on smoothness of the loss function in the optimization parameter. However, loss functions are often very smooth in the training data too. To exploit this additional smoothness, we propose the Federated Low Rank Gradient Descent (FedLRGD) algorithm. Since smoothness in data induces an approximate low rank structure on the loss function, our method first performs a few rounds of communication between the server and clients to learn weights that the server can use to approximate clients' gradients. Then, our method solves the ERM problem at the server using inexact gradient descent. To show that FedLRGD can have superior performance to FedAve, we present a notion of federated oracle complexity as a counterpart to canonical oracle complexity. Under some assumptions on the loss function, e.g., strong convexity in parameter, $η$-Hölder smoothness in data, etc., we prove that the federated oracle complexity of FedLRGD scales like $ϕm(p/ε)^{Θ(d/η)}$ and that of FedAve scales like $ϕm(p/ε)^{3/4}$ (neglecting sub-dominant factors), where $ϕ\gg 1$ is a "communication-to-computation ratio," $p$ is the parameter dimension, and $d$ is the data dimension. Then, we show that when $d$ is small and the loss function is sufficiently smooth in the data, FedLRGD beats FedAve in federated oracle complexity. Finally, in the course of analyzing FedLRGD, we also establish a result on low rank approximation of latent variable models.

preprint · 2022 · arXiv

An Optimal Transport Approach to Personalized Federated Learning

Federated learning is a distributed machine learning paradigm, which aims to train a model using the local data of many distributed clients. A key challenge in federated learning is that the data samples across the clients may not be identically distributed. To address this challenge, personalized federated learning with the goal of tailoring the learned model to the data distribution of every individual client has been proposed. In this paper, we focus on this problem and propose a novel personalized Federated Learning scheme based on Optimal Transport (FedOT) as a learning algorithm that learns the optimal transport maps for transferring data points to a common distribution as well as the prediction model under the applied transport map. To formulate the FedOT problem, we extend the standard optimal transport task between two probability distributions to multi-marginal optimal transport problems with the goal of transporting samples from multiple distributions to a common probability domain. We then leverage the results on multi-marginal optimal transport problems to formulate FedOT as a min-max optimization problem and analyze its generalization and optimization properties. We discuss the results of several numerical experiments to evaluate the performance of FedOT under heterogeneous data distributions in federated learning problems.

preprint · 2022 · arXiv

Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity

We study oracle complexity of gradient based methods for stochastic approximation problems. Though in many settings optimal algorithms and tight lower bounds are known for such problems, these optimal algorithms do not achieve the best performance when used in practice. We address this theory-practice gap by focusing on instance-dependent complexity instead of worst case complexity. In particular, we first summarize known instance-dependent complexity results and categorize them into three levels. We identify the domination relation between different levels and propose a fourth instance-dependent bound that dominates existing ones. We then provide a sufficient condition according to which an adaptive algorithm with moment estimation can achieve the proposed bound without knowledge of noise levels. Our proposed algorithm and its analysis provide a theoretical justification for the success of moment estimation as it achieves improved instance complexity.

preprint · 2022 · arXiv

Byzantine-Robust Federated Linear Bandits

In this paper, we study a linear bandit optimization problem in a federated setting where a large collection of distributed agents collaboratively learn a common linear bandit model. Standard federated learning algorithms applied to this setting are vulnerable to Byzantine attacks on even a small fraction of agents. We propose a novel algorithm with a robust aggregation oracle that utilizes the geometric median. We prove that our proposed algorithm is robust to Byzantine attacks on fewer than half of agents and achieves a sublinear $\tilde{\mathcal{O}}({T^{3/4}})$ regret with $\mathcal{O}(\sqrt{T})$ steps of communication in $T$ steps. Moreover, we make our algorithm differentially private via a tree-based mechanism. Finally, if the level of corruption is known to be small, we show that using the geometric median of mean oracle for robust aggregation further improves the regret bound.
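
The robust aggregation idea in this abstract rests on the geometric median. As a minimal sketch only (not the paper's aggregation oracle), the following approximates the geometric median with Weiszfeld's iteration, a standard fixed-point method. The function name, the iteration count, and the sample reports are illustrative assumptions.

```python
import math


def geometric_median(points, iterations=200, eps=1e-12):
    """Approximate the geometric median of equal-length vectors.

    Uses Weiszfeld's fixed-point iteration. `iterations` and `eps` are
    illustrative defaults, not values taken from the paper.
    """
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    estimate = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    for _ in range(iterations):
        # Weight each point by the inverse of its distance to the estimate;
        # eps guards against division by zero if the estimate hits a point.
        weights = [1.0 / max(math.dist(estimate, p), eps) for p in points]
        total = sum(weights)
        estimate = [
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)
        ]
    return estimate


# Hypothetical data: four honest agents report similar vectors,
# one corrupted agent reports an arbitrary value.
reports = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.05), (1000.0, 1000.0)]
robust = geometric_median(reports)
mean = [sum(r[k] for r in reports) / len(reports) for k in range(2)]
```

In this toy example the plain mean is dragged far from the honest cluster by the single corrupted report, while the geometric median stays close to it, which is the property a robust aggregator needs.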

preprint · 2022 · arXiv

Current Implicit Policies May Not Eradicate COVID-19

Successful predictive modeling of epidemics requires an understanding of the implicit feedback control strategies which are implemented by populations to modulate the spread of contagion. While this task of capturing endogenous behavior can be achieved through intricate modeling assumptions, we find that a population's reaction to case counts can be described through a second order affine dynamical system with linear control which fits well to the data across different regions and times throughout the COVID-19 pandemic. The model fits the data well both in and out of sample across the 50 states of the United States, with comparable $R^2$ scores to state of the art ensemble predictions. In contrast to recent models of epidemics, rather than assuming that individuals directly control the contact rate which governs the spread of disease, we assume that individuals control the rate at which they vary their number of interactions, i.e. they control the derivative of the contact rate. We propose an implicit feedback law for this control input and verify that it correlates with policies taken throughout the pandemic. A key takeaway of the dynamical model is that the "stable" point of case counts is non-zero, i.e. COVID-19 will not be eradicated under the current collection of policies and strategies, and additional policies are needed to fully eradicate it quickly. Hence, we suggest alternative implicit policies which focus on making interventions (such as vaccinations and mobility restrictions) a function of cumulative case counts, for which our results suggest a better possibility of eradicating COVID-19.

preprint · 2022 · arXiv

Gradient Descent for Low-Rank Functions

Several recent empirical studies demonstrate that important machine learning tasks, e.g., training deep neural networks, exhibit low-rank structure, where the loss function varies significantly in only a few directions of the input space. In this paper, we leverage such low-rank structure to reduce the high computational cost of canonical gradient-based methods such as gradient descent (GD). Our proposed Low-Rank Gradient Descent (LRGD) algorithm finds an $ε$-approximate stationary point of a $p$-dimensional function by first identifying $r \leq p$ significant directions, and then estimating the true $p$-dimensional gradient at every iteration by computing directional derivatives only along those $r$ directions. We establish that the "directional oracle complexities" of LRGD for strongly convex and non-convex objective functions are $\mathcal{O}(r \log(1/ε) + rp)$ and $\mathcal{O}(r/ε^2 + rp)$, respectively. When $r \ll p$, these complexities are smaller than the known complexities of $\mathcal{O}(p \log(1/ε))$ and $\mathcal{O}(p/ε^2)$ of GD in the strongly convex and non-convex settings, respectively. Thus, LRGD significantly reduces the computational cost of gradient-based methods for sufficiently low-rank functions. In the course of our analysis, we also formally define and characterize the classes of exact and approximately low-rank functions.
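
The two-phase structure described in this abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' procedure: the significant directions are obtained here by orthonormalizing $r$ full gradients taken at random points (costing $rp$ directional derivatives), and derivatives are approximated by central finite differences. All function names, the step size, and the test function are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors, tol=1e-10):
    """Gram-Schmidt; drops vectors that are (numerically) dependent."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def directional_derivative(f, x, u, h=1e-6):
    """Central finite-difference derivative of f at x along direction u."""
    forward = [xi + h * ui for xi, ui in zip(x, u)]
    backward = [xi - h * ui for xi, ui in zip(x, u)]
    return (f(forward) - f(backward)) / (2 * h)


def full_gradient(f, x):
    """p directional derivatives along the coordinate axes."""
    p = len(x)
    axes = [[1.0 if j == k else 0.0 for j in range(p)] for k in range(p)]
    return [directional_derivative(f, x, e) for e in axes]


def low_rank_gradient_descent(f, x0, rank, step, iterations, seed=0):
    rng = random.Random(seed)
    p = len(x0)
    # Phase 1 (assumed here): r full gradients at random points, r*p queries.
    samples = [
        full_gradient(f, [rng.gauss(0.0, 1.0) for _ in range(p)])
        for _ in range(rank)
    ]
    basis = orthonormalize(samples)
    # Phase 2: each iteration uses only len(basis) <= r directional derivatives.
    x = list(x0)
    for _ in range(iterations):
        coeffs = [directional_derivative(f, x, u) for u in basis]
        grad = [sum(c * u[k] for c, u in zip(coeffs, basis)) for k in range(p)]
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x


def toy_loss(x):
    """A rank-2 function of 6 variables: depends only on x0 + x1 and x2 - x3."""
    return (x[0] + x[1] - 1.0) ** 2 + (x[2] - x[3]) ** 2


solution = low_rank_gradient_descent(toy_loss, [0.0] * 6, rank=2, step=0.1, iterations=200)
```

On this exactly low-rank toy function, every gradient lies in a two-dimensional subspace, so two directional derivatives per iteration recover the full six-dimensional gradient.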

preprint · 2022 · arXiv

Inference in Opinion Dynamics under Social Pressure

We introduce a new opinion dynamics model where a group of agents holds two kinds of opinions: inherent and declared. Each agent's inherent opinion is fixed and unobservable by the other agents. At each time step, agents broadcast their declared opinions on a social network, which are governed by the agents' inherent opinions and social pressure. In particular, we assume that agents may declare opinions that are not aligned with their inherent opinions to conform with their neighbors. This raises the natural question: Can we estimate the agents' inherent opinions from observations of declared opinions? For example, agents' inherent opinions may represent their true political alliances (Democrat or Republican), while their declared opinions may model the political inclinations of tweets on social media. In this context, we may seek to predict the election results by observing voters' tweets, which do not necessarily reflect their political support due to social pressure. We analyze this question in the special case where the underlying social network is a complete graph. We prove that, as long as the population does not include large majorities, estimation of aggregate and individual inherent opinions is possible. On the other hand, large majorities force minorities to lie over time, which makes asymptotic estimation impossible.

preprint · 2022 · arXiv

Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective

This work examines the deep disconnect between existing theoretical analyses of gradient-based algorithms and the practice of training deep neural networks. Specifically, we provide numerical evidence that in large-scale neural network training (e.g., ImageNet + ResNet101, and WT103 + TransformerXL models), the neural network's weights do not converge to stationary points where the gradient of the loss is zero. Remarkably, however, we observe that even though the weights do not converge to stationary points, the progress in minimizing the loss function halts and training loss stabilizes. Inspired by this observation, we propose a new perspective based on ergodic theory of dynamical systems to explain it. Rather than studying the evolution of weights, we study the evolution of the distribution of weights. We prove convergence of the distribution of weights to an approximate invariant measure, thereby explaining how the training loss can stabilize without weights necessarily converging to stationary points. We further discuss how this perspective can better align optimization theory with empirical observations in machine learning practice.

preprint · 2022 · arXiv

On Convergence of Gradient Descent Ascent: A Tight Local Analysis

Gradient Descent Ascent (GDA) methods are the mainstream algorithms for minimax optimization in generative adversarial networks (GANs). Convergence properties of GDA have drawn significant interest in the recent literature. Specifically, for $\min_{\mathbf{x}} \max_{\mathbf{y}} f(\mathbf{x};\mathbf{y})$ where $f$ is strongly-concave in $\mathbf{y}$ and possibly nonconvex in $\mathbf{x}$, Lin et al. (2020) proved the convergence of GDA with a stepsize ratio $η_{\mathbf{y}}/η_{\mathbf{x}}=Θ(κ^2)$ where $η_{\mathbf{x}}$ and $η_{\mathbf{y}}$ are the stepsizes for $\mathbf{x}$ and $\mathbf{y}$ and $κ$ is the condition number for $\mathbf{y}$. While this stepsize ratio suggests a slow training of the min player, practical GAN algorithms typically adopt similar stepsizes for both variables, indicating a wide gap between theoretical and empirical results. In this paper, we aim to bridge this gap by analyzing the local convergence of general nonconvex-nonconcave minimax problems. We demonstrate that a stepsize ratio of $Θ(κ)$ is necessary and sufficient for local convergence of GDA to a Stackelberg Equilibrium, where $κ$ is the local condition number for $\mathbf{y}$. We prove a nearly tight convergence rate with a matching lower bound. We further extend the convergence guarantees to stochastic GDA and extra-gradient methods (EG). Finally, we conduct several numerical experiments to support our theoretical findings.
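
For readers unfamiliar with GDA and its two stepsizes, here is a minimal sketch of simultaneous gradient descent ascent on a toy objective. The objective $f(x, y) = xy - y^2/2$, the stepsizes, and the function names are illustrative assumptions; nothing here reproduces the paper's analysis or experiments.

```python
def gradient_descent_ascent(grad_x, grad_y, x, y, step_x, step_y, iterations):
    """Simultaneous GDA: descend in x, ascend in y, with separate stepsizes."""
    for _ in range(iterations):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - step_x * gx, y + step_y * gy
    return x, y


# Toy objective f(x, y) = x*y - 0.5*y**2, strongly concave in y.
# Its partial derivatives are df/dx = y and df/dy = x - y, and the
# min-max solution is (0, 0).
x_final, y_final = gradient_descent_ascent(
    grad_x=lambda x, y: y,
    grad_y=lambda x, y: x - y,
    x=1.0,
    y=-1.0,
    step_x=0.05,  # stepsize of the min player (assumed value)
    step_y=0.2,   # stepsize of the max player (assumed value)
    iterations=500,
)
```

The ratio `step_y / step_x` is the quantity the abstract refers to as the stepsize ratio; here it is simply fixed at 4 for illustration.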

preprint · 2022 · arXiv

Unifying Epidemic Models with Mixtures

The COVID-19 pandemic has emphasized the need for a robust understanding of epidemic models. Current models of epidemics are classified as either mechanistic or non-mechanistic: mechanistic models make explicit assumptions on the dynamics of disease, whereas non-mechanistic models make assumptions on the form of observed time series. Here, we introduce a simple mixture-based model which bridges the two approaches while retaining benefits of both. The model represents time series of cases and fatalities as a mixture of Gaussian curves, providing a flexible function class to learn from data compared to traditional mechanistic models. Although the model is non-mechanistic, we show that it arises as the natural outcome of a stochastic process based on a networked SIR framework. This allows learned parameters to take on a more meaningful interpretation compared to similar non-mechanistic models, and we validate the interpretations using auxiliary mobility data collected during the COVID-19 pandemic. We provide a simple learning algorithm to identify model parameters and establish theoretical results which show the model can be efficiently learned from data. Empirically, we find the model to have low prediction error. The model is available live at covidpredictions.mit.edu. Ultimately, this allows us to systematically understand the impacts of interventions on COVID-19, which is critical in developing data-driven solutions to controlling epidemics.

preprint · 2021 · arXiv

Network Group Testing

We consider the problem of identifying infected individuals in a population of size N. We introduce a group testing approach that uses significantly fewer than N tests when infection prevalence is low. The most common approach to group testing, Dorfman testing, groups individuals randomly. However, as communicable diseases spread from individual to individual through underlying social networks, our approach utilizes network information to improve performance. Network grouping, which groups individuals by community, weakly dominates Dorfman testing in terms of the expected number of tests used. Network grouping's outperformance is determined by the strength of community structure in the network. When networks have strong community structure, network grouping achieves the lower bound for two-stage testing procedures. As an empirical example, we consider the scenario of a university testing its population for COVID-19. Using social network data from a Danish university, we demonstrate network grouping requires significantly fewer tests than Dorfman. In contrast to many proposed group testing approaches, network grouping is simple for practitioners to implement. In practice, individuals can be grouped by family unit, social group, or work group.
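
To make the Dorfman baseline concrete, the sketch below computes the expected number of tests per person for two-stage pooled testing. It assumes independent infections and error-free tests, which is the classical textbook setting rather than the network-correlated setting the paper studies; the function names and the search range are illustrative.

```python
def dorfman_expected_tests_per_person(prevalence, group_size):
    """Expected tests per individual under two-stage Dorfman testing.

    Assumes each individual is infected independently with probability
    `prevalence` and that tests are perfectly accurate.
    """
    if group_size == 1:
        return 1.0  # no pooling: one test per person
    p_pool_negative = (1.0 - prevalence) ** group_size
    # One pooled test shared by the group, plus one retest per member
    # whenever the pool comes back positive.
    return 1.0 / group_size + (1.0 - p_pool_negative)


def best_group_size(prevalence, max_size=100):
    """Group size in 1..max_size minimizing expected tests per person."""
    return min(
        range(1, max_size + 1),
        key=lambda g: dorfman_expected_tests_per_person(prevalence, g),
    )


group = best_group_size(0.01)
cost = dorfman_expected_tests_per_person(0.01, group)
```

At low prevalence the expected cost per person falls well below one test, which is the sense in which group testing uses fewer than N tests.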

preprint · 2021 · arXiv

Time varying regression with hidden linear dynamics

We revisit a model for time-varying linear regression that assumes the unknown parameters evolve according to a linear dynamical system. Counterintuitively, we show that when the underlying dynamics are stable the parameters of this model can be estimated from data by combining just two ordinary least squares estimates. We offer a finite sample guarantee on the estimation error of our method and discuss certain advantages it has over Expectation-Maximization (EM), which is the main approach proposed by prior work.

preprint · 2020 · arXiv

A Distributed Cubic-Regularized Newton Method for Smooth Convex Optimization over Networks

We propose a distributed, cubic-regularized Newton method for large-scale convex optimization over networks. The proposed method requires only local computations and communications and is suitable for federated learning applications over arbitrary network topologies. We show an $O(k^{-3})$ convergence rate when the cost function is convex with Lipschitz gradient and Hessian, with $k$ being the number of iterations. We further provide network-dependent bounds for the communication required in each step of the algorithm. We provide numerical experiments that validate our theoretical results.

preprint · 2020 · arXiv

A Separation Theorem for Joint Sensor and Actuator Scheduling with Guaranteed Performance Bounds

We study the problem of jointly designing a sparse sensor and actuator schedule for linear dynamical systems while guaranteeing a control/estimation performance that approximates the fully sensed/actuated setting. We further prove a separation principle, showing that the problem can be decomposed into finding sensor and actuator schedules separately. However, it is shown that this problem cannot be efficiently solved or approximated in polynomial, or even quasi-polynomial time for time-invariant sensor/actuator schedules; instead, we develop deterministic polynomial-time algorithms for a time-varying sensor/actuator schedule with guaranteed approximation bounds. Our main result is to provide a polynomial-time joint actuator and sensor schedule that on average selects only a constant number of sensors and actuators at each time step, irrespective of the dimension of the system. The key idea is to sparsify the controllability and observability Gramians while providing approximation guarantees for Hankel singular values. This idea is inspired by recent results in theoretical computer science literature on sparsification.

preprint · 2020 · arXiv

Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions

We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions. In particular, we study the class of Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions for which the chain rule of calculus holds. This class contains examples such as ReLU neural networks and others with non-differentiable activation functions. We first show that finding an $ε$-stationary point with first-order methods is impossible in finite time. We then introduce the notion of $(δ, ε)$-stationarity, which allows for an $ε$-approximate gradient to be the convex combination of generalized gradients evaluated at points within distance $δ$ to the solution. We propose a series of randomized first-order methods and analyze their complexity of finding a $(δ, ε)$-stationary point. Furthermore, we provide a lower bound and show that our stochastic algorithm has min-max optimal dependence on $δ$. Empirically, our methods perform well for training ReLU neural networks.

preprint · 2020 · arXiv

Deterministic and Randomized Actuator Scheduling With Guaranteed Performance Bounds

In this paper, we investigate the problem of actuator selection for linear dynamical systems. We develop a framework to design a sparse actuator schedule for a given large-scale linear system with guaranteed performance bounds using deterministic polynomial-time and randomized approximately linear-time algorithms. First, we introduce systemic controllability metrics for linear dynamical systems that are monotone and homogeneous with respect to the controllability Gramian. We show that several popular and widely used optimization criteria in the literature belong to this class of controllability metrics. Our main result is to provide a polynomial-time actuator schedule that on average selects only a constant number of actuators at each time step, independent of the dimension, to furnish a guaranteed approximation of the controllability metrics in comparison to when all actuators are in use. Our results naturally apply to the dual problem of sensor selection, in which we provide a guaranteed approximation to the observability Gramian. We illustrate the effectiveness of our theoretical findings via several numerical simulations using benchmark examples.

preprint · 2020 · arXiv

Estimation of Skill Distributions

In this paper, we study the problem of learning the skill distribution of a population of agents from observations of pairwise games in a tournament. These games are played among randomly drawn agents from the population. The agents in our model can be individuals, sports teams, or Wall Street fund managers. Formally, we postulate that the likelihoods of game outcomes are governed by the Bradley-Terry-Luce (or multinomial logit) model, where the probability of an agent beating another is the ratio between its skill level and the pairwise sum of skill levels, and the skill parameters are drawn from an unknown skill density of interest. The problem is, in essence, to learn a distribution from noisy, quantized observations. We propose a simple and tractable algorithm that learns the skill density with near-optimal minimax mean squared error scaling as $n^{-1+\varepsilon}$, for any $\varepsilon>0$, when the density is smooth. Our approach brings together prior work on learning skill parameters from pairwise comparisons with kernel density estimation from non-parametric statistics. Furthermore, we prove minimax lower bounds which establish minimax optimality of the skill parameter estimation technique used in our algorithm. These bounds utilize a continuum version of Fano's method along with a covering argument. We apply our algorithm to various soccer leagues and world cups, cricket world cups, and mutual funds. We find that the entropy of a learnt distribution provides a quantitative measure of skill, which provides rigorous explanations for popular beliefs about perceived qualities of sporting events, e.g., soccer league rankings. Finally, we apply our method to assess the skill distributions of mutual funds. Our results shed light on the abundance of low quality funds prior to the Great Recession of 2008, and the domination of the industry by more skilled funds after the financial crisis.
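
The Bradley-Terry-Luce outcome model stated in this abstract, where the probability that one agent beats another is the ratio of its skill to the sum of both skills, can be written down directly. The simulation helper, the seed, and the skill values below are illustrative assumptions and are not part of the paper's estimator.

```python
import random


def btl_win_probability(skill_i, skill_j):
    """Probability that agent i beats agent j under the BTL model."""
    return skill_i / (skill_i + skill_j)


def simulate_game(skill_i, skill_j, rng):
    """Return True if agent i beats agent j in one simulated game."""
    return rng.random() < btl_win_probability(skill_i, skill_j)


# Hypothetical example: an agent with skill 2.0 plays an agent with skill 1.0.
rng = random.Random(0)
num_games = 10_000
wins = sum(simulate_game(2.0, 1.0, rng) for _ in range(num_games))
empirical_rate = wins / num_games
```

The observed win rate is a noisy, binary-valued view of the underlying skills, which is why the abstract describes the task as learning a distribution from noisy, quantized observations.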

preprint · 2020 · arXiv

FedPAQ: A Communication-Efficient Federated Learning Method with Periodic Averaging and Quantization

Federated learning is a distributed framework according to which a model is trained over a set of devices, while keeping data localized. This framework faces several systems-oriented challenges which include (i) communication bottleneck since a large number of devices upload their local updates to a parameter server, and (ii) scalability as the federated network consists of millions of devices. Due to these systems challenges as well as issues related to statistical heterogeneity of data and privacy concerns, designing a provably efficient federated learning method is of significant importance yet it remains challenging. In this paper, we present FedPAQ, a communication-efficient Federated Learning method with Periodic Averaging and Quantization. FedPAQ relies on three key features: (1) periodic averaging where models are updated locally at devices and only periodically averaged at the server; (2) partial device participation where only a fraction of devices participate in each round of the training; and (3) quantized message-passing where the edge nodes quantize their updates before uploading to the parameter server. These features address the communications and scalability challenges in federated learning. We also show that FedPAQ achieves near-optimal theoretical guarantees for strongly convex and non-convex loss functions and empirically demonstrate the communication-computation tradeoff provided by our method.
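
The three features listed in this abstract (periodic averaging, partial participation, quantized uploads) can be sketched in a few lines. This is a toy illustration under strong assumptions, not the FedPAQ implementation: each device holds a quadratic loss $\frac{1}{2}\|w - c_i\|^2$, local updates use exact gradients, and the quantizer is a generic unbiased stochastic quantizer chosen for illustration. All names and parameter values are hypothetical.

```python
import math
import random


def stochastic_quantize(vector, levels, rng):
    """Unbiased stochastic quantizer onto `levels` magnitude levels per coordinate."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return [0.0] * len(vector)
    quantized = []
    for v in vector:
        scaled = abs(v) / norm * levels
        lower = math.floor(scaled)
        # Round up with probability equal to the fractional part.
        level = lower + (1 if rng.random() < scaled - lower else 0)
        quantized.append(math.copysign(norm * level / levels, v))
    return quantized


def fedpaq_round(global_model, centers, num_participants, local_steps, step, levels, rng):
    """One communication round on toy losses 0.5 * ||w - center||^2."""
    # Partial participation: only a sampled subset of devices takes part.
    participants = rng.sample(centers, num_participants)
    aggregate = [0.0] * len(global_model)
    for center in participants:
        # Periodic averaging: several local steps before any communication.
        local = list(global_model)
        for _ in range(local_steps):
            local = [w - step * (w - c) for w, c in zip(local, center)]
        # Quantized upload: the device sends a quantized model difference.
        update = [l - g for l, g in zip(local, global_model)]
        for k, q in enumerate(stochastic_quantize(update, levels, rng)):
            aggregate[k] += q / num_participants
    return [g + a for g, a in zip(global_model, aggregate)]


rng = random.Random(0)
centers = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
model = [10.0, 10.0]
for _ in range(60):
    model = fedpaq_round(model, centers, num_participants=4,
                         local_steps=5, step=0.1, levels=16, rng=rng)
```

With all four devices participating, the global model settles near the average of the device optima, up to a small residual caused by quantization noise; lowering `levels` or `num_participants` trades accuracy for communication.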

preprint · 2020 · arXiv

GAT-GMM: Generative Adversarial Training for Gaussian Mixture Models

Generative adversarial networks (GANs) learn the distribution of observed samples through a zero-sum game between two machine players, a generator and a discriminator. While GANs achieve great success in learning the complex distribution of image, sound, and text data, they perform suboptimally in learning multi-modal distribution-learning benchmarks including Gaussian mixture models (GMMs). In this paper, we propose Generative Adversarial Training for Gaussian Mixture Models (GAT-GMM), a minimax GAN framework for learning GMMs. Motivated by optimal transport theory, we design the zero-sum game in GAT-GMM using a random linear generator and a softmax-based quadratic discriminator architecture, which leads to a non-convex concave minimax optimization problem. We show that a Gradient Descent Ascent (GDA) method converges to an approximate stationary minimax point of the GAT-GMM optimization problem. In the benchmark case of a mixture of two symmetric, well-separated Gaussians, we further show this stationary point recovers the true parameters of the underlying GMM. We numerically support our theoretical findings by performing several experiments, which demonstrate that GAT-GMM can perform as well as the expectation-maximization algorithm in learning mixtures of two Gaussians.

preprint · 2020 · arXiv

LQG Control and Sensing Co-Design

We investigate a Linear-Quadratic-Gaussian (LQG) control and sensing co-design problem, where one jointly designs sensing and control policies. We focus on the realistic case where the sensing design is selected among a finite set of available sensors, where each sensor is associated with a different cost (e.g., power consumption). We consider two dual problem instances: sensing-constrained LQG control, where one maximizes control performance subject to a sensor cost budget, and minimum-sensing LQG control, where one minimizes sensor cost subject to performance constraints. We prove no polynomial time algorithm guarantees across all problem instances a constant approximation factor from the optimal. Nonetheless, we present the first polynomial time algorithms with per-instance suboptimality guarantees. To this end, we leverage a separation principle, that partially decouples the design of sensing and control. Then, we frame LQG co-design as the optimization of approximately supermodular set functions; we develop novel algorithms to solve the problems; and we prove original results on the performance of the algorithms, and establish connections between their suboptimality and control-theoretic quantities. We conclude the paper by discussing two applications, namely, sensing-constrained formation control and resource-constrained robot navigation.

preprint · 2020 · arXiv

Network Inference from Consensus Dynamics with Unknown Parameters

We explore the problem of inferring the graph Laplacian of a weighted, undirected network from snapshots of a single or multiple discrete-time consensus dynamics, subject to parameter uncertainty, taking place on the network. Specifically, we consider three problems in which we assume different levels of knowledge about the diffusion rates, observation times, and the input signal power of the dynamics. To solve these underdetermined problems, we propose a set of algorithms that leverage the spectral properties of the observed data and tools from convex optimization. Furthermore, we provide theoretical performance guarantees associated with these algorithms. We complement our theoretical work with numerical experiments, that demonstrate how our proposed methods outperform current state-of-the-art algorithms and showcase their effectiveness in recovering both synthetic and real-world networks.

preprint · 2020 · arXiv

Robust Federated Learning: The Case of Affine Distribution Shifts

Federated learning is a distributed paradigm that aims at training models using samples distributed across multiple users in a network while keeping the samples on users' devices with the aim of efficiency and protecting users' privacy. In such settings, the training data is often statistically heterogeneous and manifests various distribution shifts across users, which degrades the performance of the learnt model. The primary goal of this paper is to develop a robust federated learning algorithm that achieves satisfactory performance against distribution shifts in users' samples. To achieve this goal, we first consider a structured affine distribution shift in users' data that captures the device-dependent data heterogeneity in federated settings. This perturbation model is applicable to various federated learning problems such as image classification where the images undergo device-dependent imperfections, e.g. different intensity, contrast, and brightness. To address affine distribution shifts across users, we propose a Federated Learning framework Robust to Affine distribution shifts (FLRA) that is provably robust against affine Wasserstein shifts to the distribution of observed samples. To solve FLRA's distributed minimax problem, we propose a fast and efficient optimization method and provide convergence guarantees via a Gradient Descent Ascent (GDA) method. We further prove generalization error bounds for the learnt classifier to show proper generalization from the empirical distribution of samples to the true underlying distribution. We perform several numerical experiments to empirically support FLRA. We show that an affine distribution shift indeed suffices to significantly decrease the performance of the learnt classifier in a new test user, and our proposed algorithm achieves a significant gain in comparison to standard federated learning and adversarial training methods.

preprint2020arXiv

Sensing-Constrained LQG Control

Linear-Quadratic-Gaussian (LQG) control is concerned with the design of an optimal controller and estimator for linear Gaussian systems with imperfect state information. Standard LQG assumes the set of sensor measurements, to be fed to the estimator, to be given. However, in many problems, arising in networked systems and robotics, one may not be able to use all the available sensors, due to power or payload constraints, or may be interested in using the smallest subset of sensors that guarantees the attainment of a desired control goal. In this paper, we introduce the sensing-constrained LQG control problem, in which one has to jointly design sensing, estimation, and control, under given constraints on the resources spent for sensing. We focus on the realistic case in which the sensing strategy has to be selected among a finite set of possible sensing modalities. While the computation of the optimal sensing strategy is intractable, we present the first scalable algorithm that computes a near-optimal sensing strategy with provable sub-optimality guarantees. To this end, we show that a separation principle holds, which allows the design of sensing, estimation, and control policies in isolation. We conclude the paper by discussing two applications of sensing-constrained LQG control, namely, sensing-constrained formation control and resource-constrained robot navigation.

preprint2020arXiv

Sensor Placement for Optimal Kalman Filtering: Fundamental Limits, Submodularity, and Algorithms

In this paper, we focus on sensor placement in linear dynamic estimation, where the objective is to place a small number of sensors in a system of interdependent states so as to design an estimator with a desired estimation performance. In particular, we consider a linear time-variant system that is corrupted with process and measurement noise, and study how the selection of its sensors affects the estimation error of the corresponding Kalman filter over a finite observation interval. Our contributions are threefold: First, we prove that the minimum mean square error of the Kalman filter decreases only linearly as the number of sensors increases. That is, adding extra sensors so as to reduce this estimation error is ineffective, a fundamental design limit. Similarly, we prove that the number of sensors grows linearly with the system's size for fixed minimum mean square error and number of output measurements over an observation interval; this is another fundamental limit, especially for systems where the system's size is large. Second, we prove that the logdet of the error covariance of the Kalman filter, which captures the volume of the corresponding confidence ellipsoid, with respect to the system's initial condition and process noise is a supermodular and non-increasing set function in the choice of the sensor set. Therefore, it exhibits the diminishing returns property. Third, we provide efficient approximation algorithms that select a small number of sensors so as to optimize the Kalman filter with respect to this estimation error; the worst-case performance guarantees of these algorithms are provided as well. Finally, we illustrate the efficiency of our algorithms using the problem of surface-based monitoring of CO2 sequestration sites studied in Weimer et al. (2008).
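As a rough illustration of the greedy selection that a supermodular, non-increasing logdet objective suggests, the sketch below uses a deliberately simplified static model rather than the paper's Kalman filter: a unit prior covariance, scalar sensors described by hypothetical measurement rows `rows`, and a common noise variance `noise_var`. All function names are illustrative and are not taken from the paper.

```python
import math


def logdet_spd(a):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return 2.0 * sum(math.log(low[i][i]) for i in range(n))


def logdet_error_cov(rows, chosen, noise_var=1.0):
    """logdet of the posterior error covariance for the chosen sensors.

    Assumed static model: prior covariance I, so the information matrix is
    I + sum_i c_i c_i^T / noise_var and the error covariance is its inverse.
    """
    n = len(rows[0])
    info = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for s in chosen:
        c = rows[s]
        for i in range(n):
            for j in range(n):
                info[i][j] += c[i] * c[j] / noise_var
    return -logdet_spd(info)


def greedy_select(rows, budget, noise_var=1.0):
    """Repeatedly add the sensor that most reduces the logdet objective."""
    chosen = []
    for _ in range(budget):
        candidates = [s for s in range(len(rows)) if s not in chosen]
        best = min(
            candidates,
            key=lambda s: logdet_error_cov(rows, chosen + [s], noise_var),
        )
        chosen.append(best)
    return chosen
```

In this toy setting the marginal gain of a sensor shrinks as the selected set grows, which is the diminishing returns property the abstract refers to.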

preprint2020arXiv

Why gradient clipping accelerates training: A theoretical justification for adaptivity

We provide a theoretical explanation for the effectiveness of gradient clipping in training deep neural networks. The key ingredient is a new smoothness condition derived from practical neural network training examples. We observe that gradient smoothness, a concept central to the analysis of first-order optimization algorithms that is often assumed to be a constant, demonstrates significant variability along the training trajectory of deep neural networks. Further, this smoothness positively correlates with the gradient norm, and contrary to standard assumptions in the literature, it can grow with the norm of the gradient. These empirical observations limit the applicability of existing theoretical analyses of algorithms that rely on a fixed bound on smoothness. These observations motivate us to introduce a novel relaxation of gradient smoothness that is weaker than the commonly used Lipschitz smoothness assumption. Under the new condition, we prove that two popular methods, namely, gradient clipping and normalized gradient, converge arbitrarily faster than gradient descent with fixed stepsize. We further explain why such adaptively scaled gradient methods can accelerate empirical convergence and verify our results empirically in popular neural network training settings.
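A minimal sketch of the contrast described above, on a toy objective chosen here for illustration and not taken from the paper: for f(x) = x^4 the curvature grows with the gradient norm, so a fixed step that is fine near the minimum can diverge from a distant starting point, while a clipped step stays bounded. The function names, step size and clipping threshold are assumptions.

```python
def grad(x):
    """Gradient of the toy objective f(x) = x**4."""
    return 4.0 * x * x * x


def gradient_descent(x0, step, iters):
    """Fixed-step gradient descent; stops early once the iterate blows up."""
    x = x0
    for _ in range(iters):
        x = x - step * grad(x)
        if abs(x) > 1e12:  # treat as diverged
            break
    return x


def clipped_gradient_descent(x0, step, clip, iters):
    """Gradient descent whose update length never exceeds `clip`."""
    x = x0
    for _ in range(iters):
        g = grad(x)
        # Effective step: the nominal step, shrunk when the gradient is large.
        effective = step if g == 0.0 else min(step, clip / abs(g))
        x = x - effective * g
    return x
```

Starting from `x0 = 10.0` with `step = 0.1`, the fixed-step iterates grow without bound, whereas the clipped variant with `clip = 1.0` moves at most one unit per iteration and then settles toward the minimizer at zero.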

preprint2019arXiv

Bayesian Decision Making in Groups is Hard

We study the computations that Bayesian agents undertake when exchanging opinions over a network. The agents act repeatedly on their private information and take myopic actions that maximize their expected utility according to a fully rational posterior belief. We show that such computations are NP-hard for two natural utility functions: one with binary actions, and another where agents reveal their posterior beliefs. In fact, we show that distinguishing between posteriors that are concentrated on different states of the world is NP-hard. Therefore, even approximating the Bayesian posterior beliefs is hard. We also describe a natural search algorithm to compute agents' actions, which we call elimination of impossible signals, and show that if the network is transitive, the algorithm can be modified to run in polynomial time.

preprint2019arXiv

Non-Bayesian Social Learning with Uncertain Models

Non-Bayesian social learning theory provides a framework that models distributed inference for a group of agents interacting over a social network. In this framework, each agent iteratively forms and communicates beliefs about an unknown state of the world with their neighbors using a learning rule. Existing approaches assume agents have access to precise statistical models (in the form of likelihoods) for the state of the world. However in many situations, such models must be learned from finite data. We propose a social learning rule that takes into account uncertainty in the statistical models using second-order probabilities. Therefore, beliefs derived from uncertain models are sensitive to the amount of past evidence collected for each hypothesis. We characterize how well the hypotheses can be tested on a social network, as consistent or not with the state of the world. We explicitly show the dependency of the generated beliefs with respect to the amount of prior evidence. Moreover, as the amount of prior evidence goes to infinity, learning occurs and is consistent with traditional social learning theory.

preprint2019arXiv

Random Walks on Simplicial Complexes and the normalized Hodge 1-Laplacian

Focusing on coupling between edges, we generalize the relationship between the normalized graph Laplacian and random walks on graphs by devising an appropriate normalization for the Hodge Laplacian -- the generalization of the graph Laplacian for simplicial complexes -- and relate this to a random walk on edges. Importantly, these random walks are intimately connected to the topology of the simplicial complex, just as random walks on graphs are related to the topology of the graph. This serves as a foundational step towards incorporating Laplacian-based analytics for higher-order interactions. We demonstrate how to use these dynamics for data analytics that extract information about the edge-space of a simplicial complex that complements and extends graph-based analysis. Specifically, we use our normalized Hodge Laplacian to derive spectral embeddings for examining trajectory data of ocean drifters near Madagascar and also develop a generalization of personalized PageRank for the edge-space of simplicial complexes to analyze a book co-purchasing dataset.
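For orientation, the sketch below builds the standard (unnormalized) Hodge 1-Laplacian, L1 = B1^T B1 + B2 B2^T, from the node-to-edge and edge-to-triangle boundary maps. It does not implement the normalization proposed in the paper; the helper names and the orientation convention (edges and triangles listed with increasing vertex indices) are assumptions made for this example.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def boundary_maps(n_nodes, edges, triangles):
    """edges: (i, j) with i < j; triangles: (i, j, k) with i < j < k."""
    index = {e: c for c, e in enumerate(edges)}
    b1 = [[0] * len(edges) for _ in range(n_nodes)]
    for c, (i, j) in enumerate(edges):
        b1[i][c], b1[j][c] = -1, 1
    b2 = [[0] * len(triangles) for _ in range(len(edges))]
    for c, (i, j, k) in enumerate(triangles):
        # Boundary of (i, j, k) is (j, k) - (i, k) + (i, j).
        b2[index[(j, k)]][c] = 1
        b2[index[(i, k)]][c] = -1
        b2[index[(i, j)]][c] = 1
    return b1, b2


def hodge_1_laplacian(n_nodes, edges, triangles):
    b1, b2 = boundary_maps(n_nodes, edges, triangles)
    lower = matmul(transpose(b1), b1)  # coupling of edges through shared nodes
    if not triangles:
        return lower
    upper = matmul(b2, transpose(b2))  # coupling of edges through shared triangles
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(lower, upper)]
```

On three edges forming a triangle, the hollow complex has a nonzero edge flow in the kernel of L1 (circulation around the hole), while filling the triangle removes it. This is one concrete sense in which the operator reflects the topology of the complex.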

preprint2016arXiv

A Distributed Newton Method for Large Scale Consensus Optimization

In this paper, we propose a distributed Newton method for consensus optimization. Our approach outperforms state-of-the-art methods, including ADMM. The key idea is to exploit the sparsity of the dual Hessian and recast the computation of the Newton step as one of efficiently solving symmetric diagonally dominant linear equations. We validate our algorithm both theoretically and empirically. On the theory side, we demonstrate that our algorithm exhibits superlinear convergence within a neighborhood of optimality. Empirically, we show the superiority of this new method on a variety of machine learning problems. The proposed approach is scalable to very large problems and has a low communication overhead.

preprint2016arXiv

An Exact Distributed Newton Method for Reinforcement Learning

In this paper, we propose a distributed second-order method for reinforcement learning. Our approach is the fastest in the literature so far, as it outperforms state-of-the-art methods, including ADMM, by significant margins. We achieve this by exploiting the sparsity pattern of the dual Hessian and transforming the problem of computing the Newton direction to one of solving a sequence of symmetric diagonally dominant systems of equations. We validate the above claim both theoretically and empirically. On the theoretical side, we prove that similar to exact Newton, our algorithm exhibits super-linear convergence within a neighborhood of the optimal solution. Empirically, we demonstrate the superiority of this new method on a set of benchmark reinforcement learning tasks.

preprint2016arXiv

Bayesian Heuristics for Group Decisions

We propose a model of inference and heuristic decision-making in groups that is rooted in the Bayes rule but avoids the complexities of rational inference in partially observed environments with incomplete information, which are characteristic of group interactions. Our model is also consistent with a dual-process psychological theory of thinking: the group members behave rationally at the initiation of their interactions with each other (the slow and deliberative mode); however, in the ensuing decision epochs, they rely on a heuristic that replicates their experiences from the first stage (the fast automatic mode). We specialize this model to a group decision scenario where private observations are received at the beginning, and agents aim to take the best action given the aggregate observations of all group members. We study the implications of the information structure together with the properties of the probability distributions which determine the structure of the so-called "Bayesian heuristics" that the agents follow in our model. We also analyze the group decision outcomes in two classes of linear action updates and log-linear belief updates and show that many inefficiencies arise in group decisions as a result of repeated interactions between individuals, leading to overconfident beliefs as well as choice-shifts toward extremes. Nevertheless, balanced regular structures demonstrate a measure of efficiency in terms of aggregating the initial information of individuals. These results not only verify some well-known insights about group decision-making but also complement these insights by revealing additional mechanistic interpretations for the group decision process, as well as psychological and cognitive intuitions about the group interaction model.

preprint2016arXiv

Bayesian Learning without Recall

We analyze a model of learning and belief formation in networks in which agents follow Bayes rule yet they do not recall their history of past observations and cannot reason about how other agents' beliefs are formed. They do so by making rational inferences about their observations which include a sequence of independent and identically distributed private signals as well as the actions of their neighboring agents at each time. Successive applications of Bayes rule to the entire history of past observations lead to forebodingly complex inferences: due to lack of knowledge about the global network structure, and unavailability of private observations, as well as third party interactions preceding every decision. Such difficulties make Bayesian updating of beliefs an implausible mechanism for social learning. To address these complexities, we consider a Bayesian without Recall model of inference. On the one hand, this model provides a tractable framework for analyzing the behavior of rational agents in social networks. On the other hand, this model also provides a behavioral foundation for the variety of non-Bayesian update rules in the literature. We present the implications of various choices for the structure of the action space and utility functions for such agents and investigate the properties of learning, convergence, and consensus in special cases.

preprint2016arXiv

Bio-Inspired Framework for Allocation of Protection Resources in Cyber-Physical Networks

In this chapter, we consider the problem of designing protection strategies to contain spreading processes in complex cyber-physical networks. We illustrate our ideas using a family of bio-motivated spreading models originally proposed in the epidemiological literature, e.g., the Susceptible-Infected-Susceptible (SIS) model. We first introduce a framework in which we are allowed to distribute two types of resources in order to contain the spread, namely, (i) preventive resources able to reduce the spreading rate, and (ii) corrective resources able to increase the recovery rate of nodes in which the resources are allocated. In practice, these resources have an associated cost that depends on either the resiliency level achieved by the preventive resource, or the restoration efficiency of the corrective resource. We present a mathematical framework, based on dynamic systems theory and convex optimization, to find the cost-optimal distribution of protection resources in a network to contain the spread. We also present two extensions to this framework in which (i) we consider generalized epidemic models, beyond the simple SIS model, and (ii) we assume uncertainties in the contact network in which the spreading is taking place. We compare these protection strategies with common heuristics previously proposed in the literature and illustrate our results with numerical simulations using the air traffic network.

preprint2016arXiv

Distributed Estimation and Learning over Heterogeneous Networks

We consider several estimation and learning problems that networked agents face when making decisions given their uncertainty about an unknown variable. Our methods are designed to efficiently deal with heterogeneity in both size and quality of the observed data, as well as heterogeneity over time (intermittence). The goal of the studied aggregation schemes is to efficiently combine the observed data that is spread over time and across several network nodes, accounting for all the network heterogeneities. Moreover, we require no form of coordination beyond the local neighborhood of every network agent or sensor node. The three problems that we consider are (i) maximum likelihood estimation of the unknown given initial data sets, (ii) learning the true model parameter from streams of data that the agents receive intermittently over time, and (iii) minimum variance estimation of a complete sufficient statistic from several data points that the networked agents collect over time. In each case we rely on an aggregation scheme to combine the observations of all agents; moreover, when the agents receive streams of data over time, we modify the update rules to accommodate the most recent observations. In every case, we demonstrate the efficiency of our algorithms by proving convergence to the globally efficient estimators given the observations of all agents. We supplement these results by investigating the rate of convergence and providing finite-time performance guarantees.

preprint2016arXiv

Distributed Estimation of Dynamic Parameters : Regret Analysis

This paper addresses the estimation of a time-varying parameter in a network. A group of agents sequentially receive noisy signals about the parameter (or moving target), which does not follow any particular dynamics. The parameter is not observable to an individual agent, but it is globally identifiable for the whole network. Viewing the problem with an online optimization lens, we aim to provide the finite-time or non-asymptotic analysis of the problem. To this end, we use a notion of dynamic regret which suits the online, non-stationary nature of the problem. In our setting, dynamic regret can be recognized as a finite-time counterpart of stability in the mean-square sense. We develop a distributed, online algorithm for tracking the moving target. Defining the path-length as the consecutive differences between target locations, we express an upper bound on regret in terms of the path-length of the target and network errors. We further show the consistency of the result with static setting and noiseless observations.

preprint2016arXiv

Distributed Online Optimization in Dynamic Environments Using Mirror Descent

This work addresses decentralized online optimization in non-stationary environments. A network of agents aims to track the minimizer of a global time-varying convex function. The minimizer evolves according to known dynamics corrupted by an unknown, unstructured noise. At each time, the global function can be cast as a sum of a finite number of local functions, each of which is assigned to one agent in the network. Moreover, the local functions become available to agents sequentially, and agents do not have prior knowledge of the future cost functions. Therefore, agents must communicate with each other to build an online approximation of the global function. We propose a decentralized variation of the celebrated Mirror Descent, developed by Nemirovski and Yudin. Using the notion of Bregman divergence in lieu of Euclidean distance for projection, Mirror Descent has been shown to be a powerful tool in large-scale optimization. Our algorithm builds on Mirror Descent, while ensuring that agents perform a consensus step to follow the global function and take into account the dynamics of the global minimizer. To measure the performance of the proposed online algorithm, we compare it to its offline counterpart, where the global functions are available a priori. The gap between the two is called dynamic regret. We establish a regret bound that scales inversely in the spectral gap of the network, and more notably it represents the deviation of minimizer sequence with respect to the given dynamics. We then show that our results subsume a number of results in distributed optimization. We demonstrate the application of our method to decentralized tracking of dynamic parameters and verify the results via numerical experiments.

preprint2016arXiv

Learning without recall in directed circles and rooted trees

This work investigates the case of a network of agents that attempt to learn some unknown state of the world amongst the finitely many possibilities. At each time step, agents all receive random, independently distributed private signals whose distributions are dependent on the unknown state of the world. However, it may be the case that some or any of the agents cannot distinguish between two or more of the possible states based only on their private observations, as when several states result in the same distribution of the private signals. In our model, the agents form some initial belief (probability distribution) about the unknown state and then refine their beliefs in accordance with their private observations, as well as the beliefs of their neighbors. An agent learns the unknown state when her belief converges to a point mass that is concentrated at the true state. A rational agent would use the Bayes' rule to incorporate her neighbors' beliefs and own private signals over time. While such repeated applications of the Bayes' rule in networks can become computationally intractable, in this paper, we show that in the canonical cases of directed star, circle or path networks and their combinations, one can derive a class of memoryless update rules that replicate that of a single Bayesian agent but replace the self beliefs with the beliefs of the neighbors. This way, one can realize an exponentially fast rate of learning similar to the case of Bayesian (fully rational) agents. The proposed rules are a special case of the Learning without Recall.

preprint2016arXiv

Near-Optimal Sensor Scheduling for Batch State Estimation: Complexity, Algorithms, and Limits

In this paper, we focus on batch state estimation for linear systems. This problem is important in applications such as environmental field estimation, robotic navigation, and target tracking. The difficulty is that limited operational resources among the sensors, e.g., shared communication bandwidth or battery power, constrain the number of sensors that can be active at each measurement step. As a result, sensor scheduling algorithms must be employed. However, current sensor scheduling algorithms for batch state estimation scale poorly with the system size and the time horizon. In addition, current sensor scheduling algorithms for Kalman filtering, although they scale better, provide no performance guarantees or approximation bounds for the minimization of the batch state estimation error. In this paper, one of our main contributions is to provide an algorithm that enjoys both the estimation accuracy of the batch state scheduling algorithms and the low time complexity of the Kalman filtering scheduling algorithms. In particular: 1) our algorithm is near-optimal: it achieves a solution up to a multiplicative factor 1/2 from the optimal solution, and this factor is close to the best approximation factor 1/e one can achieve in polynomial time for this problem; 2) our algorithm has (polynomial) time complexity that is not only lower than that of the current algorithms for batch state estimation; it is also lower than, or similar to, that of the current algorithms for Kalman filtering. We achieve these results by proving two properties for our batch state estimation error metric, which quantifies the square error of the minimum variance linear estimator of the batch state vector: a) it is supermodular in the choice of the sensors; b) it has a sparsity pattern (it involves matrices that are block tri-diagonal) that facilitates its evaluation at each sensor set.

preprint2016arXiv

Online Optimization in Dynamic Environments: Improved Regret Rates for Strongly Convex Problems

In this paper, we address tracking of a time-varying parameter with unknown dynamics. We formalize the problem as an instance of online optimization in a dynamic setting. Using online gradient descent, we propose a method that sequentially predicts the value of the parameter and in turn suffers a loss. The objective is to minimize the accumulation of losses over the time horizon, a notion that is termed dynamic regret. While existing methods focus on convex loss functions, we consider strongly convex functions so as to provide better guarantees of performance. We derive a regret bound that captures the path-length of the time-varying parameter, defined in terms of the distance between its consecutive values. In other words, the bound represents the natural connection of tracking quality to the rate of change of the parameter. We provide numerical experiments to complement our theoretical findings.
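A small sketch of the quantities named above, under assumptions chosen for illustration: a scalar parameter, the strongly convex loss f_t(x) = 0.5 (x - theta_t)^2, and a constant step size. Since each f_t has minimum value zero, the accumulated loss equals the dynamic regret in this toy case. The function names are hypothetical.

```python
def online_gradient_descent(targets, step, x0=0.0):
    """Predict, suffer the loss, then take one gradient step per round.

    Returns the dynamic regret against the per-round minimizers theta_t.
    """
    x = x0
    regret = 0.0
    for theta in targets:
        regret += 0.5 * (x - theta) ** 2  # per-round optimum has zero loss
        x = x - step * (x - theta)        # gradient of f_t at x is (x - theta)
    return regret


def path_length(targets):
    """Sum of distances between consecutive values of the parameter."""
    return sum(abs(b - a) for a, b in zip(targets, targets[1:]))
```

Comparing a static target such as `[1.0] * 50` with a drifting one such as `[0.1 * t for t in range(50)]` shows the dependence the abstract describes: the drifting sequence has a larger path length and incurs a larger regret at the same step size.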

preprint2016arXiv

Scheduling Nonlinear Sensors for Stochastic Process Estimation

In this paper, we focus on activating only a few sensors, among many available, to estimate the state of a stochastic process of interest. This problem is important in applications such as target tracking and simultaneous localization and mapping (SLAM). It is challenging since it involves stochastic systems whose evolution is largely unknown, sensors with nonlinear measurements, and limited operational resources that constrain the number of active sensors at each measurement step. We provide an algorithm applicable to general stochastic processes and nonlinear measurements whose time complexity is linear in the planning horizon and whose performance is a multiplicative factor 1/2 away from the optimal performance. This is notable because the algorithm offers a significant computational advantage over the polynomial-time algorithm that achieves the best approximation factor 1/e. In addition, for important classes of Gaussian processes and nonlinear measurements corrupted with Gaussian noise, our algorithm enjoys the same time complexity as even the state-of-the-art algorithms for linear systems and measurements. We achieve our results by proving two properties for the entropy of the batch state vector conditioned on the measurements: a) it is supermodular in the choice of the sensors; b) it has a sparsity pattern (involves block tri-diagonal matrices) that facilitates its evaluation at each sensor set.

preprint2015arXiv

A Fast Distributed Solver for Symmetric Diagonally Dominant Linear Equations

In this paper, we propose a fast distributed solver for linear equations given by symmetric diagonally dominant M-matrices. Our approach is based on a distributed implementation of the parallel solver of Spielman and Peng by considering a specific approximated inverse chain which can be computed efficiently in a distributed fashion. Representing the system of equations by a graph $\mathbb{G}$, the proposed distributed algorithm is capable of attaining $ε$-close solutions (for arbitrary $ε$) in time proportional to $n^{3}$ (number of nodes in $\mathbb{G}$), $α$ (upper bound on the size of the R-Hop neighborhood), and $\frac{{W}_{max}}{{W}_{min}}$ (maximum and minimum weight of edges in $\mathbb{G}$).

preprint2015arXiv

Competitive Diffusion in Social Networks: Quality or Seeding?

In this paper, we study a strategic model of marketing and product consumption in social networks. We consider two firms in a market competing to maximize the consumption of their products. Firms have a limited budget which can be either invested in the quality of the product or spent on initial seeding in the network in order to better facilitate spread of the product. After the decision of firms, agents choose their consumptions following a myopic best response dynamics which results in a local, linear update for their consumption decision. We characterize the unique Nash equilibrium of the game between firms and study the effect of the budgets as well as the network structure on the optimal allocation. We show that at the equilibrium, firms invest more budget on quality when their budgets are close to each other. However, as the gap between budgets widens, competition in qualities becomes less effective and firms spend more of their budget on seeding. We also show that given equal budget of firms, if seeding budget is nonzero for a balanced graph, it will also be nonzero for any other graph, and if seeding budget is zero for a star graph it will be zero for any other graph as well. As a practical extension, we then consider a case where products have some preset qualities that can be only improved marginally. At some point in time, firms learn about the network structure and decide to utilize a limited budget to increase their market share by either improving the quality or seeding additional agents to incline consumers towards their products. We show that the optimal budget allocation in this case simplifies to a threshold strategy. Interestingly, we derive similar results to that of the original problem, in which preset qualities simulate the role that budgets had in the original setup.

preprint2015arXiv

Distributed Resource Allocation for Epidemic control

We present a distributed resource allocation strategy to control an epidemic outbreak in a networked population based on a Distributed Alternating Direction Method of Multipliers (D-ADMM) algorithm. We consider a linearized Susceptible-Infected-Susceptible (SIS) epidemic spreading model in which agents in the network are able to allocate vaccination resources (for prevention) and antidotes (for treatment) in the presence of a contagion. We express our epidemic control condition as a spectral constraint involving the Perron-Frobenius eigenvalue, and formulate the resource allocation problem as a Geometric Program (GP). Next, we separate the network-wide optimization problem into subproblems optimally solved by each agent in a fully distributed way. We conclude the paper by illustrating performance of our solution framework with numerical simulations.
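To make the spectral condition concrete, the sketch below checks it in the simplest setting, which is an assumption of this example and not the paper's formulation: every node shares one infection rate and one recovery rate, so the linearized SIS dynamics dp/dt = (beta A - delta I) p decay when beta times the Perron-Frobenius eigenvalue of the adjacency matrix A is below delta. The geometric program and the D-ADMM decomposition are not shown, and the function names are hypothetical.

```python
def spectral_radius(adj, iters=500):
    """Perron-Frobenius eigenvalue of a symmetric nonnegative matrix.

    Power iteration is run on A + I so that bipartite graphs, whose
    spectrum is symmetric about zero, do not cause oscillation.
    """
    n = len(adj)
    v = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(abs(x) for x in w)
        v = [x / lam for x in w]
    return lam - 1.0


def epidemic_dies_out(adj, infection_rate, recovery_rate):
    """Stability test for the linearized SIS model with homogeneous rates."""
    return infection_rate * spectral_radius(adj) < recovery_rate
```

In the setting of the abstract the rates differ across nodes and are the decision variables, so the same kind of eigenvalue bound becomes a constraint in an optimization problem rather than a yes-or-no check.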

preprint2015arXiv

Distributed SDDM Solvers: Theory & Applications

In this paper, we propose distributed solvers for systems of linear equations given by symmetric diagonally dominant M-matrices based on the parallel solver of Spielman and Peng. We propose two versions of the solvers, where in the first, full communication in the network is required, while in the second communication is restricted to the R-Hop neighborhood between nodes for some $R \geq 1$. We rigorously analyze the convergence and convergence rates of our solvers, showing that our methods are capable of outperforming state-of-the-art techniques. Having developed such solvers, we then contribute by proposing an accurate distributed Newton method for network flow optimization. Exploiting the sparsity pattern of the dual Hessian, we propose a Newton method for network flow optimization that is both faster and more accurate than state-of-the-art techniques. Our method utilizes the distributed SDDM solvers for determining the Newton direction up to any arbitrary precision $ε>0$. We analyze the properties of our algorithm and show superlinear convergence within a neighborhood of the optimal. Finally, in a set of experiments conducted on randomly generated and barbell networks, we demonstrate that our approach is capable of significantly outperforming state-of-the-art techniques.

preprint2015arXiv

Fast, Accurate Second Order Methods for Network Optimization

Dual descent methods are commonly used to solve network flow optimization problems, since their implementation can be distributed over the network. These algorithms, however, often exhibit slow convergence rates. Approximate Newton methods which compute descent directions locally have been proposed as alternatives to accelerate the convergence rates of conventional dual descent. The effectiveness of these methods is limited by the accuracy of such approximations. In this paper, we propose an efficient and accurate distributed second order method for network flow problems. The proposed approach utilizes the sparsity pattern of the dual Hessian to approximate the Newton direction using a novel distributed solver for symmetric diagonally dominant linear equations. Our solver is based on a distributed implementation of a recent parallel solver of Spielman and Peng (2014). We analyze the properties of the proposed algorithm and show that, similar to conventional Newton methods, superlinear convergence within a neighborhood of the optimal value is attained. We finally demonstrate the effectiveness of the approach in a set of experiments on randomly generated networks.

preprint2015arXiv

Finite-time Analysis of the Distributed Detection Problem

This paper addresses the problem of distributed detection in fixed and switching networks. A network of agents observe partially informative signals about the unknown state of the world. Hence, they collaborate with each other to identify the true state. We propose an update rule building on distributed, stochastic optimization methods. Our main focus is on the finite-time analysis of the problem. For fixed networks, we bring forward the notion of Kullback-Leibler cost to measure the efficiency of the algorithm versus its centralized analog. We bound the cost in terms of the network size, spectral gap and relative entropy of agents' signal structures. We further consider the problem in random networks where the structure is realized according to a stationary distribution. We then prove that the convergence is exponentially fast (with high probability), and the non-asymptotic rate scales inversely in the spectral gap of the expected network.

preprint2015arXiv

Learning without Recall by Random Walks on Directed Graphs

We consider a network of agents that aim to learn some unknown state of the world using private observations and exchange of beliefs. At each time, agents observe private signals generated based on the true unknown state. Each agent might not be able to distinguish the true state based only on her private observations. This occurs when some other states are observationally equivalent to the true state from the agent's perspective. To overcome this shortcoming, agents must communicate with each other to benefit from local observations. We propose a model where each agent selects one of her neighbors randomly at each time. Then, she refines her opinion using her private signal and the prior of that particular neighbor. The proposed rule can be thought of as a Bayesian agent who cannot recall the priors based on which other agents make inferences. This learning without recall approach preserves some aspects of the Bayesian inference while being computationally tractable. By establishing a correspondence with a random walk on the network graph, we prove that under the described protocol, agents learn the truth exponentially fast in the almost sure sense. The asymptotic rate is expressed as the sum of the relative entropies between the signal structures of every agent weighted by the stationary distribution of the random walk.

preprint2015arXiv

Learning without Recall: A Case for Log-Linear Learning

We analyze a model of learning and belief formation in networks in which agents follow Bayes rule yet they do not recall their history of past observations and cannot reason about how other agents' beliefs are formed. They do so by making rational inferences about their observations which include a sequence of independent and identically distributed private signals as well as the beliefs of their neighboring agents at each time. Fully rational agents would successively apply Bayes rule to the entire history of observations. This leads to forebodingly complex inferences due to lack of knowledge about the global network structure that causes those observations. To address these complexities, we consider a Learning without Recall model, which in addition to providing a tractable framework for analyzing the behavior of rational agents in social networks, can also provide a behavioral foundation for the variety of non-Bayesian update rules in the literature. We present the implications of various choices for time-varying priors of such agents and how this choice affects learning and its rate.

preprint2015arXiv

Minimal Actuator Placement with Optimal Control Constraints

We introduce the problem of minimal actuator placement in a linear control system so that a bound on the minimum control effort for a given state transfer is satisfied while controllability is ensured. We first show that this is an NP-hard problem following the recent work of Olshevsky. Next, we prove that this problem has a supermodular structure. Afterwards, we provide an efficient algorithm that approximates up to a multiplicative factor of $O(\log n)$, where $n$ is the size of the multi-agent network, any optimal actuator set that meets the specified energy criterion. Moreover, we show that this is the best approximation factor one can achieve in polynomial-time for the worst case. Finally, we test this algorithm over large Erdős-Rényi random networks to further demonstrate its efficiency.

preprint2015arXiv

Online Optimization : Competing with Dynamic Comparators

Recent literature on online learning has focused on developing adaptive algorithms that take advantage of a regularity of the sequence of observations, yet retain worst-case performance guarantees. A complementary direction is to develop prediction methods that perform well against complex benchmarks. In this paper, we address these two directions together. We present a fully adaptive method that competes with dynamic benchmarks and whose regret guarantee scales with the regularity of the sequence of cost functions and comparators. Notably, the regret bound adapts to the smaller complexity measure in the problem environment. Finally, we apply our results to drifting zero-sum, two-player games where both players achieve no regret guarantees against best sequences of actions in hindsight.

preprint2015arXiv

Optimal distributed control for platooning via sparse coprime factorizations

We introduce a novel distributed control architecture for heterogeneous platoons of linear time-invariant autonomous vehicles. Our approach is based on a generalization of the concept of leader-follower controllers, for which we provide a Youla-like parameterization while the sparsity constraints are imposed on the controller's left coprime factors, outlining a new concept of structural constraints in distributed control. The proposed scheme is amenable to optimal controller design via norm-based costs; it guarantees string stability and eliminates the accordion effect from the behavior of the platoon. We also introduce a synchronization mechanism for the exact compensation of the time delays induced by the wireless broadcasting of information.

preprint2015arXiv

Switching to Learn

A network of agents attempt to learn some unknown state of the world drawn by nature from a finite set. Agents observe private signals conditioned on the true state, and form beliefs about the unknown state accordingly. Each agent may face an identification problem in the sense that she cannot distinguish the truth in isolation. However, by communicating with each other, agents are able to benefit from side observations to learn the truth collectively. Unlike many distributed algorithms which rely on all-time communication protocols, we propose an efficient method by switching between Bayesian and non-Bayesian regimes. In this model, agents exchange information only when their private signals are not informative enough; thence, by switching between the two regimes, agents efficiently learn the truth using only a few rounds of communications. The proposed algorithm preserves learnability while incurring a lower communication cost. We also verify our theoretical findings by simulation examples.

preprint2014arXiv

Controllability and Fraction of Leaders in Infinite Network

In this paper, we study controllability of a network of linear single-integrator agents when the network size goes to infinity. We first investigate the effect of increasing size by injecting an input at every node and requiring that the network controllability Gramian remain well-conditioned with the increasing dimension. We provide theoretical justification to the intuition that high degree nodes pose a challenge to network controllability. In particular, the controllability Gramian for the networks with bounded maximum degrees is shown to remain well-conditioned even as the network size goes to infinity. In the canonical cases of star, chain and ring networks, we also provide closed-form expressions which bound the condition number of the controllability Gramian in terms of the network size. We next consider the effect of the choice and number of leader nodes by actuating only a subset of nodes and considering the least eigenvalue of the Gramian as the network size increases. Accordingly, while a directed star topology can never be made controllable for all sizes by injecting an input just at a fraction $f<1$ of nodes, for path or cycle networks the designer can actuate a non-zero fraction of nodes and spread them throughout the network in such a way that the least eigenvalue of the Gramian remains bounded away from zero with the increasing size. The results offer interesting insights on the challenges of control in large networks and with high-degree nodes.

preprint2014arXiv

Distributed Detection : Finite-time Analysis and Impact of Network Topology

This paper addresses the problem of distributed detection in multi-agent networks. Agents receive private signals about an unknown state of the world. The underlying state is globally identifiable, yet informative signals may be dispersed throughout the network. Using an optimization-based framework, we develop an iterative local strategy for updating individual beliefs. In contrast to the existing literature which focuses on asymptotic learning, we provide a finite-time analysis. Furthermore, we introduce a Kullback-Leibler cost to compare the efficiency of the algorithm to its centralized counterpart. Our bounds on the cost are expressed in terms of network size, spectral gap, centrality of each agent and relative entropy of agents' signal structures. A key observation is that distributing more informative signals to central agents results in a faster learning rate. Furthermore, optimizing the weights, we can speed up learning by improving the spectral gap. We also quantify the effect of link failures on learning speed in symmetric networks. We finally provide numerical simulations which verify our theoretical results.

preprint2014arXiv

On the Degree Distribution of Pólya Urn Graph Processes

This paper presents a tighter bound on the degree distribution of arbitrary Pólya urn graph processes, proving that the proportion of vertices with degree $d$ obeys a power-law distribution $P(d) \propto d^{-γ}$ for $d \leq n^{\frac{1}{6}-ε}$ for any $ε > 0$, where $n$ represents the number of vertices in the network. Previous work by Bollobás et al. formalized the well-known preferential attachment model of Barabási and Albert, and showed that the power-law distribution held for $d \leq n^{\frac{1}{15}}$ with $γ = 3$. Our revised bound represents a significant improvement over existing models of degree distribution in scale-free networks, where its tightness is restricted by the Azuma-Hoeffding concentration inequality for martingales. We achieve this tighter bound through a careful analysis of the first set of vertices in the network generation process, and show that the newly acquired bound is at the edge of exhausting the Bollobás model in the sense that the degree expectation breaks down for other powers.

preprint2014arXiv

Optimal Budget Allocation in Social Networks: Quality or Seeding

In this paper, we study a strategic model of marketing and product consumption in social networks. We consider two competing firms in a market providing two substitutable products with preset qualities. Agents choose their consumptions following a myopic best response dynamics which results in a local, linear update for the consumptions. At some point in time, firms receive a limited budget which they can use to trigger a larger consumption of their products in the network. Firms have to decide between marginally improving the quality of their products and giving free offers to a chosen set of agents in the network in order to better facilitate spreading their products. We derive a simple threshold rule for the optimal allocation of the budget and describe the resulting Nash equilibrium. It is shown that the optimal allocation of the budget depends on the entire distribution of centralities in the network, quality of products and the model parameters. In particular, we show that in a graph with a higher number of agents with centralities above a certain threshold, firms spend more budget on seeding in the optimal allocation. Furthermore, if seeding budget is nonzero for a balanced graph, it will also be nonzero for any other graph, and if seeding budget is zero for a star graph, it will be zero for any other graph too. We also show that firms allocate more budget to quality improvement when their qualities are close, in order to distance themselves from the rival firm. However, as the gap between qualities widens, competition in qualities becomes less effective and firms spend more budget on seeding.

preprint2014arXiv

Optimal Resource Allocation for Network Protection Against Spreading Processes

We study the problem of containing spreading processes in arbitrary directed networks by distributing protection resources throughout the nodes of the network. We consider two types of protection resources: (i) preventive resources able to defend nodes against the spreading (such as vaccines in a viral infection process), and (ii) corrective resources able to neutralize the spreading after it has reached a node (such as antidotes). We assume that both preventive and corrective resources have an associated cost and study the problem of finding the cost-optimal distribution of resources throughout the nodes of the network. We analyze these questions in the context of viral spreading processes in directed networks. We study the following two problems: (i) given a fixed budget, find the optimal allocation of preventive and corrective resources in the network to achieve the highest level of containment, and (ii) when a budget is not specified, find the minimum budget required to control the spreading process. We show that both resource allocation problems can be solved in polynomial time using Geometric Programming (GP) for arbitrary directed graphs of nonidentical nodes and a wide class of cost functions. Furthermore, our approach allows simultaneous optimization over both preventive and corrective resources, even when the cost functions are node-dependent. We illustrate our approach by designing optimal protection strategies to contain an epidemic outbreak that propagates through an air transportation network.

preprint2013arXiv

A Novel Description of Linear Time-Invariant Networks via Structured Coprime Factorizations

In this paper we study state-space realizations of Linear and Time-Invariant (LTI) systems. Motivated by biochemical reaction networks, Gonçalves and Warnick have recently introduced the notion of a Dynamical Structure Function (DSF), a particular factorization of the system's transfer function matrix that elucidates the interconnection structure in dependencies between manifest variables. We build on this work by showing an intrinsic connection between a DSF and certain sparse left coprime factorizations. By establishing this link, we provide an interesting systems theoretic interpretation of sparsity patterns of coprime factors. In particular we show how the sparsity of these coprime factors allows for a given LTI system to be implemented as a network of LTI sub-systems. We examine possible applications in distributed control such as the design of an LTI controller that can be implemented over a network with a pre-specified topology.

preprint2013arXiv

Accelerated Backpressure Algorithm

We develop an Accelerated Back Pressure (ABP) algorithm using Accelerated Dual Descent (ADD), a distributed approximate Newton-like algorithm that only uses local information. Our construction is based on writing the backpressure algorithm as the solution to a network feasibility problem solved via stochastic dual subgradient descent. We apply stochastic ADD in place of the stochastic gradient descent algorithm. We prove that the ABP algorithm guarantees stable queues. Our numerical experiments demonstrate a significant improvement in convergence rate, especially when the packet arrival statistics vary over time.

preprint2013arXiv

Bayesian Quadratic Network Game Filters

A repeated network game where agents have quadratic utilities that depend on information externalities -- an unknown underlying state -- as well as payoff externalities -- the actions of all other agents in the network -- is considered. Agents play Bayesian Nash Equilibrium strategies with respect to their beliefs on the state of the world and the actions of all other nodes in the network. These beliefs are refined over subsequent stages based on the observed actions of neighboring peers. This paper introduces the Quadratic Network Game (QNG) filter that agents can run locally to update their beliefs, select corresponding optimal actions, and eventually learn a sufficient statistic of the network's state. The QNG filter is demonstrated on a Cournot market competition game and a coordination game to implement navigation of an autonomous team.

preprint2013arXiv

Exponentially Fast Parameter Estimation in Networks Using Distributed Dual Averaging

In this paper we present an optimization-based view of distributed parameter estimation and observational social learning in networks. Agents receive a sequence of random, independent and identically distributed (i.i.d.) signals, each of which individually may not be informative about the underlying true state, but the signals together are globally informative enough to make the true state identifiable. Using an optimization-based characterization of Bayesian learning as proximal stochastic gradient descent (with Kullback-Leibler divergence from a prior as a proximal function), we show how to efficiently use a distributed, online variant of Nesterov's dual averaging method to solve the estimation with purely local information. When the true state is globally identifiable, and the network is connected, we prove that agents eventually learn the true parameter using a randomized gossip scheme. We demonstrate that with high probability the convergence is exponentially fast with a rate dependent on the KL divergence of observations under the true state from observations under the second likeliest state. Furthermore, our work also highlights the possibility of learning under continuous adaptation of the network, which is a consequence of employing a constant, unit stepsize for the algorithm.

preprint2013arXiv

Online Learning of Dynamic Parameters in Social Networks

This paper addresses the problem of online learning in a dynamic setting. We consider a social network in which each individual observes a private signal about the underlying state of the world and communicates with her neighbors at each time period. Unlike many existing approaches, the underlying state is dynamic, and evolves according to a geometric random walk. We view the scenario as an optimization problem where agents aim to learn the true state while suffering the smallest possible loss. Based on the decomposition of the global loss function, we introduce two update mechanisms, each of which generates an estimate of the true state. We establish a tight bound on the rate of change of the underlying state, under which individuals can track the parameter with a bounded variance. Then, we characterize explicit expressions for the steady state mean-square deviation (MSD) of the estimates from the truth, per individual. We observe that only one of the estimators recovers the optimal MSD, which underscores the impact of the objective function decomposition on the learning quality. Finally, we provide an upper bound on the regret of the proposed methods, measured as an average of errors in estimating the parameter in a finite time.

preprint2013arXiv

Optimal Vaccine Allocation to Control Epidemic Outbreaks in Arbitrary Networks

We consider the problem of controlling the propagation of an epidemic outbreak in an arbitrary contact network by distributing vaccination resources throughout the network. We analyze a networked version of the Susceptible-Infected-Susceptible (SIS) epidemic model when individuals in the network present different levels of susceptibility to the epidemic. In this context, controlling the spread of an epidemic outbreak can be written as a spectral condition involving the eigenvalues of a matrix that depends on the network structure and the parameters of the model. We study the problem of finding the optimal distribution of vaccines throughout the network to control the spread of an epidemic outbreak. We propose a convex framework to find cost-optimal distribution of vaccination resources when different levels of vaccination are allowed. We also propose a greedy approach with quality guarantees for the case of all-or-nothing vaccination. We illustrate our approaches with numerical simulations in a real social network.
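The spectral condition mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used mean-field linearization of the networked SIS model, in which the infection probabilities near the disease-free state evolve according to the matrix diag(beta) A - diag(delta), and small outbreaks decay when its largest eigenvalue is negative. The function names, the per-node rates `beta` and `delta`, and the use of shifted power iteration are illustrative assumptions, not the paper's formulation.

```python
def largest_eigenvalue(matrix, shift, iterations=500):
    """Estimate the largest eigenvalue of `matrix` by power iteration on
    matrix + shift * I. Assumes off-diagonal entries are nonnegative and
    that `shift` makes every diagonal entry strictly positive."""
    n = len(matrix)
    v = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        estimate = max(abs(x) for x in w)   # max-norm of the shifted product
        v = [x / estimate for x in w]       # renormalize to avoid overflow
    return estimate - shift


def sis_growth_rate(adj, beta, delta):
    """Largest eigenvalue of diag(beta) * A - diag(delta) (assumed model).
    beta[i]: infection rate of node i; delta[i]: recovery rate of node i."""
    n = len(adj)
    m = [[beta[i] * adj[i][j] - (delta[i] if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return largest_eigenvalue(m, shift=max(delta) + 1.0)


# Complete graph on three nodes (hypothetical example data).
TRIANGLE = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
low_rate = sis_growth_rate(TRIANGLE, [0.1] * 3, [0.5] * 3)
high_rate = sis_growth_rate(TRIANGLE, [0.4] * 3, [0.5] * 3)
print(low_rate < 0, high_rate < 0)
```

Under this assumed model, raising a node's recovery rate or lowering its infection rate (for example through vaccination) pushes the largest eigenvalue toward the negative side, which is the quantity an allocation scheme would try to control.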

preprint2012arXiv

A Distributed Line Search for Network Optimization

Dual descent methods are used to solve network optimization problems because descent directions can be computed in a distributed manner using information available either locally or at neighboring nodes. However, choosing a stepsize in the descent direction remains a challenge because its computation requires global information. This work presents an algorithm based on a local version of the Armijo rule that allows for the computation of a stepsize using only local and neighborhood information. We show that when our distributed line search algorithm is applied with a descent direction computed according to the Accelerated Dual Descent method, key properties of standard backtracking line search using the Armijo rule are recovered. We use simulations to demonstrate that our algorithm is a practical substitute for its centralized counterpart.
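For reference, the centralized backtracking line search with the Armijo rule, the baseline this abstract compares against, can be sketched as follows. This is the textbook procedure, not the paper's distributed variant; the function name and the parameters `sigma` and `beta` are illustrative assumptions.

```python
def armijo_backtracking(f, x, grad, direction,
                        sigma=1e-4, beta=0.5, max_halvings=50):
    """Return a stepsize t satisfying the Armijo sufficient-decrease rule
    f(x + t*d) <= f(x) + sigma * t * <grad, d>, shrinking t by `beta`
    until the condition holds. Evaluating f needs global information,
    which is the difficulty the abstract points out."""
    slope = sum(g * d for g, d in zip(grad, direction))  # directional derivative
    fx = f(x)
    step = 1.0
    for _ in range(max_halvings):
        trial = [xi + step * di for xi, di in zip(x, direction)]
        if f(trial) <= fx + sigma * step * slope:
            return step
        step *= beta
    return step


def quadratic(x):
    """Hypothetical test objective: sum of squares."""
    return sum(xi * xi for xi in x)


# Steepest descent direction from x = [1.0]: gradient is [2.0].
step = armijo_backtracking(quadratic, [1.0], [2.0], [-2.0])
print(step)
```

In this toy case the unit step overshoots to the mirror point with no decrease, so one halving is needed before the sufficient-decrease test passes.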

preprint2012arXiv

Moment-Based Spectral Analysis of Large-Scale Networks Using Local Structural Information

The eigenvalues of matrices representing the structure of large-scale complex networks present a wide range of applications, from the analysis of dynamical processes taking place in the network to spectral techniques aiming to rank the importance of nodes in the network. A common approach to study the relationship between the structure of a network and its eigenvalues is to use synthetic random networks in which structural properties of interest, such as degree distributions, are prescribed. Although very common, synthetic models present two major flaws: (i) these models are only suitable to study a very limited range of structural properties, and (ii) they implicitly induce structural properties that are not directly controlled and can deceivingly influence the network eigenvalue spectrum. In this paper, we propose an alternative approach to overcome these limitations. Our approach is not based on synthetic models; instead, we use algebraic graph theory and convex optimization to study how structural properties influence the spectrum of eigenvalues of the network. Using our approach, we can compute with low computational overhead global spectral properties of a network from its local structural properties. We illustrate our approach by studying how structural properties of online social networks influence their eigenvalue spectra.

preprint2012arXiv

Spectral Design of Dynamic Networks via Local Operations

Motivated by the relationship between the eigenvalue spectrum of the Laplacian matrix of a network and the behavior of dynamical processes evolving in it, we propose a distributed iterative algorithm in which a group of $n$ autonomous agents self-organize the structure of their communication network in order to control the network's eigenvalue spectrum. In our algorithm, we assume that each agent has access only to a local (myopic) view of the network around it. In each iteration, agents in the network perform a decentralized decision process to determine the edge addition/deletion that minimizes a distance function defined in the space of eigenvalue spectra. This spectral distance presents interesting theoretical properties that allow an efficient distributed implementation of the decision process. Our iterative algorithm is stable by construction, i.e., locally optimizes the network's eigenvalue spectrum, and is shown to perform extremely well in practice. We illustrate our results with nontrivial simulations in which we design networks matching the spectral properties of complex networks, such as small-world and power-law networks.

preprint2012arXiv

Structural Analysis of Laplacian Spectral Properties of Large-Scale Networks

Using methods from algebraic graph theory and convex optimization, we study the relationship between local structural features of a network and spectral properties of its Laplacian matrix. In particular, we derive expressions for the so-called spectral moments of the Laplacian matrix of a network in terms of a collection of local structural measurements. Furthermore, we propose a series of semidefinite programs to compute bounds on the spectral radius and the spectral gap of the Laplacian matrix from a truncated sequence of Laplacian spectral moments. Our analysis shows that the Laplacian spectral moments and spectral radius are strongly constrained by local structural features of the network. On the other hand, we illustrate how local structural features are usually not enough to estimate the Laplacian spectral gap.

preprint2012arXiv

Structural Analysis of Viral Spreading Processes in Social and Communication Networks Using Egonets

We study how the behavior of viral spreading processes is influenced by local structural properties of the network over which they propagate. For a wide variety of spreading processes, the largest eigenvalue of the adjacency matrix of the network plays a key role in their global dynamical behavior. For many real-world large-scale networks, it is infeasible to exactly retrieve the complete network structure to compute its largest eigenvalue. Instead, one usually has access to myopic, egocentric views of the network structure, also called egonets. In this paper, we propose a mathematical framework, based on algebraic graph theory and convex optimization, to study how local structural properties of the network constrain the interval of possible values in which the largest eigenvalue must lie. Based on this framework, we present a computationally efficient approach to find this interval from a collection of egonets. Our numerical simulations show that, for several social and communication networks, local structural properties of the network strongly constrain the location of the largest eigenvalue and the resulting spreading dynamics. From a practical point of view, our results can be used to dictate immunization strategies to tame the spreading of a virus, or to design network topologies that facilitate the spreading of information virally.

preprint2011arXiv

A Distributed Newton Method for Network Utility Maximization

Most existing work uses dual decomposition and subgradient methods to solve Network Utility Maximization (NUM) problems in a distributed manner, and these methods suffer from slow convergence. This work develops an alternative distributed Newton-type fast converging algorithm for solving network utility maximization problems with self-concordant utility functions. By using novel matrix splitting techniques, both primal and dual updates for the Newton step can be computed using iterative schemes in a decentralized manner with limited information exchange. Similarly, the stepsize can be obtained via an iterative consensus-based averaging scheme. We show that even when the Newton direction and the stepsize in our method are computed within some error (due to finite truncation of the iterative schemes), the resulting objective function value still converges superlinearly to an explicitly characterized error neighborhood. Simulation results demonstrate significant convergence rate improvement of our algorithm relative to the existing subgradient methods based on dual decomposition.

preprint2011arXiv

Analysis of Equilibria and Strategic Interaction in Complex Networks

This paper studies $n$-person simultaneous-move games with linear best response functions, where individuals interact within a given network structure. This class of games has been used to model various settings, such as public goods, belief formation, peer effects, and oligopoly. The purpose of this paper is to study the effect of the network structure on Nash equilibrium outcomes of this class of games. Bramoullé et al. derived conditions for uniqueness and stability of a Nash equilibrium in terms of the smallest eigenvalue of the adjacency matrix representing the network of interactions. Motivated by this result, we study how local structural properties of the network of interactions affect this eigenvalue, influencing game equilibria. In particular, we use algebraic graph theory and convex optimization to derive new bounds on the smallest eigenvalue in terms of the distribution of degrees, cycles, and other relevant substructures. We illustrate our results with numerical simulations involving online social networks.

preprint2011arXiv

From Local Measurements to Network Spectral Properties: Beyond Degree Distributions

It is well-known that the behavior of many dynamical processes running on networks is intimately related to the eigenvalue spectrum of the network. In this paper, we address the problem of inferring global information regarding the eigenvalue spectrum of a network from a set of local samples of its structure. In particular, we find explicit relationships between the so-called spectral moments of a graph and the presence of certain small subgraphs, also called motifs, in the network. Since the eigenvalues of the network have a direct influence on the network dynamical behavior, our result builds a bridge between local network measurements (i.e., the presence of small subgraphs) and global dynamical behavior (via the spectral moments). Furthermore, based on our result, we propose a novel decentralized scheme to compute the spectral moments of a network by aggregating local measurements of the network topology. Our final objective is to understand the relationships between the behavior of dynamical processes taking place in a large-scale complex network and its local topological properties.
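The link between spectral moments and small subgraphs can be checked on a toy graph. The sketch below relies on two standard identities for a simple undirected graph with adjacency matrix A: trace(A^2) equals the sum of the degrees, and trace(A^3) equals six times the number of triangles. The function names and the example graph are illustrative assumptions; the paper's own expressions cover more general moments and motifs.

```python
from itertools import combinations


def mat_mul(a, b):
    """Dense product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moment(adj, k):
    """k-th spectral moment, (1/n) * trace(A^k): the mean of the k-th
    powers of the adjacency eigenvalues."""
    power = adj
    for _ in range(k - 1):
        power = mat_mul(power, adj)
    return sum(power[i][i] for i in range(len(adj))) / len(adj)


def moments_from_local_counts(adj):
    """Second and third moments computed only from degrees and triangles."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    triangles = sum(1 for i, j, k in combinations(range(n), 3)
                    if adj[i][j] and adj[j][k] and adj[i][k])
    return sum(degrees) / n, 6 * triangles / n


# Hypothetical example: a triangle (nodes 0, 1, 2) plus a pendant node 3.
EXAMPLE = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
print(spectral_moment(EXAMPLE, 2), spectral_moment(EXAMPLE, 3))
print(moments_from_local_counts(EXAMPLE))
```

Both routes give the same numbers, but the second one needs only counts that each node can gather from its own neighborhood, which is what makes a decentralized aggregation scheme plausible.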

preprint2011arXiv

Networked estimation under information constraints

In this paper, we study estimation of potentially unstable linear dynamical systems when the observations are distributed over a network. We are interested in scenarios when the information exchange among the agents is restricted. In particular, we consider that each agent can exchange information with its neighbors only once per dynamical system evolution-step. Existing work with similar information-constraints is restricted to static parameter estimation, whereas the work on dynamical systems assumes a large number of information exchange iterations between every two consecutive system evolution steps. We show that when the agent communication network is sparsely-connected, the sparsity of the network plays a key role in the stability and performance of the underlying estimation algorithm. To this end, we introduce the notion of Network Tracing Capacity (NTC), which is defined as the largest two-norm of the system matrix that can be estimated with bounded error. Extending this to fully-connected networks or infinite information exchanges (per dynamical system evolution-step), we note that the NTC is infinite, i.e., any dynamical system can be estimated with bounded error. In short, the NTC characterizes the estimation capability of a sparse network by relating it to the evolution of the underlying dynamical system.

preprint2011arXiv

On Non-Bayesian Social Learning

We study a model of information aggregation and social learning recently proposed by Jadbabaie, Sandroni, and Tahbaz-Salehi, in which individual agents try to learn a correct state of the world by iteratively updating their beliefs using private observations and beliefs of their neighbors. No individual agent's private signal might be informative enough to reveal the unknown state. As a result, agents share their beliefs with others in their social neighborhood to learn from each other. At every time step each agent receives a private signal, and computes a Bayesian posterior as an intermediate belief. The intermediate belief is then averaged with the belief of neighbors to form the individual's belief at next time step. We find a set of minimal sufficient conditions under which the agents will learn the unknown state and reach consensus on their beliefs without any assumption on the private signal structure. The key enabler is a result that shows that using this update, agents will eventually forecast the indefinite future correctly.
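The update described in this abstract can be sketched in a few lines. Each agent forms a Bayesian posterior from its private signal, then mixes that intermediate belief with its neighbors' beliefs using fixed weights. The function names, the row-stochastic `weights` matrix, and the choice to mix in the neighbors' current (pre-update) beliefs are assumptions made for illustration.

```python
def bayes_update(belief, likelihoods):
    """Posterior over states from a prior `belief` and the likelihood of
    the observed signal under each state."""
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def social_learning_step(beliefs, weights, signal_likelihoods):
    """One round of the update for all agents.
    beliefs[i][s]: agent i's belief in state s.
    weights[i][j]: weight agent i puts on agent j (rows sum to one).
    signal_likelihoods[i][s]: likelihood of agent i's signal under state s."""
    n = len(beliefs)
    new_beliefs = []
    for i in range(n):
        posterior = bayes_update(beliefs[i], signal_likelihoods[i])
        mixed = [weights[i][i] * p for p in posterior]   # own Bayesian part
        for j in range(n):
            if j != i:
                for s in range(len(mixed)):
                    mixed[s] += weights[i][j] * beliefs[j][s]  # neighbor part
        new_beliefs.append(mixed)
    return new_beliefs


# Hypothetical two-agent, two-state example: only agent 0's signal is informative.
beliefs = [[0.5, 0.5], [0.5, 0.5]]
weights = [[0.5, 0.5], [0.5, 0.5]]
likelihoods = [[0.8, 0.2], [0.5, 0.5]]
updated = social_learning_step(beliefs, weights, likelihoods)
print(updated)
```

Because the weights in each row sum to one and every term is a probability vector, each updated belief remains a valid probability distribution.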

preprint2010arXiv

Distributed Control of the Laplacian Spectral Moments of a Network

It is well known that the eigenvalue spectrum of the Laplacian matrix of a network contains valuable information about the network structure and the behavior of many dynamical processes run on it. In this paper, we propose a fully decentralized algorithm that iteratively modifies the structure of a network of agents in order to control the moments of the Laplacian eigenvalue spectrum. Although the individual agents have knowledge of their local network structure only (i.e., myopic information), they are collectively able to aggregate this local information and decide which links are most beneficial to add or remove at each time step. Our approach relies on gossip algorithms to distributively compute the spectral moments of the Laplacian matrix, as well as to ensure network connectivity in the presence of link deletions. We illustrate our approach in nontrivial computer simulations and show that a good final approximation of the spectral moments of the target Laplacian matrix is achieved for many cases of interest.
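For readers unfamiliar with the quantity being controlled, the k-th spectral moment of the Laplacian L of an n-node graph is commonly defined as (1/n) tr(L^k), which equals the average of the k-th powers of the eigenvalues. The sketch below computes it centrally from an adjacency matrix. It is not the decentralized gossip algorithm of the paper, and the function names are hypothetical.

```python
def laplacian(adjacency):
    """Graph Laplacian L = D - A for a simple undirected graph (no self-loops)."""
    n = len(adjacency)
    return [
        [(sum(adjacency[i]) if i == j else 0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Dense product of two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def spectral_moments(matrix, max_order):
    """Return [m_1, ..., m_max_order] with m_k = trace(matrix**k) / n."""
    n = len(matrix)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    moments = []
    for _ in range(max_order):
        power = matmul(power, matrix)
        moments.append(sum(power[i][i] for i in range(n)) / n)
    return moments


# Triangle graph: Laplacian eigenvalues are 0, 3, 3.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(spectral_moments(laplacian(triangle), 3))  # [2.0, 6.0, 18.0]
```

The first moment is the average degree, which is one reason low-order moments can be obtained from purely local information.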

preprint2010arXiv

Moment-Based Analysis of Synchronization in Small-World Networks of Oscillators

In this paper, we investigate synchronization in a small-world network of coupled nonlinear oscillators. This network is constructed by introducing random shortcuts in a nearest-neighbor ring. The local stability of the synchronous state is closely related to the support of the eigenvalue distribution of the Laplacian matrix of the network. We introduce, for the first time, analytical expressions for the first three moments of the eigenvalue distribution of the Laplacian matrix as a function of the probability of shortcuts and the connectivity of the underlying nearest-neighbor coupled ring. We apply these expressions to estimate the spectral support of the Laplacian matrix in order to predict synchronization in small-world networks. We verify the efficiency of our predictions with numerical simulations.

preprint2010arXiv

On Asymptotic Consensus Value in Directed Random Networks

We study the asymptotic properties of distributed consensus algorithms over switching directed random networks. More specifically, we focus on consensus algorithms over independent and identically distributed, directed random graphs, where each agent can communicate with any other agent with some exogenously specified probability. While different aspects of consensus algorithms over random switching networks have been widely studied, a complete characterization of the distribution of the asymptotic value for general asymmetric random consensus algorithms remains an open problem. In this paper, we derive closed-form expressions for the mean and an upper bound for the variance of the asymptotic consensus value, when the underlying network evolves according to an i.i.d. directed random graph process. We also provide numerical simulations that illustrate our results.
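The setting in this abstract can be explored with a small simulation. The sketch below draws a fresh directed random graph at every step and applies one averaging update. The equal-weight averaging rule, the function name, and all parameter names are assumptions made for illustration; the abstract does not specify the weights used in the paper.

```python
import random


def random_consensus_value(initial_values, edge_probability, steps, seed=0):
    """Iterate x(k+1) = W(k) x(k), with W(k) built from a fresh random digraph.

    Assumption: agent i averages its own value and the values of the agents
    it hears from in this round, all with equal weight.
    """
    rng = random.Random(seed)
    values = list(initial_values)
    n = len(values)
    for _ in range(steps):
        updated = []
        for i in range(n):
            # Each directed link j -> i is present independently this round.
            heard = [
                values[j]
                for j in range(n)
                if j != i and rng.random() < edge_probability
            ]
            heard.append(values[i])  # self-weight keeps each row stochastic
            updated.append(sum(heard) / len(heard))
        values = updated
    return values
```

Running the function with different seeds from the same initial values generally gives different limits, which is the sense in which the asymptotic consensus value is a random variable with a mean and a variance.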

preprint2010arXiv

Spectral Analysis of Virus Spreading in Random Geometric Networks

In this paper, we study the dynamics of a viral spreading process in random geometric graphs (RGGs). The spreading of the viral process we consider in this paper is closely related to the eigenvalues of the adjacency matrix of the graph. We deduce new explicit expressions for all the moments of the eigenvalue distribution of the adjacency matrix as a function of the spatial density of nodes and the radius of connection. We apply these expressions to study the behavior of the viral infection in an RGG. Based on our results, we deduce an analytical condition that can be used to design RGGs in order to tame an initial viral infection. Numerical simulations are in accordance with our analytical predictions.
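A numerical counterpart to this analysis is to sample an RGG and estimate its largest adjacency eigenvalue. The sketch below assumes nodes placed uniformly in the unit square and uses power iteration rather than the moment expressions from the paper; the function names are hypothetical. In mean-field SIS epidemic models, a widely used condition is that an infection with rate β and recovery rate δ dies out when β/δ < 1/λ_max(A). Whether the paper's condition takes exactly this form is not stated in the abstract, so it is treated as an assumption here.

```python
import math
import random


def random_geometric_graph(num_nodes, radius, seed=0):
    """Place nodes uniformly in the unit square; link pairs closer than `radius`."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(num_nodes)]
    adjacency = [[0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            if math.dist(points[i], points[j]) < radius:
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency


def spectral_radius(adjacency, iterations=500):
    """Estimate the largest adjacency eigenvalue by power iteration on A + I.

    The identity shift avoids the oscillation that plain power iteration
    shows on bipartite graphs; the shift is subtracted at the end.
    """
    n = len(adjacency)
    vec = [1.0] * n
    for _ in range(iterations):
        nxt = [
            vec[i] + sum(adjacency[i][j] * vec[j] for j in range(n))
            for i in range(n)
        ]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient of A + I at the unit vector, minus the shift.
    shifted = [
        vec[i] + sum(adjacency[i][j] * vec[j] for j in range(n))
        for i in range(n)
    ]
    return sum(s * v for s, v in zip(shifted, vec)) - 1.0
```

Under the assumed SIS condition, increasing the connection radius raises the spectral radius and therefore lowers the infection rate the network can tolerate.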

preprint2010arXiv

Spectral Control of Mobile Robot Networks

The eigenvalue spectrum of the adjacency matrix of a network is closely related to the behavior of many dynamical processes run over the network. In the field of robotics, this spectrum has important implications in many problems that require some form of distributed coordination within a team of robots. In this paper, we propose a continuous-time control scheme that modifies the structure of a position-dependent network of mobile robots so that it achieves a desired set of adjacency eigenvalues. For this, we employ a novel abstraction of the eigenvalue spectrum by means of the adjacency matrix spectral moments. Since the eigenvalue spectrum is uniquely determined by its spectral moments, this abstraction provides a way to indirectly control the eigenvalues of the network. Our construction is based on artificial potentials that capture the distance of the network's spectral moments to their desired values. Minimization of these potentials is via a gradient descent closed-loop system that, under certain convexity assumptions, ensures convergence of the network topology to one with the desired set of moments and, therefore, eigenvalues. We illustrate our approach in nontrivial computer simulations.

79 published item(s)

Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs

Federated Optimization of Smooth Loss Functions

An Optimal Transport Approach to Personalized Federated Learning

Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity

Byzantine-Robust Federated Linear Bandits

Current Implicit Policies May Not Eradicate COVID-19

Gradient Descent for Low-Rank Functions

Inference in Opinion Dynamics under Social Pressure

Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective

On Convergence of Gradient Descent Ascent: A Tight Local Analysis

Unifying Epidemic Models with Mixtures

Network Group Testing

Time varying regression with hidden linear dynamics

A Distributed Cubic-Regularized Newton Method for Smooth Convex Optimization over Networks

A Separation Theorem for Joint Sensor and Actuator Scheduling with Guaranteed Performance Bounds

Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions

Deterministic and Randomized Actuator Scheduling With Guaranteed Performance Bounds

Estimation of Skill Distributions

FedPAQ: A Communication-Efficient Federated Learning Method with Periodic Averaging and Quantization

GAT-GMM: Generative Adversarial Training for Gaussian Mixture Models

LQG Control and Sensing Co-Design

Network Inference from Consensus Dynamics with Unknown Parameters

Robust Federated Learning: The Case of Affine Distribution Shifts

Sensing-Constrained LQG Control

Sensor Placement for Optimal Kalman Filtering: Fundamental Limits, Submodularity, and Algorithms

Why gradient clipping accelerates training: A theoretical justification for adaptivity

Bayesian Decision Making in Groups is Hard

Non-Bayesian Social Learning with Uncertain Models

Random Walks on Simplicial Complexes and the normalized Hodge 1-Laplacian

A Distributed Newton Method for Large Scale Consensus Optimization

An Exact Distributed Newton Method for Reinforcement Learning

Bayesian Heuristics for Group Decisions

Bayesian Learning without Recall

Bio-Inspired Framework for Allocation of Protection Resources in Cyber-Physical Networks

Distributed Estimation and Learning over Heterogeneous Networks

Distributed Estimation of Dynamic Parameters: Regret Analysis

Distributed Online Optimization in Dynamic Environments Using Mirror Descent

Learning without recall in directed circles and rooted trees

Near-Optimal Sensor Scheduling for Batch State Estimation: Complexity, Algorithms, and Limits

Online Optimization in Dynamic Environments: Improved Regret Rates for Strongly Convex Problems

Scheduling Nonlinear Sensors for Stochastic Process Estimation

A Fast Distributed Solver for Symmetric Diagonally Dominant Linear Equations

Competitive Diffusion in Social Networks: Quality or Seeding?

Distributed Resource Allocation for Epidemic control

Distributed SDDM Solvers: Theory & Applications

Fast, Accurate Second Order Methods for Network Optimization

Finite-time Analysis of the Distributed Detection Problem

Learning without Recall by Random Walks on Directed Graphs

Learning without Recall: A Case for Log-Linear Learning

Minimal Actuator Placement with Optimal Control Constraints

Online Optimization: Competing with Dynamic Comparators

Optimal distributed control for platooning via sparse coprime factorizations

Switching to Learn

Controllability and Fraction of Leaders in Infinite Network

Distributed Detection: Finite-time Analysis and Impact of Network Topology

On the Degree Distribution of Pólya Urn Graph Processes

Optimal Budget Allocation in Social Networks: Quality or Seeding

Optimal Resource Allocation for Network Protection Against Spreading Processes

A Novel Description of Linear Time-Invariant Networks via Structured Coprime Factorizations

Accelerated Backpressure Algorithm

Bayesian Quadratic Network Game Filters

Exponentially Fast Parameter Estimation in Networks Using Distributed Dual Averaging

Online Learning of Dynamic Parameters in Social Networks

Optimal Vaccine Allocation to Control Epidemic Outbreaks in Arbitrary Networks

A Distributed Line Search for Network Optimization

Moment-Based Spectral Analysis of Large-Scale Networks Using Local Structural Information

Spectral Design of Dynamic Networks via Local Operations

Structural Analysis of Laplacian Spectral Properties of Large-Scale Networks

Structural Analysis of Viral Spreading Processes in Social and Communication Networks Using Egonets

A Distributed Newton Method for Network Utility Maximization

Analysis of Equilibria and Strategic Interaction in Complex Networks

From Local Measurements to Network Spectral Properties: Beyond Degree Distributions

Networked estimation under information constraints

On Non-Bayesian Social Learning