Researcher profile

S. Rasoul Etesami

S. Rasoul Etesami contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
6 works
0 followers
11 topics
4 close collaborators

Published work

6 published items

Preprint · 2024 · arXiv

Local Environment Poisoning Attacks on Federated Reinforcement Learning

Federated learning (FL) has become a popular tool for solving traditional Reinforcement Learning (RL) tasks. The multi-agent structure addresses the data-hungry nature of traditional RL, while the federated mechanism protects the data privacy of individual agents. However, the federated mechanism also exposes the system to poisoning by malicious agents that can mislead the trained policy. Despite the advantages of FL, the vulnerability of Federated Reinforcement Learning (FRL) has not been well studied. In this work, we propose a general framework that characterizes FRL poisoning as an optimization problem and design a poisoning protocol that can be applied to policy-based FRL. Our framework can also be extended to FRL with actor-critic as the local RL algorithm by training a pair of private and public critics. We prove that our method can strictly hurt the global objective. We verify the effectiveness of our poisoning through extensive experiments targeting mainstream RL algorithms across OpenAI Gym environments covering a wide range of difficulty levels. In these experiments, we compare our proposed framework against clean training and baseline poisoning methods. The results show that the proposed framework successfully poisons FRL systems, reducing performance across various environments more effectively than the baseline methods. Our work provides new insights into the vulnerability of FL in RL training and poses new challenges for designing robust FRL algorithms.

Preprint · 2022 · arXiv

Maximizing Convergence Time in Network Averaging Dynamics Subject to Edge Removal

We consider the consensus interdiction problem (CIP), in which the goal is to maximize the convergence time of consensus averaging dynamics subject to removing a limited number of network edges. We first show that CIP can be cast as an effective resistance interdiction problem (ERIP), in which the goal is to remove a limited number of network edges to maximize the effective resistance between a source node and a sink node. We show that ERIP is strongly NP-hard, even for bipartite graphs of diameter three with fixed source/sink edges, and establish the same hardness result for the CIP. We then show that neither ERIP nor CIP can be approximated to within a (nearly) polynomial factor, assuming the exponential time hypothesis. Subsequently, we devise a polynomial-time $mn$-approximation algorithm for the ERIP that depends only on the number of nodes $n$ and the number of edges $m$, and is independent of the size of the edge resistances. Finally, using a quadratic program formulation for the CIP, we devise an iterative approximation algorithm that finds a first-order stationary solution for the CIP, and we evaluate its performance numerically.
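The quantity that ERIP tries to maximize can be illustrated with a minimal sketch. The code below is not the paper's approximation algorithm; it only computes the effective resistance between two nodes by grounding the sink and solving the reduced Laplacian system, and shows that removing one edge of a triangle raises it. The function name `effective_resistance`, the edge format, and the example graph are all assumptions made for illustration, and the graph is assumed to stay connected.

```python
def effective_resistance(n, edges, source, sink):
    """Effective resistance between source and sink (source != sink).

    edges: list of (u, v, resistance) tuples on nodes 0..n-1.
    Assumes the graph is connected; otherwise the system is singular.
    """
    # Weighted graph Laplacian built from conductances 1/r.
    lap = [[0.0] * n for _ in range(n)]
    for u, v, r in edges:
        c = 1.0 / r
        lap[u][u] += c
        lap[v][v] += c
        lap[u][v] -= c
        lap[v][u] -= c
    # Ground the sink (drop its row/column) and inject unit current at source.
    keep = [i for i in range(n) if i != sink]
    a = [[lap[i][j] for j in keep] + [1.0 if i == source else 0.0]
         for i in keep]
    m = len(keep)
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(m):
        pivot = max(range(col, m), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(m):
            if row != col:
                factor = a[row][col] / a[col][col]
                for k in range(col, m + 1):
                    a[row][k] -= factor * a[col][k]
    # The source potential under unit current equals the effective resistance.
    s = keep.index(source)
    return a[s][m] / a[s][s]


# Hypothetical example: a triangle with unit resistances.
triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
before = effective_resistance(3, triangle, source=0, sink=1)
# Interdict the direct edge (0, 1); current must now take the two-hop path.
after = effective_resistance(3, triangle[1:], source=0, sink=1)
print(before, after)
```

In the triangle, the direct edge sits in parallel with a two-edge path, so removing it leaves only the longer path and the resistance between nodes 0 and 1 grows.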

Preprint · 2020 · arXiv

Duality and Stability in Complex Multiagent State-Dependent Network Dynamics

Despite significant progress on stability analysis of conventional multiagent networked systems with weakly coupled state-network dynamics, most existing results fall short in addressing multiagent systems with highly coupled state-network dynamics. Motivated by numerous applications of such dynamics, in our previous work [1], we initiated a new direction for stability analysis of such systems that uses a sequential optimization framework. Building upon that, in this paper, we extend our results by providing another angle on multiagent network dynamics from a duality perspective, which allows us to view the network structure as dual variables of a constrained nonlinear program. Leveraging that idea, we show that the evolution of the coupled state-network multiagent dynamics can be viewed as iterates of a primal-dual algorithm for a static constrained optimization/saddle-point problem. This view bridges the Lyapunov stability of state-dependent network dynamics and frequently used optimization techniques such as block coordinate descent, mirror descent, the Newton method, and the subgradient method. As a result, we develop a systematic framework for analyzing the Lyapunov stability of state-dependent network dynamics using techniques from nonlinear optimization. Finally, we support our theoretical results through numerical simulations from social science.

Preprint · 2020 · arXiv

Optimal versus Nash Equilibrium Computation for Networked Resource Allocation

Motivated by emerging resource allocation and data placement problems such as web caches and peer-to-peer systems, we study a class of resource allocation problems over a network of agents (nodes). In this model, nodes can store only a limited number of resources while accessing the remaining ones through their closest neighbors. We consider this problem under both optimization and game-theoretic frameworks. In the case of optimal resource allocation, we first show that when there are only $k=2$ resources, the optimal allocation can be found efficiently in $O(n^2\log n)$ steps, where $n$ denotes the total number of nodes. However, for $k>2$ this problem becomes NP-hard with no polynomial-time approximation algorithm with a performance guarantee better than $1+\frac{1}{102k^2}$, even under metric access costs. We then provide a 3-approximation algorithm for the optimal resource allocation that runs in linear time $O(n)$. Subsequently, we look at this problem under a selfish setting formulated as a noncooperative game and provide a 3-approximation algorithm for obtaining its pure Nash equilibria under metric access costs. We then establish an equivalence between the set of pure Nash equilibria and flip-optimal solutions of the Max-$k$-Cut problem over a specific weighted complete graph. Using this reduction, we show that finding the lexicographically smallest Nash equilibrium for $k>2$ is NP-hard, and provide an algorithm to find it in $O(n^3 2^n)$ steps. While the reduction to weighted Max-$k$-Cut suggests that finding a pure Nash equilibrium using best response dynamics might be PLS-hard, it allows us to use tools from quadratic programming to devise more systematic algorithms for obtaining Nash equilibrium points.
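The notion of a flip-optimal Max-k-Cut solution mentioned in the abstract can be made concrete with a small sketch. This is a generic single-node local search, not the paper's reduction or its specific weighted complete graph: the weight matrix, the function names, and the starting assignment are hypothetical. An assignment is flip-optimal when no single node can raise the cut weight by moving to another part, which mirrors a best-response step in the game-theoretic reading.

```python
def cut_weight(weights, assignment):
    """Total weight of edges whose endpoints lie in different parts."""
    n = len(assignment)
    return sum(weights[i][j]
               for i in range(n) for j in range(i + 1, n)
               if assignment[i] != assignment[j])


def weight_into_parts(weights, k, assignment, i):
    """Weight between node i and the other nodes of each part."""
    inside = [0] * k
    for j, part in enumerate(assignment):
        if j != i:
            inside[part] += weights[i][j]
    return inside


def flip_local_search(weights, k, assignment):
    """Move single nodes between parts until no move increases the cut."""
    assignment = list(assignment)
    improved = True
    while improved:
        improved = False
        for i in range(len(assignment)):
            inside = weight_into_parts(weights, k, assignment, i)
            # Joining the part with the least attached weight cuts the most.
            best = min(range(k), key=lambda p: inside[p])
            if inside[best] < inside[assignment[i]]:
                assignment[i] = best
                improved = True
    return assignment


def is_flip_optimal(weights, k, assignment):
    """True if no single-node move strictly increases the cut weight."""
    for i in range(len(assignment)):
        inside = weight_into_parts(weights, k, assignment, i)
        if min(inside) < inside[assignment[i]]:
            return False
    return True


# Hypothetical symmetric integer weights on a complete graph of 4 nodes.
weights = [
    [0, 3, 1, 2],
    [3, 0, 4, 1],
    [1, 4, 0, 5],
    [2, 1, 5, 0],
]
start = [0, 0, 0, 0]
result = flip_local_search(weights, 3, start)
print(result, cut_weight(weights, result))
```

Each accepted move strictly increases the cut weight, so the search terminates on finite instances; integer weights are used here to keep the comparisons exact.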

Preprint · 2020 · arXiv

Toward Optimal Adversarial Policies in the Multiplicative Learning System with a Malicious Expert

We consider a learning system based on the conventional multiplicative weight (MW) rule that combines experts' advice to predict a sequence of true outcomes. It is assumed that one of the experts is malicious and aims to impose the maximum loss on the system. The loss of the system is naturally defined to be the aggregate absolute difference between the sequence of predicted outcomes and the true outcomes. We consider this problem under both offline and online settings. In the offline setting, where the malicious expert must choose its entire sequence of decisions a priori, we show, somewhat surprisingly, that a simple greedy policy of always reporting a false prediction is asymptotically optimal with an approximation ratio of $1+O(\sqrt{\frac{\ln N}{N}})$, where $N$ is the total number of prediction stages. In particular, we describe a policy that closely resembles the structure of the optimal offline policy. For the online setting, where the malicious expert can adaptively make its decisions, we show that the optimal online policy can be efficiently computed by solving a dynamic program in $O(N^3)$ time. Our results provide a new direction for assessing the vulnerability of commonly used learning algorithms to adversarial attacks where the threat is an integral part of the system.
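The offline setting can be sketched with a toy simulation. The model below is a simplifying assumption, not the paper's exact formulation: there are only two experts, the honest one always reports the true binary outcome, weights are updated as `w *= (1 - eta) ** error`, and `eta = 0.5` is an arbitrary choice. The function names are hypothetical. The sketch compares the greedy "always lie" sequence with a brute-force search over every fixed report sequence.

```python
from itertools import product


def mw_system_loss(outcomes, malicious_reports, eta=0.5):
    """Aggregate absolute loss of a two-expert multiplicative-weight system.

    Assumption: expert 0 is honest and always reports the true outcome;
    expert 1 is malicious and reports the given fixed (offline) sequence.
    """
    w_honest, w_malicious = 1.0, 1.0
    total_loss = 0.0
    for y, report in zip(outcomes, malicious_reports):
        # The system predicts the weight-averaged advice of both experts.
        prediction = (w_honest * y + w_malicious * report) / (w_honest + w_malicious)
        total_loss += abs(prediction - y)
        # Only the malicious expert can err, so only its weight shrinks.
        w_malicious *= (1.0 - eta) ** abs(report - y)
    return total_loss


def best_offline_loss(outcomes, eta=0.5):
    """Brute-force the most damaging fixed binary report sequence."""
    return max(mw_system_loss(outcomes, reports, eta)
               for reports in product([0, 1], repeat=len(outcomes)))


outcomes = [1, 0, 1, 1, 0, 1]
always_lie = [1 - y for y in outcomes]
greedy_loss = mw_system_loss(outcomes, always_lie)
optimal_loss = best_offline_loss(outcomes)
print(greedy_loss, optimal_loss)
```

In this simplified model a truthful report never rebuilds the malicious expert's weight, so always lying matches the brute-force optimum on this short horizon. The abstract's guarantee is asymptotic and stated for its own model, so this coincidence should not be read as a general result.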

Preprint · 2019 · arXiv

Smart Routing of Electric Vehicles for Load Balancing in Smart Grids

Electric vehicles (EVs) are expected to be a major component of the smart grid. The rapid proliferation of EVs will introduce an unprecedented load on the existing electric grid due to the charging/discharging behavior of the EVs, thus motivating the need for novel approaches for routing EVs across the grid. In this paper, a novel game-theoretic framework for smart routing of EVs within the smart grid is proposed. The goal of this framework is to balance the electricity load across the grid while taking into account the traffic congestion and the waiting time at charging stations. The EV routing problem is formulated as a noncooperative game. For this game, it is shown that selfish behavior of EVs will result in a pure-strategy Nash equilibrium with the price of anarchy upper bounded by the variance of the ground load induced by the residential, industrial, or commercial users. Moreover, the results are extended to capture the stochastic nature of the induced ground load as well as the subjective behavior of EV owners, modeled using notions from the behavioral framework of prospect theory. Simulation results provide new insights into more efficient energy pricing at charging stations under more realistic grid conditions.