Source author record

Yevgeniy Vorobeychik

Yevgeniy Vorobeychik appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Science and Game Theory Machine Learning Artificial Intelligence Cryptography and Security Computer Vision Social and Information Networks cond-mat.dis-nn Discrete Mathematics econ.TH Multiagent Systems physics.soc-ph Computational Engineering, Finance, and Science eess.IV Information Retrieval Networking and Internet Architecture nlin.AO nlin.CD nlin.CG q-fin.EC

Catalog footprint

What is connected

40works

19topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

COSMOS: Model-Agnostic Personalized Federated Learning with Clustered Server Models and Pseudo-Label-Only Communication

Federated learning (FL) in heterogeneous environments remains challenging because client models often differ in both architecture and data distribution. While recent approaches attempt to address this challenge through client clustering and knowledge distillation, simultaneously handling architectural and statistical heterogeneity remains difficult. We introduce COSMOS, a model-agnostic framework that enables server-side personalization using only pseudo-label communication. Clients train local models and predict on the public data; the server clusters clients by prediction similarity, trains a cluster-specific model for each group using its own compute, and distills the resulting models back to clients. We provide the first theoretical analysis showing that distillation from the learned cluster models can yield exponential personalization risk contraction, going beyond the convergence-to-stationarity guarantees typically provided in model-agnostic FL. Experiments across benchmarks demonstrate that COSMOS consistently outperforms all model-agnostic FL baselines while remaining competitive with state-of-the-art personalized FL methods. More broadly, our results highlight personalized server-side learning with pseudo-labels as a promising paradigm for scalable and model-agnostic federated learning in highly heterogeneous environments.

preprint2026arXiv

DiffVAS: Diffusion-Guided Visual Active Search in Partially Observable Environments

Visual active search (VAS) has been introduced as a modeling framework that leverages visual cues to direct aerial (e.g., UAV-based) exploration and pinpoint areas of interest within extensive geospatial regions. Potential applications of VAS include detecting hotspots for rare wildlife poaching, aiding search-and-rescue missions, and uncovering illegal trafficking of weapons, among other uses. Previous VAS approaches assume that the entire search space is known upfront, which is often unrealistic due to constraints such as a restricted field of view and high acquisition costs, and they typically learn policies tailored to specific target objects, which limits their ability to search for multiple target categories simultaneously. In this work, we propose DiffVAS, a target-conditioned policy that searches for diverse objects simultaneously according to task requirements in partially observable environments, which advances the deployment of visual active search policies in real-world applications. DiffVAS leverages a diffusion model to reconstruct the entire geospatial area from sequentially observed partial glimpses, which enables a target-conditioned reinforcement learning-based planning module to effectively reason and guide subsequent search steps. Extensive experiments demonstrate that DiffVAS excels in searching diverse objects in partially observable environments, significantly surpassing state-of-the-art methods on several datasets.

preprint2026arXiv

Learned Neighbor Trust for Collaborative Deployment in Model-Agnostic Decentralized Learning

Many decentralized distillation methods are designed around training-time coordination, yet deploy each node in isolation even when more capable neighbors remain available at inference time. This is an incomplete objective for settings such as IoT, where devices are heterogeneous, data is scarce and skewed, and a node's strongest neighbors may far exceed its own local capacity. We study how nodes should train so that their predictions compose well at deployment, and how each node should learn whom to trust. Under a server-free, model-agnostic protocol where nodes exchange only queries and soft predictions, we propose Learned Neighbor Trust (LNTrust) wherein each node learns a compact trust function over its neighborhood from local validation evidence. This trust function gates auxiliary distillation during training and defines a deployment ensemble at inference, so that collaboration learned during training transfers directly to deployment. Across datasets and topologies, LNTrust improves deployed accuracy over the strongest output-only baseline by large margins while using significantly less communication than previous methods.

preprint2026arXiv

Low Rank Adaptation for Adversarial Perturbation

Low-Rank Adaptation (LoRA), which leverages the insight that model updates typically reside in a low-dimensional space, has significantly improved the training efficiency of Large Language Models (LLMs) by updating neural network layers using low-rank matrices. Since the generation of adversarial examples is an optimization process analogous to model training, this naturally raises the question: Do adversarial perturbations exhibit a similar low-rank structure? In this paper, we provide both theoretical analysis and extensive empirical investigation across various attack methods, model architectures, and datasets to show that adversarial perturbations indeed possess an inherently low-rank structure. This insight opens up new opportunities for improving both adversarial attacks and defenses. We mainly focus on leveraging this low-rank property to improve the efficiency and effectiveness of black-box adversarial attacks, which often suffer from excessive query requirements. Our method follows a two-step approach. First, we use a reference model and auxiliary data to guide the projection of gradients into a low-dimensional subspace. Next, we confine the perturbation search in black-box attacks to this low-rank subspace, significantly improving the efficiency and effectiveness of the adversarial attacks. We evaluated our approach across a range of attack methods, benchmark models, datasets, and threat models. The results demonstrate substantial and consistent improvements in the performance of our low-rank adversarial attacks compared to conventional methods.

preprint2026arXiv

Residual-PAC Privacy: Automatic Privacy Control Beyond the Gaussian Barrier

The Probably Approximately Correct (PAC) Privacy framework [46] provides a powerful instance-based methodology to preserve privacy in complex data-driven systems. Existing PAC Privacy algorithms (we call them Auto-PAC) rely on a Gaussian mutual information upper bound. However, we show that the upper bound obtained by these algorithms is tight if and only if the perturbed mechanism output is jointly Gaussian with independent Gaussian noise. We propose two approaches for addressing this issue. First, we introduce two tractable post-processing methods for Auto-PAC, based on Donsker-Varadhan representation and sliced Wasserstein distances. However, the result still leaves wasted privacy budget. To address this issue more fundamentally, we introduce Residual-PAC (R-PAC) Privacy, an f-divergence-based measure to quantify privacy that remains after adversarial inference. To implement R-PAC Privacy in practice, we propose a Stackelberg Residual-PAC (SR-PAC) privatization mechanism, a game-theoretic framework that selects optimal noise distributions through convex bilevel optimization. Our approach achieves efficient privacy budget utilization for arbitrary data distributions and naturally composes when multiple mechanisms access the dataset. Our experiments demonstrate that SR-PAC consistently obtains a better privacy-utility tradeoff than both PAC and differential privacy baselines.

preprint2023arXiv

Robust Deep Reinforcement Learning through Bootstrapped Opportunistic Curriculum

Despite considerable advances in deep reinforcement learning, it has been shown to be highly vulnerable to adversarial perturbations to state observations. Recent efforts that have attempted to improve adversarial robustness of reinforcement learning can nevertheless tolerate only very small perturbations, and remain fragile as perturbation size increases. We propose Bootstrapped Opportunistic Adversarial Curriculum Learning (BCL), a novel flexible adversarial curriculum learning framework for robust reinforcement learning. Our framework combines two ideas: conservatively bootstrapping each curriculum phase with highest quality solutions obtained from multiple runs of the previous phase, and opportunistically skipping forward in the curriculum. In our experiments we show that the proposed BCL framework enables dramatic improvements in robustness of learned policies to adversarial perturbations. The greatest improvement is for Pong, where our framework yields robustness to perturbations of up to 25/255; in contrast, the best existing approach can only tolerate adversarial noise up to 5/255. Our code is available at: https://github.com/jlwu002/BCL.

preprint2022arXiv

A Game-Theoretic Approach for Hierarchical Epidemic Control

We design and analyze a multi-level game-theoretic model of hierarchical policy interventions for epidemic control, such as those in response to the COVID-19 pandemic. Our model captures the potentially mismatched priorities among a hierarchy of policy-makers (e.g., federal, state, and local governments) with respect to two cost components that have opposite dependence on the policy strength -- post-intervention infection rates and the socio-economic cost of policy implementation. Additionally, our model includes a crucial third factor in decisions: a cost of non-compliance with the policy-maker immediately above in the hierarchy, such as non-compliance of counties with state-level policies. We propose two novel algorithms for approximating solutions to such games. The first is based on best response dynamics (BRD), and exploits the tree structure of the game. The second combines quadratic integer programming (QIP), which enables us to collapse the two lowest levels of the game, with best response dynamics. Through extensive experiments, we show that our QIP-based approach significantly outperforms the BRD algorithm both in running time and the quality of equilibrium solutions. Finally, we apply the QIP-based algorithm to experiments based on both synthetic and real-world data under various parameter configurations and analyze the resulting (approximate) equilibria to gain insight into the impact of decentralization on overall welfare (measured as the negative sum of costs) as well as emergent properties like free-riding and fairness in cost distribution among policy-makers.

preprint2022arXiv

A Rotating Proposer Mechanism for Team Formation

We present a rotating proposer mechanism for team formation, which implements a Pareto efficient subgame perfect Nash equilibrium of an extensive-form team formation game.

preprint2022arXiv

Adversarial Robustness of Deep Sensor Fusion Models

We experimentally study the robustness of deep camera-LiDAR fusion architectures for 2D object detection in autonomous driving. First, we find that the fusion model is usually both more accurate, and more robust against single-source attacks than single-sensor deep neural networks. Furthermore, we show that without adversarial training, early fusion is more robust than late fusion, whereas the two perform similarly after adversarial training. However, we note that single-channel adversarial training of deep fusion is often detrimental even to robustness. Moreover, we observe cross-channel externalities, where single-channel adversarial training reduces robustness to attacks on the other channel. Additionally, we observe that the choice of adversarial model in adversarial training is critical: using attacks restricted to cars' bounding boxes is more effective in adversarial training and exhibits less significant cross-channel externalities. Finally, we find that joint-channel adversarial training helps mitigate many of the issues above, but does not significantly boost adversarial robustness.

preprint2022arXiv

Certifying Safety in Reinforcement Learning under Adversarial Perturbation Attacks

Function approximation has enabled remarkable advances in applying reinforcement learning (RL) techniques in environments with high-dimensional inputs, such as images, in an end-to-end fashion, mapping such inputs directly to low-level control. Nevertheless, these have proved vulnerable to small adversarial input perturbations. A number of approaches for improving or certifying robustness of end-to-end RL to adversarial perturbations have emerged as a result, focusing on cumulative reward. However, what is often at stake in adversarial scenarios is the violation of fundamental properties, such as safety, rather than the overall reward that combines safety with efficiency. Moreover, properties such as safety can only be defined with respect to true state, rather than the high-dimensional raw inputs to end-to-end policies. To disentangle nominal efficiency and adversarial safety, we situate RL in deterministic partially-observable Markov decision processes (POMDPs) with the goal of maximizing cumulative reward subject to safety constraints. We then propose a partially-supervised reinforcement learning (PSRL) framework that takes advantage of an additional assumption that the true state of the POMDP is known at training time. We present the first approach for certifying safety of PSRL policies under adversarial input perturbations, and two adversarial training approaches that make direct use of PSRL. Our experiments demonstrate both the efficacy of the proposed approach for certifying safety in adversarial environments, and the value of the PSRL framework coupled with adversarial training in improving certified safety while preserving high nominal reward and high-quality predictions of true state.

preprint2022arXiv

Computing Equilibria in Binary Networked Public Goods Games

Public goods games study the incentives of individuals to contribute to a public good and their behaviors in equilibria. In this paper, we examine a specific type of public goods game where players are networked and each has binary actions, and focus on the algorithmic aspects of such games. First, we show that checking the existence of a pure-strategy Nash equilibrium is NP-complete. We then identify tractable instances based on restrictions of either utility functions or of the underlying graphical structure. In certain cases, we also show that we can efficiently compute a socially optimal Nash equilibrium. Finally, we propose a heuristic approach for computing approximate equilibria in general binary networked public goods games, and experimentally demonstrate its effectiveness.

preprint2022arXiv

CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing

As reinforcement learning (RL) has achieved great success and been even adopted in safety-critical domains such as autonomous vehicles, a range of empirical studies have been conducted to improve its robustness against adversarial attacks. However, how to certify its robustness with theoretical guarantees still remains challenging. In this paper, we present the first unified framework CROP (Certifying Robust Policies for RL) to provide robustness certification on both action and reward levels. In particular, we propose two robustness certification criteria: robustness of per-state actions and lower bound of cumulative rewards. We then develop a local smoothing algorithm for policies derived from Q-functions to guarantee the robustness of actions taken along the trajectory; we also develop a global smoothing algorithm for certifying the lower bound of a finite-horizon cumulative reward, as well as a novel local smoothing algorithm to perform adaptive search in order to obtain tighter reward certification. Empirically, we apply CROP to evaluate several existing empirically robust RL algorithms, including adversarial training and different robust regularization, in four environments (two representative Atari games, Highway, and CartPole). Furthermore, by evaluating these algorithms against adversarial attacks, we demonstrate that our certification are often tight. All experiment results are available at website https://crop-leaderboard.github.io.

preprint2022arXiv

Learning Generative Deception Strategies in Combinatorial Masking Games

Deception is a crucial tool in the cyberdefence repertoire, enabling defenders to leverage their informational advantage to reduce the likelihood of successful attacks. One way deception can be employed is through obscuring, or masking, some of the information about how systems are configured, increasing attacker's uncertainty about their targets. We present a novel game-theoretic model of the resulting defender-attacker interaction, where the defender chooses a subset of attributes to mask, while the attacker responds by choosing an exploit to execute. The strategies of both players have combinatorial structure with complex informational dependencies, and therefore even representing these strategies is not trivial. First, we show that the problem of computing an equilibrium of the resulting zero-sum defender-attacker game can be represented as a linear program with a combinatorial number of system configuration variables and constraints, and develop a constraint generation approach for solving this problem. Next, we present a novel highly scalable approach for approximately solving such games by representing the strategies of both players as neural networks. The key idea is to represent the defender's mixed strategy using a deep neural network generator, and then using alternating gradient-descent-ascent algorithm, analogous to the training of Generative Adversarial Networks. Our experiments, as well as a case study, demonstrate the efficacy of the proposed approach.

preprint2022arXiv

Manipulating Elections by Changing Voter Perceptions

The integrity of elections is central to democratic systems. However, a myriad of malicious actors aspire to influence election outcomes for financial or political benefit. A common means to such ends is by manipulating perceptions of the voting public about select candidates, for example, through misinformation. We present a formal model of the impact of perception manipulation on election outcomes in the framework of spatial voting theory, in which the preferences of voters over candidates are generated based on their relative distance in the space of issues. We show that controlling elections in this model is, in general, NP-hard, whether issues are binary or real-valued. However, we demonstrate that critical to intractability is the diversity of opinions on issues exhibited by the voting public. When voter views lack diversity, and we can instead group them into a small number of categories -- for example, as a result of political polarization -- the election control problem can be solved in polynomial time in the number of issues and candidates for arbitrary scoring rules.

preprint2022arXiv

Networked Restless Multi-Armed Bandits for Mobile Interventions

Motivated by a broad class of mobile intervention problems, we propose and study restless multi-armed bandits (RMABs) with network effects. In our model, arms are partially recharging and connected through a graph, so that pulling one arm also improves the state of neighboring arms, significantly extending the previously studied setting of fully recharging bandits with no network effects. In mobile interventions, network effects may arise due to regular population movements (such as commuting between home and work). We show that network effects in RMABs induce strong reward coupling that is not accounted for by existing solution methods. We propose a new solution approach for networked RMABs, exploiting concavity properties which arise under natural assumptions on the structure of intervention effects. We provide sufficient conditions for optimality of our approach in idealized settings and demonstrate that it empirically outperforms state-of-the art baselines in three mobile intervention domains using real-world graphs.

preprint2022arXiv

Proceedings of the Artificial Intelligence for Cyber Security (AICS) Workshop at AAAI 2022

The workshop will focus on the application of AI to problems in cyber security. Cyber systems generate large volumes of data, utilizing this effectively is beyond human capabilities. Additionally, adversaries continue to develop new attacks. Hence, AI methods are required to understand and protect the cyber domain. These challenges are widely studied in enterprise networks, but there are many gaps in research and practice as well as novel problems in other domains. In general, AI techniques are still not widely adopted in the real world. Reasons include: (1) a lack of certification of AI for security, (2) a lack of formal study of the implications of practical constraints (e.g., power, memory, storage) for AI systems in the cyber domain, (3) known vulnerabilities such as evasion, poisoning attacks, (4) lack of meaningful explanations for security analysts, and (5) lack of analyst trust in AI solutions. There is a need for the research community to develop novel solutions for these practical issues.

preprint2022arXiv

Removing Malicious Nodes from Networks

A fundamental challenge in networked systems is detection and removal of suspected malicious nodes. In reality, detection is always imperfect, and the decision about which potentially malicious nodes to remove must trade off false positives (erroneously removing benign nodes) and false negatives (mistakenly failing to remove malicious nodes). However, in network settings this conventional tradeoff must now account for node connectivity. In particular, malicious nodes may exert malicious influence, so that mistakenly leaving some of these in the network may cause damage to spread. On the other hand, removing benign nodes causes direct harm to these, and indirect harm to their benign neighbors who would wish to communicate with them. We formalize the problem of removing potentially malicious nodes from a network under uncertainty through an objective that takes connectivity into account. We show that optimally solving the resulting problem is NP-Hard. We then propose a tractable solution approach based on a convex relaxation of the objective. Finally, we experimentally demonstrate that our approach significantly outperforms both a simple baseline that ignores network structure, as well as a state-of-the-art approach for a related problem, on both synthetic and real-world datasets.

preprint2022arXiv

Reward Delay Attacks on Deep Reinforcement Learning

Most reinforcement learning algorithms implicitly assume strong synchrony. We present novel attacks targeting Q-learning that exploit a vulnerability entailed by this assumption by delaying the reward signal for a limited time period. We consider two types of attack goals: targeted attacks, which aim to cause a target policy to be learned, and untargeted attacks, which simply aim to induce a policy with a low reward. We evaluate the efficacy of the proposed attacks through a series of experiments. Our first observation is that reward-delay attacks are extremely effective when the goal is simply to minimize reward. Indeed, we find that even naive baseline reward-delay attacks are also highly successful in minimizing the reward. Targeted attacks, on the other hand, are more challenging, although we nevertheless demonstrate that the proposed approaches remain highly effective at achieving the attacker's targets. In addition, we introduce a second threat model that captures a minimal mitigation that ensures that rewards cannot be used out of sequence. We find that this mitigation remains insufficient to ensure robustness to attacks that delay, but preserve the order, of rewards.

preprint2022arXiv

Solving Structured Hierarchical Games Using Differential Backward Induction

From large-scale organizations to decentralized political systems, hierarchical strategic decision making is commonplace. We introduce a novel class of structured hierarchical games (SHGs) that formally capture such hierarchical strategic interactions. In an SHG, each player is a node in a tree, and strategic choices of players are sequenced from root to leaves, with root moving first, followed by its children, then followed by their children, and so on until the leaves. A player's utility in an SHG depends on its own decision, and on the choices of its parent and all the tree leaves. SHGs thus generalize simultaneous-move games, as well as Stackelberg games with many followers. We leverage the structure of both the sequence of player moves as well as payoff dependence to develop a gradient-based back propagation-style algorithm, which we call Differential Backward Induction (DBI), for approximating equilibria of SHGs. We provide a sufficient condition for convergence of DBI and demonstrate its efficacy in finding approximate equilibrium solutions to several SHG models of hierarchical policy-making problems.

preprint2021arXiv

Multi-Scale Games: Representing and Solving Games on Networks with Group Structure

Network games provide a natural machinery to compactly represent strategic interactions among agents whose payoffs exhibit sparsity in their dependence on the actions of others. Besides encoding interaction sparsity, however, real networks often exhibit a multi-scale structure, in which agents can be grouped into communities, those communities further grouped, and so on, and where interactions among such groups may also exhibit sparsity. We present a general model of multi-scale network games that encodes such multi-level structure. We then develop several algorithmic approaches that leverage this multi-scale structure, and derive sufficient conditions for convergence of these to a Nash equilibrium. Our numerical experiments demonstrate that the proposed approaches enable orders of magnitude improvements in scalability when computing Nash equilibria in such games. For example, we can solve previously intractable instances involving up to 1 million agents in under 15 minutes.

preprint2021arXiv

Optimizing Graph Structure for Targeted Diffusion

The problem of diffusion control on networks has been extensively studied, with applications ranging from marketing to controlling infectious disease. However, in many applications, such as cybersecurity, an attacker may want to attack a targeted subgraph of a network, while limiting the impact on the rest of the network in order to remain undetected. We present a model POTION in which the principal aim is to optimize graph structure to achieve such targeted attacks. We propose an algorithm POTION-ALG for solving the model at scale, using a gradient-based approach that leverages Rayleigh quotients and pseudospectrum theory. In addition, we present a condition for certifying that a targeted subgraph is immune to such attacks. Finally, we demonstrate the effectiveness of our approach through experiments on real and synthetic networks.

preprint2021arXiv

Strategic Evasion of Centrality Measures

Among the most fundamental tools for social network analysis are centrality measures, which quantify the importance of every node in the network. This centrality analysis typically disregards the possibility that the network may have been deliberately manipulated to mislead the analysis. To solve this problem, a recent study attempted to understand how a member of a social network could rewire the connections therein to avoid being identified as a leader of that network. However, the study was based on the assumption that the network analyzer - the seeker - is oblivious to any evasion attempts by the evader. In this paper, we relax this assumption by modelling the seeker and evader as strategic players in a Bayesian Stackelberg game. In this context, we study the complexity of various optimization problems, and analyze the equilibria of the game under different assumptions, thereby drawing the first conclusions in the literature regarding which centralities the seeker should use to maximize the chances of detecting a strategic evader.

preprint2020arXiv

Adversarial Deep Reinforcement Learning based Adaptive Moving Target Defense

Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary's uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when and what changes to make, taking into account both the characteristics of its system as well as the adversary's observed activities. Finding an optimal strategy for MTD presents a significant challenge, especially when facing a resourceful and determined adversary who may respond to the defender's actions. In this paper, we propose a multi-agent partially-observable Markov Decision Process model of MTD and formulate a two-player general-sum game between the adversary and the defender. Based on an established model of adaptive MTD, we propose a multi-agent reinforcement learning framework based on the double oracle algorithm to solve the game. In the experiments, we show the effectiveness of our framework in finding optimal policies.

preprint2020arXiv

Defending Against Physically Realizable Attacks on Image Classification

We study the problem of defending deep neural network approaches for image classification from physically realizable attacks. First, we demonstrate that the two most scalable and effective methods for learning robust models, adversarial training with PGD attacks and randomized smoothing, exhibit very limited effectiveness against three of the highest profile physical attacks. Next, we propose a new abstract adversarial model, rectangular occlusion attacks, in which an adversary places a small adversarially crafted rectangle in an image, and develop two approaches for efficiently computing the resulting adversarial examples. Finally, we demonstrate that adversarial training using our new attack yields image classification models that exhibit high robustness against the physically realizable attacks we study, offering the first effective generic defense against such attacks.

preprint2020arXiv

Election Control by Manipulating Issue Significance

Integrity of elections is vital to democratic systems, but it is frequently threatened by malicious actors. The study of algorithmic complexity of the problem of manipulating election outcomes by changing its structural features is known as election control. One means of election control that has been proposed is to select a subset of issues that determine voter preferences over candidates. We study a variation of this model in which voters have judgments about relative importance of issues, and a malicious actor can manipulate these judgments. We show that computing effective manipulations in this model is NP-hard even with two candidates or binary issues. However, we demonstrate that the problem is tractable with a constant number of voters or issues. Additionally, while it remains intractable when voters can vote stochastically, we exhibit an important special case in which stochastic voting enables tractable manipulation.

preprint2020arXiv

On Algorithmic Decision Procedures in Emergency Response Systems in Smart and Connected Communities

Emergency Response Management (ERM) is a critical problem faced by communities across the globe. Despite this, it is common for ERM systems to follow myopic decision policies in the real world. Principled approaches to aid ERM decision-making under uncertainty have been explored but have failed to be accepted into real systems. We identify a key issue impeding their adoption --- algorithmic approaches to emergency response focus on reactive, post-incident dispatching actions, i.e. optimally dispatching a responder \textit{after} incidents occur. However, the critical nature of emergency response dictates that when an incident occurs, first responders always dispatch the closest available responder to the incident. We argue that the crucial period of planning for ERM systems is not post-incident, but between incidents. This is not a trivial planning problem --- a major challenge with dynamically balancing the spatial distribution of responders is the complexity of the problem. An orthogonal problem in ERM systems is planning under limited communication, which is particularly important in disaster scenarios that affect communication networks. We address both problems by proposing two partially decentralized multi-agent planning algorithms that utilize heuristics and exploit the structure of the dispatch problem. We evaluate our proposed approach using real-world data, and find that in several contexts, dynamic re-balancing the spatial distribution of emergency responders reduces both the average response time as well as its variance.

preprint2020arXiv

Robust Collective Classification against Structural Attacks

Collective learning methods exploit relations among data points to enhance classification performance. However, such relations, represented as edges in the underlying graphical model, expose an extra attack surface to the adversaries. We study adversarial robustness of an important class of such graphical models, Associative Markov Networks (AMN), to structural attacks, where an attacker can modify the graph structure at test time. We formulate the task of learning a robust AMN classifier as a bi-level program, where the inner problem is a challenging non-linear integer program that computes optimal structural changes to the AMN. To address this technical challenge, we first relax the attacker problem, and then use duality to obtain a convex quadratic upper bound for the robust AMN problem. We then prove a bound on the quality of the resulting approximately optimal solutions, and experimentally demonstrate the efficacy of our approach. Finally, we apply our approach in a transductive learning setting, and show that robust AMN is much more robust than state-of-the-art deep learning methods, while sacrificing little in accuracy on non-adversarial data.

preprint2016arXiv

A General Retraining Framework for Scalable Adversarial Classification

Traditional classification algorithms assume that training and test data come from similar distributions. This assumption is violated in adversarial settings, where malicious actors modify instances to evade detection. A number of custom methods have been developed for both adversarial evasion attacks and robust learning. We propose the first systematic and general-purpose retraining framework which can: a) boost robustness of an \emph{arbitrary} learning algorithm, in the face of b) a broader class of adversarial models than any prior methods. We show that, under natural conditions, the retraining framework minimizes an upper bound on optimal adversarial risk, and show how to extend this result to account for approximations of evasion attacks. Extensive experimental evaluation demonstrates that our retraining methods are nearly indistinguishable from state-of-the-art algorithms for optimizing adversarial risk, but are more general and far more scalable. The experiments also confirm that without retraining, our adversarial framework dramatically reduces the effectiveness of learning. In contrast, retraining significantly boosts robustness to evasion attacks without significantly compromising overall accuracy.

preprint2016arXiv

Data Poisoning Attacks on Factorization-Based Collaborative Filtering

Recommendation and collaborative filtering systems are important in modern information and e-commerce applications. As these systems are becoming increasingly popular in the industry, their outputs could affect business decision making, introducing incentives for an adversarial party to compromise the availability or integrity of such systems. We introduce a data poisoning attack on collaborative filtering systems. We demonstrate how a powerful attacker with full knowledge of the learner can generate malicious data so as to maximize his/her malicious objectives, while at the same time mimicking normal user behavior to avoid being detected. While the complete knowledge assumption seems extreme, it enables a robust assessment of the vulnerability of collaborative filtering schemes to highly motivated attacks. We present efficient solutions for two popular factorization-based collaborative filtering algorithms: the \emph{alternative minimization} formulation and the \emph{nuclear norm minimization} method. Finally, we test the effectiveness of our proposed algorithms on real-world data and discuss potential defensive strategies.

preprint2016arXiv

Predicting Human Cooperation

The Prisoner's Dilemma has been a subject of extensive research due to its importance in understanding the ever-present tension between individual self-interest and social benefit. A strictly dominant strategy in a Prisoner's Dilemma (defection), when played by both players, is mutually harmful. Repetition of the Prisoner's Dilemma can give rise to cooperation as an equilibrium, but defection is as well, and this ambiguity is difficult to resolve. The numerous behavioral experiments investigating the Prisoner's Dilemma highlight that players often cooperate, but the level of cooperation varies significantly with the specifics of the experimental predicament. We present the first computational model of human behavior in repeated Prisoner's Dilemma games that unifies the diversity of experimental observations in a systematic and quantitatively reliable manner. Our model relies on data we integrated from many experiments, comprising 168,386 individual decisions. The computational model is composed of two pieces: the first predicts the first-period action using solely the structural game parameters, while the second predicts dynamic actions using both game parameters and history of play. Our model is extremely successful not merely at fitting the data, but in predicting behavior at multiple scales in experimental designs not used for calibration, using only information about the game structure. We demonstrate the power of our approach through a simulation analysis revealing how to best promote human cooperation.

preprint2016arXiv

Robust High-Dimensional Linear Regression

The effectiveness of supervised learning techniques has made them ubiquitous in research and practice. In high-dimensional settings, supervised learning commonly relies on dimensionality reduction to improve performance and identify the most important factors in predicting outcomes. However, the economic importance of learning has made it a natural target for adversarial manipulation of training data, which we term poisoning attacks. Prior approaches to dealing with robust supervised learning rely on strong assumptions about the nature of the feature matrix, such as feature independence and sub-Gaussian noise with low variance. We propose an integrated method for robust regression that relaxes these assumptions, assuming only that the feature matrix can be well approximated by a low-rank matrix. Our techniques integrate improved robust low-rank matrix approximation and robust principle component regression, and yield strong performance guarantees. Moreover, we experimentally show that our methods significantly outperform state of the art both in running time and prediction error.

preprint2016arXiv

Scheduling Resource-Bounded Monitoring Devices for Event Detection and Isolation in Networks

In networked systems, monitoring devices such as sensors are typically deployed to monitor various target locations. Targets are the points in the physical space at which events of some interest, such as random faults or attacks, can occur. Most often, these devices have limited energy supplies, and they can operate for a limited duration. As a result, energy-efficient monitoring of various target locations through a set of monitoring devices with limited energy supplies is a crucial problem in networked systems. In this paper, we study optimal scheduling of monitoring devices to maximize network coverage for detecting and isolating events on targets for a given network lifetime. The monitoring devices considered could remain active only for a fraction of the overall network lifetime. We formulate the problem of scheduling of monitoring devices as a graph labeling problem, which unlike other existing solutions, allows us to directly utilize the underlying network structure to explore the trade-off between coverage and network lifetime. In this direction, first we propose a greedy heuristic to solve the graph labeling problem, and then provide a game-theoretic solution to achieve near optimal graph labeling. Moreover, the proposed setup can be used to simultaneously solve the scheduling and placement of monitoring devices, which yields improved performance as compared to separately solving the placement and scheduling problems. Finally, we illustrate our results on various networks, including real-world water distribution networks.

preprint2015arXiv

Mechanism Design for Team Formation

Team formation is a core problem in AI. Remarkably, little prior work has addressed the problem of mechanism design for team formation, accounting for the need to elicit agents' preferences over potential teammates. Coalition formation in the related hedonic games has received much attention, but only from the perspective of coalition stability, with little emphasis on the mechanism design objectives of true preference elicitation, social welfare, and equity. We present the first formal mechanism design framework for team formation, building on recent combinatorial matching market design literature. We exhibit four mechanisms for this problem, two novel, two simple extensions of known mechanisms from other domains. Two of these (one new, one known) have desirable theoretical properties. However, we use extensive experiments to show our second novel mechanism, despite having no theoretical guarantees, empirically achieves good incentive compatibility, welfare, and fairness.

preprint2015arXiv

Multidefender Security Games

Stackelberg security game models and associated computational tools have seen deployment in a number of high-consequence security settings, such as LAX canine patrols and Federal Air Marshal Service. These models focus on isolated systems with only one defender, despite being part of a more complex system with multiple players. Furthermore, many real systems such as transportation networks and the power grid exhibit interdependencies between targets and, consequently, between decision makers jointly charged with protecting them. To understand such multidefender strategic interactions present in security, we investigate game theoretic models of security games with multiple defenders. Unlike most prior analysis, we focus on the situations in which each defender must protect multiple targets, so that even a single defender's best response decision is, in general, highly non-trivial. We start with an analytical investigation of multidefender security games with independent targets, offering an equilibrium and price-of-anarchy analysis of three models with increasing generality. In all models, we find that defenders have the incentive to over-protect targets, at times significantly. Additionally, in the simpler models, we find that the price of anarchy is unbounded, linearly increasing both in the number of defenders and the number of targets per defender. Considering interdependencies among targets, we develop a novel mixed-integer linear programming formulation to compute a defender's best response, and make use of this formulation in approximating Nash equilibria of the game. We apply this approach towards computational strategic analysis of several models of networks representing interdependencies, including real-world power networks. Our analysis shows how network structure and the probability of failure spread determine the propensity of defenders to over- or under-invest in security.

preprint2014arXiv

Characterizing short-term stability for Boolean networks over any distribution of transfer functions

We present a characterization of short-term stability of random Boolean networks under \emph{arbitrary} distributions of transfer functions. Given any distribution of transfer functions for a random Boolean network, we present a formula that decides whether short-term chaos (damage spreading) will happen. We provide a formal proof for this formula, and empirically show that its predictions are accurate. Previous work only works for special cases of balanced families. It has been observed that these characterizations fail for unbalanced families, yet such families are widespread in real biological networks.

preprint2012arXiv

Computing Optimal Security Strategies for Interdependent Assets

We introduce a novel framework for computing optimal randomized security policies in networked domains which extends previous approaches in several ways. First, we extend previous linear programming techniques for Stackelberg security games to incorporate benefits and costs of arbitrary security configurations on individual assets. Second, we offer a principled model of failure cascades that allows us to capture both the direct and indirect value of assets, and extend this model to capture uncertainty about the structure of the interdependency network. Third, we extend the linear programming formulation to account for exogenous (random) failures in addition to targeted attacks. The goal of our work is two-fold. First, we aim to develop techniques for computing optimal security strategies in realistic settings involving interdependent security. To this end, we evaluate the value of our technical contributions in comparison with previous approaches, and show that our approach yields much better defense policies and scales to realistic graphs. Second, our computational framework enables us to attain theoretical insights about security on networks. As an example, we study how allowing security to be endogenous impacts the relative resilience of different network topologies.

preprint2012arXiv

Constrained Automated Mechanism Design for Infinite Games of Incomplete Information

We present a functional framework for automated mechanism design based on a two-stage game model of strategic interaction between the designer and the mechanism participants, and apply it to several classes of two-player infinite games of incomplete information. At the core of our framework is a black-box optimization algorithm which guides the selection process of candidate mechanisms. Our approach yields optimal or nearly optimal mechanisms in several application domains using various objective functions. By comparing our results with known optimal mechanisms, and in some cases improving on the best known mechanisms, we provide evidence that ours is a promising approach to parametric design of indirect mechanisms.

preprint2012arXiv

Simulation-Based Game Theoretic Analysis of Keyword Auctions with Low-Dimensional Bidding Strategies

We perform a simulation-based analysis of keyword auctions modeled as one-shot games of incomplete information to study a series of mechanism design questions. Our first question addresses the degree to which incentive compatibility fails in generalized second-price (GSP) auctions. Our results suggest that sincere bidding in GSP auctions is a strikingly poor strategy and a poor predictor of equilibrium outcomes. We next show that the rank-by-revenue mechanism is welfare optimal, corroborating past results. Finally, we analyze profit as a function of auction mechanism under a series of alternative settings. Our conclusions coincide with those of Lahaie and Pennock [2007] when values and quality scores are strongly positively correlated: in such a case, rank-by-bid rules are clearly superior. We diverge, however, in showing that auctions that put little weight on quality scores almost universally dominate the pure rank-by-revenue scheme.

preprint2011arXiv

Influence and Dynamic Behavior in Random Boolean Networks

We present a rigorous mathematical framework for analyzing dynamics of a broad class of Boolean network models. We use this framework to provide the first formal proof of many of the standard critical transition results in Boolean network analysis, and offer analogous characterizations for novel classes of random Boolean networks. We precisely connect the short-run dynamic behavior of a Boolean network to the average influence of the transfer functions. We show that some of the assumptions traditionally made in the more common mean-field analysis of Boolean networks do not hold in general. For example, we offer some evidence that imbalance, or expected internal inhomogeneity, of transfer functions is a crucial feature that tends to drive quiescent behavior far more strongly than previously observed.

preprint2011arXiv

Noncooperatively Optimized Tolerance: Decentralized Strategic Optimization in Complex Systems

We introduce noncooperatively optimized tolerance (NOT), a generalization of highly optimized tolerance (HOT) that involves strategic (game theoretic) interactions between parties in a complex system. We illustrate our model in the forest fire (percolation) framework. As the number of players increases, our model retains features of HOT, such as robustness, high yield combined with high density, and self-dissimilar landscapes, but also develops features of self-organized criticality (SOC) when the number of players is large enough. For example, the forest landscape becomes increasingly homogeneous and protection from adverse events (lightning strikes) becomes less closely correlated with the spatial distribution of these events. While HOT is a special case of our model, the resemblance to SOC is only partial; for example, the distribution of cascades, while becoming increasingly heavy-tailed as the number of players increases, also deviates more significantly from a power law in this regime. Surprisingly, the system retains considerable robustness even as it becomes fractured, due in part to emergent cooperation between neighboring players. At the same time, increasing homogeneity promotes resilience against changes in the lightning distribution, giving rise to intermediate regimes where the system is robust to a particular distribution of adverse events, yet not very fragile to changes.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint

Fields this researcher appears in

Source provenance

Where this author record came from

arxivconfidence 95%

external id: arxiv:2605.05009:author:4:yevgeniy-vorobeychik

Imported May 20, 2026Synced May 21, 2026

arxivconfidence 95%

external id: arxiv:2605.11165:author:5:yevgeniy-vorobeychik

Imported May 20, 2026Synced May 21, 2026

arxivconfidence 95%

external id: arxiv:2604.27487:author:3:yevgeniy-vorobeychik

Imported May 20, 2026Synced May 20, 2026

arxivconfidence 95%

external id: arxiv:2605.15519:author:5:yevgeniy-vorobeychik

Imported May 20, 2026Synced May 20, 2026

4 works

Bo Li

Researcher

Bo Li contributes to research discovery and scholarly infrastructure.

Open to collaborate

4 works

Junlin Wu

Researcher

Junlin Wu contributes to research discovery and scholarly infrastructure.

Open to collaborate

4 works

Milind Tambe

Researcher

Milind Tambe contributes to research discovery and scholarly infrastructure.

Open to collaborate

3 works

Kai Zhou

Researcher

Kai Zhou contributes to research discovery and scholarly infrastructure.

Open to collaborate

Yevgeniy Vorobeychik

What is connected

Connect this record

See the researcher in context

Building this map preview

40 published item(s)

COSMOS: Model-Agnostic Personalized Federated Learning with Clustered Server Models and Pseudo-Label-Only Communication

DiffVAS: Diffusion-Guided Visual Active Search in Partially Observable Environments

Learned Neighbor Trust for Collaborative Deployment in Model-Agnostic Decentralized Learning

Low Rank Adaptation for Adversarial Perturbation

Residual-PAC Privacy: Automatic Privacy Control Beyond the Gaussian Barrier

Robust Deep Reinforcement Learning through Bootstrapped Opportunistic Curriculum

A Game-Theoretic Approach for Hierarchical Epidemic Control

A Rotating Proposer Mechanism for Team Formation

Adversarial Robustness of Deep Sensor Fusion Models

Certifying Safety in Reinforcement Learning under Adversarial Perturbation Attacks

Computing Equilibria in Binary Networked Public Goods Games

CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing

Learning Generative Deception Strategies in Combinatorial Masking Games

Manipulating Elections by Changing Voter Perceptions

Networked Restless Multi-Armed Bandits for Mobile Interventions

Proceedings of the Artificial Intelligence for Cyber Security (AICS) Workshop at AAAI 2022

Removing Malicious Nodes from Networks

Reward Delay Attacks on Deep Reinforcement Learning

Solving Structured Hierarchical Games Using Differential Backward Induction

Multi-Scale Games: Representing and Solving Games on Networks with Group Structure

Optimizing Graph Structure for Targeted Diffusion

Strategic Evasion of Centrality Measures

Adversarial Deep Reinforcement Learning based Adaptive Moving Target Defense

Defending Against Physically Realizable Attacks on Image Classification

Election Control by Manipulating Issue Significance

On Algorithmic Decision Procedures in Emergency Response Systems in Smart and Connected Communities

Robust Collective Classification against Structural Attacks

A General Retraining Framework for Scalable Adversarial Classification

Data Poisoning Attacks on Factorization-Based Collaborative Filtering

Predicting Human Cooperation

Robust High-Dimensional Linear Regression

Scheduling Resource-Bounded Monitoring Devices for Event Detection and Isolation in Networks

Mechanism Design for Team Formation

Multidefender Security Games

Characterizing short-term stability for Boolean networks over any distribution of transfer functions

Computing Optimal Security Strategies for Interdependent Assets

Constrained Automated Mechanism Design for Infinite Games of Incomplete Information

Simulation-Based Game Theoretic Analysis of Keyword Auctions with Low-Dimensional Bidding Strategies

Influence and Dynamic Behavior in Random Boolean Networks

Noncooperatively Optimized Tolerance: Decentralized Strategic Optimization in Complex Systems