Source author record

Quanyan Zhu

Quanyan Zhu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

59 works

27 topics

4 close collaborators

preprint, 2025, arXiv

Decentralized No-Regret Frequency-Time Scheduling for FMCW Radar Interference Avoidance

Automotive FMCW radars are indispensable to modern ADAS and autonomous-driving systems, but their increasing density has intensified the risk of mutual interference. Existing mitigation techniques, including reactive receiver-side suppression, proactive waveform design, and cooperative scheduling, often face limitations in scalability, reliance on side-channel communication, or degradation of range-Doppler resolution. Building on our earlier work on decentralized Frequency-Domain No-Regret hopping, this paper introduces a unified time-frequency game-theoretic framework that enables radars to adapt across both spectral and temporal resources. We formulate the interference-avoidance problem as a repeated anti-coordination game, in which each radar autonomously updates a mixed strategy over frequency subbands and chirp-level time offsets using regret-minimization dynamics. We show that the proposed Time-Frequency No-Regret Hopping algorithm achieves vanishing external and swap regret, and that the induced empirical play converges to an $\varepsilon$-coarse correlated equilibrium or a correlated equilibrium. Theoretical analysis provides regret bounds in the joint domain, revealing how temporal adaptation implicitly regularizes frequency selection and enhances robustness against asynchronous interference. Numerical experiments with multi-radar scenarios demonstrate substantial improvements in SINR, collision rate, and range-Doppler quality compared with time-frequency random hopping and centralized Nash-based benchmarks.
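As a rough illustration of the regret-minimization dynamics described in this abstract, the sketch below lets each radar run Hedge (multiplicative weights), a standard no-external-regret rule, over frequency subbands only. It is not the paper's Time-Frequency No-Regret Hopping algorithm: chirp-level time offsets are omitted, full-information feedback is assumed (each radar learns which subbands the others occupied), and the names `simulate_hedge_hopping`, `step_size` and the 0/1 collision loss are hypothetical choices made for this example.

```python
import math
import random


def simulate_hedge_hopping(num_radars=3, num_subbands=4, rounds=2000,
                           step_size=0.1, seed=0):
    """Simulate radars that each run Hedge over frequency subbands.

    Returns (collision_rates, strategies): the fraction of radars that
    collided in each round, and the mixed strategies used in the last round.
    """
    rng = random.Random(seed)
    # cum_loss[i][k]: cumulative loss radar i would have paid on subband k.
    cum_loss = [[0.0] * num_subbands for _ in range(num_radars)]
    collision_rates = []
    strategies = []
    for _ in range(rounds):
        strategies = []
        choices = []
        for i in range(num_radars):
            # Subtract the minimum loss so the exponentials stay well scaled.
            base = min(cum_loss[i])
            weights = [math.exp(-step_size * (loss - base))
                       for loss in cum_loss[i]]
            total = sum(weights)
            probs = [w / total for w in weights]
            strategies.append(probs)
            choices.append(rng.choices(range(num_subbands), weights=probs)[0])
        collided = 0
        for i in range(num_radars):
            # Assumed full-information feedback: subbands used by the others.
            occupied = {choices[j] for j in range(num_radars) if j != i}
            for k in occupied:
                cum_loss[i][k] += 1.0  # loss 1 for any occupied subband
            if choices[i] in occupied:
                collided += 1
        collision_rates.append(collided / num_radars)
    return collision_rates, strategies


if __name__ == "__main__":
    rates, final_strategies = simulate_hedge_hopping()
    print("mean collision rate, last 100 rounds:", sum(rates[-100:]) / 100)
    for i, probs in enumerate(final_strategies):
        print("radar", i, [round(p, 3) for p in probs])
```

Hedge controls only external regret, so the empirical play of this sketch relates to coarse correlated equilibria; swap-regret guarantees of the kind the abstract mentions would need a different update rule.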

preprint, 2024, arXiv

Integrated Cyber-Physical Resiliency for Power Grids under IoT-Enabled Dynamic Botnet Attacks

The wide adoption of Internet of Things (IoT)-enabled energy devices improves the quality of life, but simultaneously, it enlarges the attack surface of the power grid system. The adversary can gain illegitimate control of a large number of these devices and use them as a means to compromise the physical grid operation, a mechanism known as the IoT botnet attack. This paper aims to improve the resiliency of cyber-physical power grids to such attacks. Specifically, we use an epidemic model to understand the dynamic botnet formation, which facilitates the assessment of the cyber layer vulnerability of the grid. The attacker aims to exploit this vulnerability to enable a successful physical compromise, while the system operator's goal is to ensure a normal operation of the grid by mitigating cyber risks. We develop a cross-layer game-theoretic framework for strategic decision-making to enhance cyber-physical grid resiliency. The cyber-layer game guides the system operator on how to defend against the botnet attacker as the first layer of defense, while the dynamic game strategy at the physical layer further counteracts the adversarial behavior in real time for improved physical resilience. A number of case studies on the IEEE-39 bus system are used to corroborate the devised approach.

preprint, 2023, arXiv

A Rolling Horizon Game Considering Network Effect in Cluster Forming for Dynamic Resilient Multiagent Systems

A two-player game-theoretic problem on resilient graphs in a multiagent consensus setting is formulated. An attacker is capable of disabling some of the edges of the network by emitting jamming signals, with the objective of dividing the agents into clusters, while, in response, the defender recovers some of the edges by increasing the transmission power for the communication signals. Specifically, we consider repeated games between the attacker and the defender where the optimal strategies for the two players are derived in a rolling horizon fashion based on utility functions that take both the agents' states and the sizes of clusters (known as network effect) into account. The players' actions at each discrete-time step are constrained by their energy for transmissions of the signals, with a less strict constraint for the attacker. Necessary conditions and sufficient conditions for agent consensus are derived, which are influenced by the energy constraints. The number of clusters of agents at infinite time in the face of attacks and recoveries is also characterized. Simulation results are provided to demonstrate the effects of players' actions on the cluster forming and to illustrate the players' performance for different horizon parameters.

preprint, 2023, arXiv

An Introduction of System-Scientific Approaches to Cognitive Security

Human cognitive capacities and the needs of human-centric solutions for "Industry 5.0" make humans an indispensable component in Cyber-Physical Systems (CPSs), referred to as Human-Cyber-Physical Systems (HCPSs), where AI-powered technologies are incorporated to assist and augment humans. The close integration between humans and technologies in Section 1.1 and cognitive attacks in Section 1.2.4 poses emerging security challenges, where attacks can exploit vulnerabilities of human cognitive processes, affect their behaviors, and ultimately damage the HCPS. Defending HCPSs against cognitive attacks requires a new security paradigm, which we refer to as "cognitive security" in Section 1.2.5. The vulnerabilities of human cognitive systems and the associated methods of exploitation distinguish cognitive security from "cognitive reliability" and give rise to a distinctive CIA triad, as shown in Sections 1.2.5.1 and 1.2.5.2, respectively. Section 1.2.5.3 introduces cognitive and technical defense methods that deter the kill chain of cognitive attacks and achieve cognitive security. System scientific perspectives in Section 1.3 offer a promising direction to address the new challenges of cognitive security by developing quantitative, modular, multi-scale, and transferable solutions.

preprint, 2023, arXiv

QoS Based Contract Design for Profit Maximization in IoT-Enabled Data Markets

The massive deployment of Internet of Things (IoT) devices, including sensors and actuators, is ushering in smart and connected communities of the future. The availability of real-time and high-quality sensor data is crucial for various IoT applications, particularly in healthcare, energy, transportation, etc. However, data collection may have to be outsourced to external service providers (SPs) due to cost considerations or lack of specialized equipment. Hence, the data market plays a critical role in such scenarios where SPs have different quality levels of available data, and IoT users have different application-specific data needs. The pairing between data available to the SP and users in the data market requires an effective mechanism design that considers the SPs' profitability and the quality-of-service (QoS) needs of the users. We develop a generic framework to analyze and enable such interactions efficiently, leveraging tools from contract theory and mechanism design theory. It can enable and empower emerging data sharing paradigms such as Sensing-as-a-Service (SaaS). The contract design creates a pricing structure for on-demand sensing data for IoT users. By considering a continuum of user types, we capture a diverse range of application requirements and propose optimal pricing and allocation rules that ensure QoS provisioning and maximum profitability for the SP. Furthermore, we provide analytical solutions for fixed distributions of user types to analyze the developed approach. For comparison, we consider the benchmark case assuming complete information of the user types and obtain optimal contract solutions. Finally, a case study is presented to demonstrate the efficacy of the proposed contract design framework.

preprint, 2022, arXiv

A Pursuit-Evasion Differential Game with Strategic Information Acquisition

This paper studies a two-person linear-quadratic-Gaussian pursuit-evasion differential game with costly but controlled information. One player can decide when to observe the other player's state. However, each observation of the other player's state comes with two costs: the direct cost of observing and the implicit cost of exposing his own state. We call games of this type Pursuit-Evasion-Exposure-Concealment (PEEC) games. The PEEC game involves two types of strategies: control strategies and observation strategies. We fully characterize the Nash control strategies of the PEEC game using techniques such as completing squares and the calculus of variations. We show that the derivation of the Nash observation strategies and the Nash control strategies can be decoupled. We develop a set of necessary conditions that facilitate the numerical computation of the Nash observation strategies. We show, in theory, that players with less maneuverability prefer concealment to exposure. We also show that when the game's horizon goes to infinity, the Nash observation strategy is to observe periodically, and the expected distance between the pursuer and the evader goes to zero with a bounded second moment. We conduct a series of numerical experiments to study the proposed PEEC game and illustrate the numerical results using both figures and animation. Numerical results show that the pursuer can maintain high-grade performance even when the number of observations is limited. We also show that an evader with low maneuverability can still escape if the evader increases his stealthiness.

preprint, 2022, arXiv

Accountability and Insurance in IoT Supply Chain

Supply chain security has become a growing concern in security risk analysis of the Internet of Things (IoT) systems. Their highly connected structures have significantly enlarged the attack surface, making it difficult to track the source of the risk posed by malicious or compromised suppliers. This chapter presents a system-scientific framework to study the accountability in IoT supply chains and provides a holistic risk analysis technologically and socio-economically. We develop stylized models and quantitative approaches to evaluate the accountability of the suppliers. Two case studies are used to illustrate accountability measures for scenarios with single and multiple agents. Finally, we present the contract design and cyber insurance as economic solutions to mitigate supply chain risks. They are incentive-compatible mechanisms that encourage truth-telling of the supplier and facilitate reliable accountability investigation for the buyer.

preprint, 2022, arXiv

ADVERT: An Adaptive and Data-Driven Attention Enhancement Mechanism for Phishing Prevention

Attacks exploiting the innate and the acquired vulnerabilities of human users have posed severe threats to cybersecurity. This work proposes ADVERT, a human-technical solution that generates adaptive visual aids in real-time to prevent users from inadvertence and reduce their susceptibility to phishing attacks. Based on the eye-tracking data, we extract visual states and attention states as system-level sufficient statistics to characterize the user's visual behaviors and attention status. By adopting a data-driven approach and two learning feedback of different time scales, this work lays out a theoretical foundation to analyze, evaluate, and particularly modify humans' attention processes while they vet and recognize phishing emails. We corroborate the effectiveness, efficiency, and robustness of ADVERT through a case study based on the data set collected from human subject experiments conducted at New York University. The results show that the visual aids can statistically increase the attention level and improve the accuracy of phishing recognition from 74.6% to a minimum of 86%. The meta-adaptation can further improve the accuracy to 91.5% (resp. 93.7%) in less than 3 (resp. 50) tuning stages.

preprint, 2022, arXiv

Autonomous and Resilient Control for Optimal LEO Satellite Constellation Coverage Against Space Threats

LEO satellite constellation coverage has served as the base platform for various space applications. However, the rapidly evolving security environment, including orbital debris and adversarial space threats, greatly endangers the security of satellite constellations and the integrity of their coverage. As on-orbit repairs are challenging, a distributed and autonomous protection mechanism is necessary to ensure the adaptation and self-healing of the satellite constellation coverage from different attacks. To this end, we establish an integrative and distributed framework to enable resilient satellite constellation coverage planning and control in a single orbit. Each satellite can make decisions individually to recover from adversarial and non-adversarial attacks and keep providing coverage service. We first provide models and methodologies to measure the coverage performance. Then, we formulate the joint resilient coverage planning-control problem as a two-stage problem. A coverage game is proposed to find the equilibrium constellation deployment for resilient coverage planning and an agent-based algorithm is developed to compute the equilibrium. The multi-waypoint Model Predictive Control (MPC) methodology is adopted to achieve autonomous self-healing control. Finally, we use a typical LEO satellite constellation as a case study to corroborate the results.

preprint, 2022, arXiv

Bayesian Promised Persuasion: Dynamic Forward-Looking Multiagent Delegation with Informational Burning

This work studies a dynamic mechanism design problem in which a principal delegates decision-making to a group of privately informed agents without monetary transfer or burning. We consider that the principal privately possesses complete knowledge about the state transitions and study how she can use her private observation to support the incentive compatibility of the delegation via informational burning, a process we refer to as the looking-forward persuasion. We formulate a delegation mechanism in which the agents form belief hierarchies due to the persuasion and play a dynamic Bayesian game. We propose a novel randomized mechanism, known as Bayesian promised delegation (BPD), in which the periodic incentive compatibility is guaranteed by persuasions and promises of future delegations. We show that the BPD can achieve the same optimal social welfare as the original mechanism in stationary Markov perfect Bayesian equilibria. A revelation-principle-like design regime is established to show that the persuasion with belief hierarchies can be fully characterized by correlating the randomization of the agents' local BPD mechanisms with the persuasion as a direct recommendation of the future promises.

preprint, 2022, arXiv

Controlling Fake News by Tagging: A Branching Process Analysis

The spread of fake news on online social networks (OSNs) has become a matter of concern. These platforms are also used for propagating important authentic information. Thus, there is a need for mitigating fake news without significantly influencing the spread of real news. We leverage users' inherent capabilities of identifying fake news and propose a warning-based control mechanism to curb this spread. Warnings are based on previous users' responses that indicate the authenticity of the news. We use population-size dependent continuous-time multi-type branching processes to describe the spreading under the warning mechanism. We also derive new results for these branching processes. The (time) asymptotic proportions of the individual populations are derived using stochastic approximation tools. Using these, the relevant type-1 and type-2 performance measures are derived and an appropriate optimization problem is solved. The proposed mechanism effectively controls fake news, with negligible influence on the propagation of authentic news. We validate performance measures using Monte Carlo simulations on network connections provided by Twitter data.

preprint, 2022, arXiv

Multi-Agent Learning for Resilient Distributed Control Systems

Resilience describes a system's ability to function under disturbances and threats. Many critical infrastructures, including smart grids and transportation networks, are large-scale complex systems consisting of many interdependent subsystems. Decentralized architecture becomes a key resilience design paradigm for large-scale systems. In this book chapter, we present a multi-agent system (MAS) framework for distributed large-scale control systems and discuss the role of MAS learning in resiliency. This chapter introduces the creation of an artificial intelligence (AI) stack in the MAS to provide computational intelligence for subsystems to detect, respond, and recover. We discuss the application of learning methods at the cyber and physical layers of the system. The discussions focus on distributed learning algorithms for subsystems to respond to each other, and game-theoretic learning for them to respond to disturbances and adversarial behaviors. The book chapter presents a case study of distributed renewable energy systems to elaborate on the MAS architecture and its interface with the AI stack.

preprint, 2022, arXiv

On Poisoned Wardrop Equilibrium in Congestion Games

Recent years have witnessed a growing number of attack vectors against increasingly interconnected traffic networks. Informational attacks have emerged as the prominent ones that aim to poison traffic data, misguide users, and manipulate traffic patterns. To study the impact of this class of attacks, we propose a game-theoretic framework where the attacker, as a Stackelberg leader, falsifies the traffic conditions to change the traffic pattern predicted by the Wardrop traffic equilibrium, achieved by the users, or the followers. The intended shift of the Wardrop equilibrium is a consequence of strategic informational poisoning. Leveraging game-theoretic and sensitivity analysis, we quantify the system-level impact of the attack by characterizing the concept of poisoned Price of Anarchy, which compares the poisoned Wardrop equilibrium and its non-poisoned system optimal counterpart. We use an evacuation case study to show that the Stackelberg equilibrium can be found through a two-time scale zeroth-order learning process and demonstrate the disruptive effects of informational poisoning, indicating a compelling need for defense policies to mitigate such security threats.
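For readers unfamiliar with the baseline quantities in this abstract, the sketch below computes the classical (non-poisoned) Price of Anarchy on Pigou's textbook two-link network, where one link has constant latency 1 and the other has latency equal to its flow share. This is a standard example chosen for illustration, not the paper's model; the function names and the grid search over flow splits are assumptions of this sketch, and no attacker is modeled.

```python
def pigou_total_cost(x):
    """Total latency when a fraction x of unit demand uses the variable link.

    Link 1 has constant latency 1; link 2 has latency equal to its flow x.
    """
    return (1.0 - x) * 1.0 + x * x


def pigou_price_of_anarchy(steps=10000):
    """Return (equilibrium_cost, optimal_cost, price_of_anarchy)."""
    # Wardrop equilibrium: link 2's latency x never exceeds link 1's
    # latency 1, so no user gains by leaving link 2 and all flow uses it.
    x_eq = 1.0
    # System optimum: grid search over the split (analytically x = 1/2).
    x_opt = min((i / steps for i in range(steps + 1)), key=pigou_total_cost)
    eq_cost = pigou_total_cost(x_eq)
    opt_cost = pigou_total_cost(x_opt)
    return eq_cost, opt_cost, eq_cost / opt_cost


if __name__ == "__main__":
    eq_cost, opt_cost, poa = pigou_price_of_anarchy()
    print(eq_cost, opt_cost, poa)
```

The poisoned Price of Anarchy in the abstract replaces the equilibrium cost with the cost at the equilibrium induced by falsified traffic information, while keeping the non-poisoned system optimum as the reference.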

preprint, 2022, arXiv

RADAMS: Resilient and Adaptive Alert and Attention Management Strategy against Informational Denial-of-Service (IDoS) Attacks

Attacks exploiting human attentional vulnerability have posed severe threats to cybersecurity. In this work, we identify and formally define a new type of proactive attentional attacks called Informational Denial-of-Service (IDoS) attacks that generate a large volume of feint attacks to overload human operators and hide real attacks among feints. We incorporate human factors (e.g., levels of expertise, stress, and efficiency) and empirical psychological results (e.g., the Yerkes-Dodson law and the sunk cost fallacy) to model the operators' attention dynamics and their decision-making processes along with the real-time alert monitoring and inspection. To assist human operators in dismissing the feints and escalating the real attacks timely and accurately, we develop a Resilient and Adaptive Data-driven alert and Attention Management Strategy (RADAMS) that de-emphasizes alerts selectively based on the abstracted category labels of the alerts. RADAMS uses reinforcement learning to achieve a customized and transferable design for various human operators and evolving IDoS attacks. The integrated modeling and theoretical analysis lead to the Product Principle of Attention (PPoA), fundamental limits, and the tradeoff among crucial human and economic factors. Experimental results corroborate that the proposed strategy outperforms the default strategy and can reduce the IDoS risk by as much as 20%. Besides, the strategy is resilient to large variations of costs, attack frequencies, and human attention capacities. We have recognized interesting phenomena such as attentional risk equivalency, attacker's dilemma, and the half-truth optimal attack strategy.

preprint, 2022, arXiv

Reinforcement Learning for Linear Quadratic Control is Vulnerable Under Cost Manipulation

In this work, we study the deception of a Linear-Quadratic-Gaussian (LQG) agent by manipulating the cost signals. We show that a small falsification of the cost parameters will only lead to a bounded change in the optimal policy. The bound is linear in the amount of falsification the attacker can apply to the cost parameters. We propose an attack model where the attacker aims to mislead the agent into learning a 'nefarious' policy by intentionally falsifying the cost parameters. We formulate the attacker's problem as a convex optimization problem and develop necessary and sufficient conditions to check the achievability of the attacker's goal. We showcase the adversarial manipulation on two types of LQG learners: the batch RL learner and the adaptive dynamic programming (ADP) learner. Our results demonstrate that with only 2.296% of falsification on the cost data, the attacker misleads the batch RL learner into learning the 'nefarious' policy that leads the vehicle to a dangerous position. The attacker can also gradually trick the ADP learner into learning the same 'nefarious' policy by consistently feeding the learner a falsified cost signal that stays close to the actual cost signal. The paper aims to raise people's awareness of the security threats faced by RL-enabled control systems.

preprint, 2022, arXiv

The Inverse Problem of Linear-Quadratic Differential Games: When is a Control Strategies Profile Nash?

This paper aims to formulate and study the inverse problem of non-cooperative linear quadratic games: Given a profile of control strategies, find cost parameters for which this profile of control strategies is Nash. We formulate the problem as a leader-followers problem, where a leader aims to implant a desired profile of control strategies among selfish players. In this paper, we leverage frequency-domain techniques to develop a necessary and sufficient condition on the existence of cost parameters for a given profile of stabilizing control strategies to be Nash under a given linear system. The necessary and sufficient condition includes the circle criterion for each player and a rank condition related to the transfer function of each player. The condition provides an analytical method to check the existence of such cost parameters, while previous studies need to solve a convex feasibility problem numerically to answer the same question. We develop an identity in frequency-domain representation to characterize the cost parameters, which we refer to as the Kalman equation. The Kalman equation reduces redundancy in the time-domain analysis that involves solving a convex feasibility problem. Using the Kalman equation, we also show the leader can enforce the same Nash profile by applying penalties on the shared state instead of penalizing the player for other players' actions to avoid the impression of unfairness.

preprint, 2021, arXiv

Convergence of Bayesian Nash Equilibrium in Infinite Bayesian Games under Discretization

We prove the existence of Bayesian Nash Equilibrium (BNE) of general-sum Bayesian games with continuous types and finite actions under the conditions that the utility functions and the prior type distributions are continuous with respect to the players' types. Moreover, there exists a sequence of discretized Bayesian games whose BNE strategies converge weakly to a BNE strategy of the infinite Bayesian game. Our proof establishes a connection between the equilibria of the infinite Bayesian game and those of finite approximations, which leads to an algorithm to construct $\varepsilon$-BNE of infinite Bayesian games by discretizing players' type spaces.

preprint, 2021, arXiv

Feedback Capacity of Parallel ACGN Channels and Kalman Filter: Power Allocation with Feedback

In this paper, we relate the feedback capacity of parallel additive colored Gaussian noise (ACGN) channels to a variant of the Kalman filter. By doing so, we obtain lower bounds on the feedback capacity of such channels, as well as the corresponding feedback (recursive) coding schemes, which are essentially power allocation policies with feedback, to achieve the bounds. The results are seen to reduce to existing lower bounds in the case of a single ACGN feedback channel, whereas when it comes to parallel additive white Gaussian noise (AWGN) channels with feedback, the recursive coding scheme reduces to a feedback "water-filling" power allocation policy.
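The "water-filling" policy mentioned at the end of this abstract has a well-known non-feedback counterpart, sketched below for parallel AWGN channels with a total power budget. This is the classical allocation, shown only to make the term concrete; it is not the paper's recursive feedback coding scheme, and the function names `water_filling` and `capacity_bits` are hypothetical.

```python
import math


def water_filling(noise_vars, total_power):
    """Classical water-filling over parallel AWGN channels.

    noise_vars: positive noise variances, one per channel.
    total_power: non-negative total power budget.
    Returns (water_level, powers) with powers[i] = max(level - noise_vars[i], 0).
    """
    sorted_noise = sorted(noise_vars)
    level = sorted_noise[0]
    # Try the k quietest channels, from all of them down to one, until the
    # implied water level lies above the noisiest channel still in use.
    for k in range(len(sorted_noise), 0, -1):
        level = (total_power + sum(sorted_noise[:k])) / k
        if level > sorted_noise[k - 1]:
            break
    powers = [max(level - n, 0.0) for n in noise_vars]
    return level, powers


def capacity_bits(noise_vars, powers):
    """Sum of per-channel rates 0.5 * log2(1 + P_i / N_i), in bits per use."""
    return sum(0.5 * math.log2(1.0 + p / n)
               for p, n in zip(powers, noise_vars))


if __name__ == "__main__":
    level, powers = water_filling([1.0, 2.0, 3.0], 3.0)
    print(level, powers, capacity_bits([1.0, 2.0, 3.0], powers))
```

With noise variances 1, 2, 3 and a budget of 3, the water level is 3, so the noisiest channel receives no power and the other two receive 2 and 1.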

preprint, 2021, arXiv

On the Equilibrium Elicitation of Markov Games Through Information Design

This work considers a novel information design problem and studies how crafting payoff-relevant environmental signals alone can influence the behaviors of intelligent agents. The agents' strategic interactions are captured by an incomplete-information Markov game, in which each agent first selects one environmental signal from multiple signal sources as additional payoff-relevant information and then takes an action. There is a rational information designer (designer) who possesses one signal source and aims to control the equilibrium behaviors of the agents by designing the information structure of her signals sent to the agents. An obedient principle is established which states that it is without loss of generality to focus on the direct information design when the information design incentivizes each agent to select the signal sent by the designer, such that the design process avoids the predictions of the agents' strategic selection behaviors. We then introduce the design protocol given a goal of the designer referred to as obedient implementability (OIL) and characterize the OIL in a class of obedient perfect Bayesian Markov Nash equilibria (O-PBME). A new framework for information design is proposed based on an approach of maximizing the optimal slack variables. Finally, we formulate the designer's goal selection problem and characterize it in terms of information design by establishing a relationship between the O-PBME and the Bayesian Markov correlated equilibria, in which we build upon the revelation principle in classic information design in economics. The proposed approach can be applied to elicit desired behaviors of multi-agent systems in competing as well as cooperating settings and be extended to heterogeneous stochastic games in the complete- and the incomplete-information environments.

preprint, 2021, arXiv

Relativistic Control: Feedback Control of Relativistic Dynamics

Strictly speaking, Newton's second law of motion is only an approximation of the so-called relativistic dynamics, i.e., Einstein's modification of the second law based on his theory of special relativity. Although the approximation is almost exact when the velocity of the dynamical system is far less than the speed of light, the difference will become larger and larger (and will eventually go to infinity) as the velocity approaches the speed of light. Correspondingly, feedback control of such dynamics should also take this modification into consideration (though it will render the system nonlinear), especially when the velocity is relatively large. Towards this end, we start this note by studying the state-space representation of the relativistic dynamics. We then investigate how to employ the feedback linearization approach for such relativistic dynamics, based upon which an additional linear controller may then be designed. As such, the feedback linearization together with the linear controller compose the overall relativistic feedback control law. We also provide discussions on, e.g., controllability, state feedback and output feedback, as well as PID control, in the relativistic setting.
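A minimal worked sketch of the idea, assuming one-dimensional motion with position $x$, velocity $v$, rest mass $m_0$, applied force $u$ as the control input, and speed of light $c$ (the notation is chosen here for illustration and may differ from the note's own formulation):

$$
\frac{d}{dt}\bigl(\gamma(v)\, m_0\, v\bigr) = u,
\qquad
\gamma(v) = \left(1 - \frac{v^2}{c^2}\right)^{-1/2}.
$$

Since $\frac{d}{dt}\bigl(\gamma(v)\, v\bigr) = \gamma(v)^3\, \dot v$ in one dimension, the state-space form is

$$
\dot x = v,
\qquad
\dot v = \frac{1}{m_0}\left(1 - \frac{v^2}{c^2}\right)^{3/2} u .
$$

Choosing $u = m_0\, \gamma(v)^3\, w$ cancels the nonlinearity and leaves the double integrator $\dot x = v$, $\dot v = w$, for which a linear controller for $w$ can be designed. For $|v| \ll c$, $\gamma(v) \approx 1$ and the Newtonian law $m_0 \dot v = u$ is recovered.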

preprint, 2021, arXiv

Relativistic Rocket Control (Relativistic Space-Travel Flight Control): Feedback Control of Relativistic Dynamics Propelled by Ejecting Mass

In this short note, we investigate the feedback control of relativistic dynamics propelled by mass ejection, modeling, e.g., the relativistic rocket control or the relativistic (space-travel) flight control. As an extreme case, we also examine the control of relativistic photon rockets which are propelled by ejecting photons.

preprint, 2021, arXiv

Self-Triggered Markov Decision Processes

In this paper, we study Markov Decision Processes (MDPs) with self-triggered strategies, where the idea of self-triggered control is extended to more generic MDP models. This extension makes self-triggering policies applicable to a broader range of systems. We study the co-design problems of the control policy and the triggering policy to optimize two pre-specified cost criteria. The first cost criterion is introduced by incorporating a pre-specified update penalty into the traditional MDP cost criteria to reduce the use of communication resources. Under this criterion, a novel dynamic programming (DP) equation, called the DP equation with optimized lookahead, is proposed to solve for the self-triggering policy. The second self-triggering policy is to maximize the triggering time while still guaranteeing a pre-specified level of sub-optimality. Theoretical underpinnings are established for the computation and implementation of both policies. Through a gridworld numerical example, we illustrate the two policies' effectiveness in reducing resource consumption and demonstrate the trade-offs between resource consumption and system performance.

preprint, 2021, arXiv

The Spectral-Domain $\mathcal{W}_2$ Wasserstein Distance for Elliptical Processes and the Spectral-Domain Gelbrich Bound

In this short note, we introduce the spectral-domain $\mathcal{W}_2$ Wasserstein distance for elliptical stochastic processes in terms of their power spectra. We also introduce the spectral-domain Gelbrich bound for processes that are not necessarily elliptical.
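For context, the classical finite-dimensional Gelbrich bound, which this note carries over to the spectral domain, can be stated as follows for two distributions $\mu_1, \mu_2$ on $\mathbb{R}^n$ with means $m_1, m_2$ and covariance matrices $\Sigma_1, \Sigma_2$ (the symbols are chosen here for illustration; the spectral-domain expressions themselves are given in the note and are not reproduced here):

$$
\mathcal{W}_2^2(\mu_1, \mu_2) \;\ge\; \lVert m_1 - m_2 \rVert_2^2
+ \operatorname{tr}\!\left( \Sigma_1 + \Sigma_2 - 2\left( \Sigma_1^{1/2}\, \Sigma_2\, \Sigma_1^{1/2} \right)^{1/2} \right).
$$

Equality holds when $\mu_1$ and $\mu_2$ are both Gaussian, in which case the right-hand side is the closed-form squared $\mathcal{W}_2$ distance.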

preprint, 2020, arXiv

A Receding-Horizon MDP Approach for Performance Evaluation of Moving Target Defense in Networks

In this paper, we study the problem of assessing the effectiveness of a proactive defense-by-detection policy with a network-based moving target defense. We model the network system using a probabilistic attack graph--a graphical security model. Given a network system with a proactive defense strategy, an intelligent attacker needs to perform reconnaissance repeatedly to learn about the locations of intrusion detection systems and re-plan optimally to reach the target while avoiding detection. To compute the attacker's strategy for security evaluation, we develop a receding-horizon planning algorithm using a risk-sensitive Markov decision process with a time-varying reward function. Finally, we implement both defense and attack strategies in a synthetic network and analyze how the frequency of network randomization and the number of detection systems can influence the success rate of the attacker. This study provides insights for designing proactive defense strategies against online and multi-stage attacks by a resourceful attacker.

preprint, 2020, arXiv

Deceptive Kernel Function on Observations of Discrete POMDP

This paper studies deception applied to an agent in a partially observable Markov decision process (POMDP). We introduce a deceptive kernel function (the kernel) applied to the agent's observations in a discrete POMDP. For three characteristic algorithms used by the agent (value iteration, value function approximation, and POMCP), we analyze how its belief is misled by the falsified observations output by the kernel and anticipate the probable threat to the agent's reward and potentially other performance measures. We validate our expectation and explore more detrimental effects of the deception by experimenting on two POMDP problems. The results show that the kernel applied to the agent's observations can affect its belief and substantially lower its resulting rewards; meanwhile, certain implementations of the kernel can induce other abnormal behaviors by the agent.

preprint2020arXiv

Differentially Private Collaborative Intrusion Detection Systems For VANETs

The vehicular ad hoc network (VANET) is an enabling technology in modern transportation systems for providing safety and valuable information, yet it is vulnerable to a number of attacks, from passive eavesdropping to active interference. Intrusion detection systems (IDSs) are important devices that can mitigate the threats by detecting malicious behaviors. Furthermore, collaboration among vehicles in VANETs can improve the detection accuracy by communicating their experiences between nodes. To this end, distributed machine learning is a suitable framework for the design of scalable and implementable collaborative detection algorithms over VANETs. One fundamental barrier to collaborative learning is the privacy concern as nodes exchange data among them. A malicious node can obtain sensitive information of other nodes by inferring from the observed data. In this paper, we propose a privacy-preserving machine-learning based collaborative IDS (PML-CIDS) for VANETs. The proposed algorithm applies the alternating direction method of multipliers (ADMM) to a class of empirical risk minimization (ERM) problems and trains a classifier to detect the intrusions in the VANETs. We use differential privacy to capture the privacy notion of the PML-CIDS and propose a method of dual variable perturbation to provide dynamic differential privacy. We analyze theoretical performance and characterize the fundamental tradeoff between the security and privacy of the PML-CIDS. We also conduct numerical experiments using the NSL-KDD dataset to corroborate the results on the detection accuracy, security-privacy tradeoffs, and design.

preprint2020arXiv

Distributed Stabilization of Two Interdependent Markov Jump Linear Systems with Partial Information

In this paper, we study the stabilization of two interdependent Markov jump linear systems (MJLSs) with partial information, where the interdependency arises as the transition of the mode of one system depends on the states of the other system. First, we formulate a framework for the two interdependent MJLSs to capture the interactions between various entities in the system, where the modes of the system cannot be observed directly. Instead, a signal which contains information of the modes can be obtained. Then, depending on the scope of the available system state information (global or local), we design centralized and distributed controllers, respectively, that can stochastically stabilize the overall interdependent MJLS. In addition, the sufficient stabilization conditions for the system under both types of information structure are derived. Finally, we provide a numerical example to illustrate the effectiveness of the designed controllers.

preprint2020arXiv

Implementability of Honest Multi-Agent Sequential Decision-Making with Dynamic Population

We study the design of a decision-making mechanism for resource allocation over a multi-agent system in a dynamic environment. Agents' privately observed preferences over resources evolve over time, and the population is dynamic due to the adoption of stopping rules. The proposed model designs the rules of encounter for agents participating in the dynamic mechanism by specifying an allocation rule and three payment rules to elicit agents' coupled decisions of honest preference reporting and optimal stopping over multiple periods. The mechanism provides a special posted-price payment rule that depends only on each agent's realized stopping time to directly influence the population dynamics. This letter focuses on the theoretical implementability of the rules in perfect Bayesian Nash equilibrium and characterizes necessary and sufficient conditions to guarantee agents' honest equilibrium behaviors over periods. We provide the design principles to construct the payments in terms of the allocation rules and identify the restrictions of the designer's ability to influence the population dynamics. The established conditions reduce the designer's problem of finding multiple rules to that of determining an optimal allocation rule.

preprint2020arXiv

Infinite-Horizon Linear-Quadratic-Gaussian Control with Costly Measurements

In this paper, we consider an infinite-horizon Linear-Quadratic-Gaussian control problem with controlled and costly measurements. A control strategy and a measurement strategy are co-designed to optimize the trade-off among control performance, actuating costs, and measurement costs. We address the co-design and co-optimization problem by establishing a dynamic programming equation with controlled lookahead. By leveraging the dynamic programming equation, we fully characterize the optimal control strategy and the measurement strategy analytically. The optimal control is linear in the state estimate that depends on the measurement strategy. We prove that the optimal measurement strategy is independent of the measured state and is periodic, and that the optimal period length is determined by the cost of measurements and system parameters. We demonstrate the potential application of the co-design and co-optimization problem in an optimal self-triggered control paradigm. Two examples are provided to show the effectiveness of the optimal measurement strategy in reducing the overhead of measurements while maintaining system performance.

preprint2020arXiv

Manipulating Reinforcement Learning: Poisoning Attacks on Cost Signals

This chapter studies emerging cyber-attacks on reinforcement learning (RL) and introduces a quantitative approach to analyze the vulnerabilities of RL. Focusing on adversarial manipulation on the cost signals, we analyze the performance degradation of TD($λ$) and $Q$-learning algorithms under the manipulation. For TD($λ$), the approximation learned from the manipulated costs has an approximation error bound proportional to the magnitude of the attack. The effect of the adversarial attacks on the bound does not depend on the choice of $λ$. In $Q$-learning, we show that $Q$-learning algorithms converge under stealthy attacks and bounded falsifications on cost signals. We characterize the relation between the falsified cost and the $Q$-factors as well as the policy learned by the learning agent which provides fundamental limits for feasible offensive and defensive moves. We propose a robust region in terms of the cost within which the adversary can never achieve the targeted policy. We provide conditions on the falsified cost which can mislead the agent to learn an adversary's favored policy. A case study of TD($λ$) learning is provided to corroborate the results.
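
The sensitivity of $Q$-factors to bounded cost falsification can be illustrated with a small sketch. It uses synchronous $Q$-factor iteration rather than sampled $Q$-learning, and checks the standard contraction-based bound that the sup-norm gap between the two sets of $Q$-factors is at most the attack magnitude divided by $1-\gamma$. This is a textbook property, not necessarily the characterization derived in the chapter; the two-state MDP and all numbers are hypothetical.

```python
def q_iteration(costs, trans, gamma, sweeps=500):
    """Synchronous Q-factor iteration for a cost-minimizing MDP.

    costs[s][a] is the stage cost; trans[s][a][t] is P(t | s, a).
    """
    n_states, n_actions = len(costs), len(costs[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        v = [min(row) for row in q]  # greedy cost-to-go per state
        q = [[costs[s][a] + gamma * sum(p * v[t] for t, p in enumerate(trans[s][a]))
              for a in range(n_actions)]
             for s in range(n_states)]
    return q


def sup_gap(first, second):
    """Largest entrywise absolute difference between two tables."""
    return max(abs(x - y) for r1, r2 in zip(first, second) for x, y in zip(r1, r2))


# Hypothetical two-state, two-action MDP.
GAMMA = 0.9
TRANS = [
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.9, 0.1]],
]
TRUE_COSTS = [[1.0, 2.0], [0.5, 1.5]]
# Adversary shifts each cost by at most 0.3 in magnitude.
FALSIFIED_COSTS = [[1.3, 1.7], [0.5, 1.8]]

q_true = q_iteration(TRUE_COSTS, TRANS, GAMMA)
q_fake = q_iteration(FALSIFIED_COSTS, TRANS, GAMMA)
attack_size = sup_gap(TRUE_COSTS, FALSIFIED_COSTS)
bound = attack_size / (1.0 - GAMMA)
```

Comparing `sup_gap(q_true, q_fake)` with `bound` shows how far a bounded falsification can move the learned $Q$-factors in this toy setting; whether the greedy policy changes depends on how the shifted costs reorder the entries in each row.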

preprint2020arXiv

Modeling and Assessment of IoT Supply Chain Security Risks: The Role of Structural and Parametric Uncertainties

Supply chain security threats pose new challenges to security risk modeling techniques for complex ICT systems such as the IoT. With established techniques drawn from attack trees and reliability analysis providing needed points of reference, graph-based analysis can provide a framework for considering the role of suppliers in such systems. We present such a framework here while highlighting the need for a component-centered model. Given resource limitations when applying this model to existing systems, we study various classes of uncertainties in model development, including structural uncertainties and uncertainties in the magnitude of estimated event probabilities. Using case studies, we find that structural uncertainties constitute a greater challenge to model utility and as such should receive particular attention. Best practices in the face of these uncertainties are proposed.

preprint2020arXiv

Optimal Control of Joint Multi-Virus Infection and Information Spreading

Nowadays, epidemic models provide an appropriate tool for describing the propagation of biological viruses in human or animal populations, of rumours and other kinds of information in social networks, and of malware in both computer and ad hoc networks. Commonly, there exist multiple types of malware infecting a network of computing devices, or different messages can spread over the social network. Information spreading and virus propagation are interdependent processes. To capture such interdependencies, we integrate two epidemic models into one holistic framework, known as the modified Susceptible-Warned-Infected-Recovered-Susceptible (SWIRS) model. The first epidemic model describes the information spreading regarding the risk of malware attacks and possible preventive procedures. The second one describes the propagation of multiple viruses over the network of devices. To minimize the impact of the virus spreading and improve the protection of the networks, we consider an optimal control problem with two types of control strategies: information spreading among healthy nodes and the treatment of infected nodes. We obtain the structure of optimal control strategies and study the condition of epidemic outbreaks. The main results are extended to the case of the network of two connected clusters. Numerical examples are used to corroborate the theoretical findings.

preprint2020arXiv

Optimal Two-Sided Market Mechanism Design for Large-Scale Data Sharing and Trading in Massive IoT Networks

The development of the Internet of Things (IoT) generates a significant amount of data that contains valuable knowledge for system operations and business opportunities. Since the data is the property of the IoT data owners, the access to the data requires permission from the data owners, which gives rise to a potential market opportunity for the IoT data sharing and trading to create economic values and market opportunities for both data owners and buyers. In this work, we leverage optimal mechanism design theory to develop a monopolist matching platform for data trading over massive IoT networks. The proposed mechanism is composed of a pair of matching and payment rules for each side of the market. We analyze the incentive compatibility of the market and characterize the optimal mechanism with a class of cut-off matching rules for both welfare-maximization and revenue-maximization mechanisms and study three matching behaviors including complete-matched, bottom-eliminated, and top-reserved.

preprint2020arXiv

QoE Based Revenue Maximizing Dynamic Resource Allocation and Pricing for Fog-Enabled Mission-Critical IoT Applications

Fog computing is becoming a vital component for Internet of Things (IoT) applications, acting as their computational engine. Mission-critical IoT applications are highly sensitive to latency, which depends on the physical location of the cloud server. Fog nodes of varying response rates are available to the cloud service provider (CSP), which faces the challenge of forwarding the sequentially received IoT data to one of the fog nodes for processing. Since the arrival times and nature of requests are random, it is important to optimally classify the requests in real time and allocate available virtual machine instances (VMIs) at the fog nodes to provide a high QoE to the users and consequently generate higher revenues for the CSP. In this paper, we use a pricing policy based on the QoE of the applications as a result of the allocation and obtain an optimal dynamic allocation rule based on the statistical information of the computational requests. The developed solution is statistically optimal, dynamic, and implementable in real time, as opposed to other static matching schemes in the literature. The performance of the proposed framework has been evaluated using simulations, and the results show significant improvement as compared with benchmark schemes.

preprint2020arXiv

Security of Distributed Machine Learning: A Game-Theoretic Approach to Design Secure DSVM

Distributed machine learning algorithms play a significant role in processing massive data sets over large networks. However, the increasing reliance of machine learning on information and communication technologies (ICTs) makes it inherently vulnerable to cyber threats. This work aims to develop secure distributed algorithms to protect the learning from data poisoning and network attacks. We establish a game-theoretic framework to capture the conflicting goals of a learner who uses distributed support vector machines (SVMs) and an attacker who is capable of modifying training data and labels. We develop a fully distributed and iterative algorithm to capture real-time reactions of the learner at each node to adversarial behaviors. The numerical results show that distributed SVM is prone to fail in different types of attacks, and their impact has a strong dependence on the network structure and attack capabilities.

preprint2019arXiv

Cyber Insurance

This chapter will first present a principal-agent game-theoretic model to capture the interactions between one insurer and one user. The insurer is deemed the principal, who does not have complete information about the user's security policies. The user, which refers to the infrastructure operator or the customer, implements his local protection and pays a premium to the insurer. The insurer designs an incentive-compatible insurance mechanism that includes the premium and the coverage policy, while the user determines whether to participate in the insurance and his effort to defend against attacks. The chapter will also focus on an attack-aware cyber insurance model by introducing the adversarial behaviors into the framework. The behavior of an attacker determines the type of cyber threats, e.g., denial of service (DoS) attacks, data breaches, phishing and spoofing. The distinction of threat types plays a role in determining the type of losses and the coverage policies. Data breaches can lead not only to financial losses but also to damage to reputation. The coverage may only cover a certain agreed percentage of the financial losses.

preprint2019arXiv

PhD Forum: Enabling Autonomic IoT for Smart Urban Services

The development of autonomous cyber-physical systems (CPS) and advances towards the fifth generation (5G) of wireless technology are promising to revolutionize many industry verticals such as Healthcare, Transportation, Energy, Retail Services, Building Automation, Education, etc., leading to the realization of the smart city paradigm. The Internet of Things (IoT) enables powerful and unprecedented capabilities for intelligent and autonomous operation. We leverage ideas from Network Science, Optimization & Decision Theory, Incentive Mechanism Design, and Data Science/Machine Learning to achieve key design goals in IoT-enabled urban systems, such as efficiency, security & resilience, and economics.

preprint2016arXiv

A Game-Theoretic Framework for Resilient and Distributed Generation Control of Renewable Energies in Microgrids

The integration of microgrids that depend on the renewable distributed energy resources with the current power systems is a critical issue in the smart grid. In this paper, we propose a non-cooperative game-theoretic framework to study the strategic behavior of distributed microgrids that generate renewable energies and characterize the power generation solutions by using the Nash equilibrium concept. Our framework not only incorporates economic factors but also takes into account the stability and efficiency of the microgrids, including the power flow constraints and voltage angle regulations. We develop two decentralized update schemes for microgrids and show their convergence to a unique Nash equilibrium. Also, we propose a novel fully distributed PMU-enabled algorithm which only needs the information of voltage angle at the bus. To show the resiliency of the distributed algorithm, we introduce two failure models of the smart grid. Case studies based on the IEEE 14-bus system are used to corroborate the effectiveness and resiliency of the proposed algorithms.

preprint2016arXiv

A Stackelberg Game Perspective on the Conflict Between Machine Learning and Data Obfuscation

Data is the new oil; this refrain is repeated extensively in the age of internet tracking, machine learning, and data analytics. Social network analysis, cookie-based advertising, and government surveillance are all evidence of the use of data for commercial and national interests. Public pressure, however, is mounting for the protection of privacy. Frameworks such as differential privacy offer machine learning algorithms methods to guarantee limits to information disclosure, but they are seldom implemented. Recently, however, developers have made significant efforts to undermine tracking through obfuscation tools that hide user characteristics in a sea of noise. These services highlight an emerging clash between tracking and data obfuscation. In this paper, we conceptualize this conflict through a dynamic game between users and a machine learning algorithm that uses empirical risk minimization. First, a machine learner declares a privacy protection level, and then users respond by choosing their own perturbation amounts. We study the interaction between the users and the learner using a Stackelberg game. The utility functions quantify accuracy using expected loss and privacy in terms of the bounds of differential privacy. In equilibrium, we find selfish users tend to cause significant utility loss to trackers by perturbing heavily, in a phenomenon reminiscent of public good games. Trackers, however, can improve the balance by proactively perturbing the data themselves. While other work in this area has studied privacy markets and mechanism design for truthful reporting of user information, we take a different viewpoint by considering both user and learner perturbation.

preprint2016arXiv

Coding Schemes for Securing Cyber-Physical Systems Against Stealthy Data Injection Attacks

This paper considers a method of coding the sensor outputs in order to detect stealthy false data injection attacks. An intelligent attacker can design a sequence of data injections to sensors and actuators that passes the state estimator and statistical fault detector, based on knowledge of the system parameters. To stay undetected, the injected data should increase the state estimation errors while keeping the estimation residues small. We employ a coding matrix to change the original sensor outputs to increase the estimation residues under intelligent data injection attacks. This is a low-cost method compared with encryption schemes over all sensor measurements in communication networks. We show the conditions of a feasible coding matrix under the assumption that the attacker does not have knowledge of the exact coding matrix. An algorithm is developed to compute a feasible coding matrix, and we show that, in general, multiple feasible coding matrices exist. To defend against attackers who estimate the coding matrix via sensor and actuator measurements, time-varying coding matrices are designed according to the detection requirements. A heuristic algorithm to decide the time length of updating a coding matrix is then proposed.

preprint2016arXiv

Dynamic Privacy For Distributed Machine Learning Over Network

Privacy-preserving distributed machine learning is becoming increasingly important due to the recent rapid growth of data. This paper focuses on a class of regularized empirical risk minimization (ERM) machine learning problems, and develops two methods to provide differential privacy to distributed learning algorithms over a network. We first decentralize the learning algorithm using the alternating direction method of multipliers (ADMM), and propose the methods of dual variable perturbation and primal variable perturbation to provide dynamic differential privacy. The two mechanisms lead to algorithms that can provide privacy guarantees under mild conditions of the convexity and differentiability of the loss function and the regularizer. We study the performance of the algorithms, and show that the dual variable perturbation outperforms its primal counterpart. To design optimal privacy mechanisms, we analyze the fundamental tradeoff between privacy and accuracy, and provide guidelines to choose privacy parameters. Numerical experiments using a customer information database are performed to corroborate the results on privacy and utility tradeoffs and design.

preprint2016arXiv

Interdependent Network Formation Games

Designing optimal interdependent networks is important for the robustness and efficiency of national critical infrastructures. Here, we establish a two-person game-theoretic model in which two network designers choose to maximize the global connectivity independently. This framework enables decentralized network design by using iterative algorithms. After a finite number of steps, the algorithm will converge to a Nash equilibrium, and yields the equilibrium topology of the network. We corroborate our results by using numerical experiments, and compare the Nash equilibrium solutions with their team solution counterparts. The experimental results of the game method and the team method provide design guidelines to increase the efficiency of the interdependent network formation games. Moreover, the proposed game framework can be generally applied to a diverse number of applications, including power system networks and international air transportation networks.

preprint2016arXiv

Resilient and Decentralized Control of Multi-level Cooperative Mobile Networks to Maintain Connectivity under Adversarial Environment

Network connectivity plays an important role in the information exchange between different agents in the multi-level networks. In this paper, we establish a game-theoretic framework to capture the uncoordinated nature of the decision-making at different layers of the multi-level networks. Specifically, we design a decentralized algorithm that aims to maximize the algebraic connectivity of the global network iteratively. In addition, we show that the designed algorithm converges to a Nash equilibrium asymptotically and yields an equilibrium network. To study the network resiliency, we introduce three adversarial attack models and characterize their worst-case impacts on the network performance. Case studies based on a two-layer mobile robotic network are used to corroborate the effectiveness and resiliency of the proposed algorithm and show the interdependency between different layers of the network during the recovery processes.

preprint2016arXiv

Two-Party Privacy Games: How Users Perturb When Learners Preempt

Internet tracking technologies and wearable electronics provide a vast amount of data to machine learning algorithms. This stock of data stands to increase with the developments of the internet of things and cyber-physical systems. Clearly, these technologies promise benefits. But they also raise the risk of sensitive information disclosure. To mitigate this risk, machine learning algorithms can add noise to outputs according to the formulations provided by differential privacy. At the same time, users can fight for privacy by injecting noise into the data that they report. In this paper, we conceptualize the interactions between privacy and accuracy and between user (input) perturbation and learner (output) perturbation in machine learning, using the frameworks of empirical risk minimization, differential privacy, and Stackelberg games. In particular, we solve for the Stackelberg equilibrium for the case of an averaging query. We find that, in equilibrium, either the users perturb their data before submission or the learner perturbs the machine learning output, but never both. Specifically, the learner perturbs if and only if the number of users is greater than a threshold which increases with the degree to which incentives are misaligned. Provoked by these conclusions - and by some observations from privacy ethics - we also suggest future directions. While other work in this area has studied privacy markets and mechanism design for truthful reporting of user information, we take a different viewpoint by considering both user and learner perturbation. We hope that this effort will open the door to future work in the area of differential privacy games.
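
The two perturbation points discussed above, user (input) noise and learner (output) noise on an averaging query, can be sketched as follows. The abstract does not specify the noise distribution or the utility functions, so the Laplace mechanism, the clipping range, and every function name here are assumptions made for illustration only.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def user_perturbed_reports(values, scale, rng):
    """Input perturbation: each user adds noise before reporting."""
    return [v + laplace_noise(scale, rng) for v in values]


def learner_perturbed_average(values, lower, upper, epsilon, rng):
    """Output perturbation: the learner adds noise to the average.

    Values are clipped to [lower, upper], so replacing one record changes
    the mean by at most (upper - lower) / n; the Laplace scale is that
    sensitivity divided by epsilon.
    """
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    scale = (upper - lower) / (n * epsilon)
    return sum(clipped) / n + laplace_noise(scale, rng), scale


rng = random.Random(0)
data = [0.2, 0.4, 0.6, 0.8]  # hypothetical user values in [0, 1]
noisy_reports = user_perturbed_reports(data, 0.1, rng)
noisy_mean, output_scale = learner_perturbed_average(data, 0.0, 1.0, 0.5, rng)
```

Because the output noise scale shrinks as the number of users grows, learner-side perturbation becomes cheaper in accuracy terms for large populations, which is at least consistent with the threshold behaviour described in the abstract.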

preprint2015arXiv

Deception by Design: Evidence-Based Signaling Games for Network Defense

Deception plays a critical role in the financial industry, online markets, national defense, and countless other areas. Understanding and harnessing deception - especially in cyberspace - is both crucial and difficult. Recent work in this area has used game theory to study the roles of incentives and rational behavior. Building upon this work, we employ a game-theoretic model for the purpose of mechanism design. Specifically, we study a defensive use of deception: implementation of honeypots for network defense. How does the design problem change when an adversary develops the ability to detect honeypots? We analyze two models: cheap-talk games and an augmented version of those games that we call cheap-talk games with evidence, in which the receiver can detect deception with some probability. Our first contribution is this new model for deceptive interactions. We show that the model includes traditional signaling games and complete information games as special cases. We also demonstrate numerically that deception detection sometimes eliminates pure-strategy equilibria. Finally, we present the surprising result that the utility of a deceptive defender can sometimes increase when an adversary develops the ability to detect deception. These results apply concretely to network defense. They are also general enough for the large and critical body of strategic interactions that involve deception.

preprint2015arXiv

Evolutionary Poisson Games for Controlling Large Population Behaviors

Emerging applications in engineering such as crowd-sourcing and (mis)information propagation involve a large population of heterogeneous users or agents in a complex network who strategically make dynamic decisions. In this work, we establish an evolutionary Poisson game framework to capture the random, dynamic and heterogeneous interactions of agents in a holistic fashion, and design mechanisms to control their behaviors to achieve a system-wide objective. We use the antivirus protection challenge in cyber security to motivate the framework, where each user in the network can choose whether or not to adopt the software. We introduce the notion of evolutionary Poisson stable equilibrium for the game, and show its existence and uniqueness. Online algorithms are developed using the techniques of stochastic approximation coupled with the population dynamics, and they are shown to converge to the optimal solution of the controller problem. Numerical examples are used to illustrate and corroborate our results.

preprint2015arXiv

Flip the Cloud: Cyber-Physical Signaling Games in the Presence of Advanced Persistent Threats

Access to the cloud has the potential to provide scalable and cost effective enhancements of physical devices through the use of advanced computational processes run on apparently limitless cyber infrastructure. On the other hand, cyber-physical systems and cloud-controlled devices are subject to numerous design challenges; among them is that of security. In particular, recent advances in adversary technology pose Advanced Persistent Threats (APTs) which may stealthily and completely compromise a cyber system. In this paper, we design a framework for the security of cloud-based systems that specifies when a device should trust commands from the cloud which may be compromised. This interaction can be considered as a game between three players: a cloud defender/administrator, an attacker, and a device. We use traditional signaling games to model the interaction between the cloud and the device, and we use the recently proposed FlipIt game to model the struggle between the defender and attacker for control of the cloud. Because attacks upon the cloud can occur without knowledge of the defender, we assume that strategies in both games are picked according to prior commitment. This framework requires a new equilibrium concept, which we call Gestalt Equilibrium, a fixed-point that expresses the interdependence of the signaling and FlipIt games. We present the solution to this fixed-point problem under certain parameter cases, and illustrate an example application of cloud control of an unmanned vehicle. Our results contribute to the growing understanding of cloud-controlled systems.

preprint2015arXiv

Protection and Deception: Discovering Game Theory and Cyber Literacy through a Novel Board Game Experience

Cyber literacy merits serious research attention because it addresses a confluence of specialization and generalization; cybersecurity is often conceived of as approachable only by a technological intelligentsia, yet its interdependent nature demands education for a broad population. Therefore, educational tools should lead participants to discover technical knowledge in an accessible and attractive framework. In this paper, we present Protection and Deception (P&G), a novel two-player board game. P&G has three main contributions. First, it builds cyber literacy by giving participants "hands-on" experience with game pieces that have the capabilities of cyber-attacks such as worms, masquerading attacks/spoofs, replay attacks, and Trojans. Second, P&G teaches the important game-theoretic concepts of asymmetric information and resource allocation implicitly and non-obtrusively through its game play. Finally, it strives for the important objective of security education for underrepresented minorities and people without explicit technical experience. We tested P&G at a community center in Manhattan with middle- and high school students, and observed enjoyment and increased cyber literacy along with suggestions for improvement of the game. Together with these results, our paper also presents images of the attractive board design and 3D printed game pieces, together with a Monte-Carlo analysis that we used to ensure a balanced gaming experience.

preprint2013arXiv

Approximate dynamic programming using fluid and diffusion approximations with applications to power management

Neuro-dynamic programming is a class of powerful techniques for approximating the solution to dynamic programming equations. In their most computationally attractive formulations, these techniques provide the approximate solution only within a prescribed finite-dimensional function class. Thus, the question that always arises is how should the function class be chosen? The goal of this paper is to propose an approach using the solutions to associated fluid and diffusion approximations. In order to illustrate this approach, the paper focuses on an application to dynamic speed scaling for power management in computer processors.

preprint2012arXiv

Risk-Sensitive Mean Field Games

In this paper, we study a class of risk-sensitive mean-field stochastic differential games. We show that under appropriate regularity conditions, the mean-field value of the stochastic differential game with exponentiated integral cost functional coincides with the value function described by a Hamilton-Jacobi-Bellman (HJB) equation with an additional quadratic term. We provide an explicit solution of the mean-field best response when the instantaneous cost functions are log-quadratic and the state dynamics are affine in the control. An equivalent mean-field risk-neutral problem is formulated and the corresponding mean-field equilibria are characterized in terms of backward-forward macroscopic McKean-Vlasov equations, Fokker-Planck-Kolmogorov equations, and HJB equations. We provide numerical examples on the mean field behavior to illustrate both linear and McKean-Vlasov dynamics.

preprint2012arXiv

SODEXO: A System Framework for Deployment and Exploitation of Deceptive Honeybots in Social Networks

As social networking sites such as Facebook and Twitter are becoming increasingly popular, a growing number of malicious attacks, such as phishing and malware, are exploiting them. Among these attacks, social botnets have sophisticated infrastructure that leverages compromised users' accounts, known as bots, to automate the creation of new social networking accounts for spamming and malware propagation. Traditional defense mechanisms are often passive and reactive to non-zero-day attacks. In this paper, we adopt a proactive approach for enhancing security in social networks by infiltrating botnets with honeybots. We propose an integrated system named SODEXO which can be interfaced with social networking sites for creating deceptive honeybots and leveraging them for gaining information from botnets. We establish a Stackelberg game framework to capture strategic interactions between honeybots and botnets, and use quantitative methods to understand the tradeoffs of honeybots for their deployment and exploitation in social networks. We design a protection and alert system that integrates both microscopic and macroscopic models of honeybots and optimally determines the security strategies for honeybots. We corroborate the proposed mechanism with extensive simulations and comparisons with passive defenses.

preprint2011arXiv

A Constrained Evolutionary Gaussian Multiple Access Channel Game

In this paper, we formulate an evolutionary multiple access channel game with continuous-variable actions and coupled rate constraints. We characterize Nash equilibria of the game and show that the pure Nash equilibria are Pareto optimal and also resilient to deviations by coalitions of any size, i.e., they are strong equilibria. We use the concepts of price of anarchy and strong price of anarchy to study the performance of the system. The paper also addresses how to select one specific equilibrium solution using the concepts of normalized equilibrium and evolutionarily stable strategies. We examine the long-run behavior of these strategies under several classes of evolutionary game dynamics, such as Brown-von Neumann-Nash dynamics and replicator dynamics.

preprint2011arXiv

Enabling Differentiated Services Using Generalized Power Control Model in Optical Networks

This paper considers a generalized framework to study OSNR optimization-based end-to-end link-level power control problems in optical networks. We combine favorable features of the game-theoretical approach and the central cost approach to allow different service groups within the network. We develop solution concepts for both cases of empty and nonempty feasible sets. In addition, we derive a distributed iterative algorithm for different classes of users and prove its convergence. Finally, we use numerical examples to illustrate the novel framework.

preprint2011arXiv

Evolutionary Games for Multiple Access Control

In this paper, we formulate an evolutionary multiple access control game with continuous-variable actions and coupled constraints. We characterize equilibria of the game and show that the pure equilibria are Pareto optimal and also resilient to deviations by coalitions of any size, i.e., they are strong equilibria. We use the concepts of price of anarchy and strong price of anarchy to study the performance of the system. The paper also addresses how to select one specific equilibrium solution using the concepts of normalized equilibrium and evolutionarily stable strategies. We examine the long-run behavior of these strategies under several classes of evolutionary game dynamics, such as Brown-von Neumann-Nash dynamics, Smith dynamics and replicator dynamics. In addition, we examine correlated equilibrium for the single-receiver model. Correlated strategies are based on signaling structures before making decisions on rates. We then focus on evolutionary games for hybrid additive white Gaussian noise multiple access channel with multiple users and multiple receivers, where each user chooses a rate and splits it over the receivers. Users have coupled constraints determined by the capacity regions. Building upon the static game, we formulate a system of hybrid evolutionary game dynamics using G-function dynamics and Smith dynamics on rate control and channel selection, respectively. We show that the evolving game has an equilibrium and illustrate these dynamics with numerical examples.
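As a rough illustration of the replicator dynamics named in this abstract, the sketch below iterates a discrete-time (Euler) replicator update on a toy two-strategy anti-coordination game. The payoff matrix `PAYOFF`, the step size, and the function names are hypothetical choices made for illustration; the paper's own model uses continuous rate actions and coupled capacity constraints, which this sketch does not attempt to reproduce.

```python
def replicator_step(x, payoff, dt):
    """One Euler step of the replicator dynamics x_i' = x_i * (f_i(x) - average fitness)."""
    n = len(x)
    # Fitness of each pure strategy against the current population mix.
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    average = sum(x[i] * fitness[i] for i in range(n))
    new_x = [x[i] + dt * x[i] * (fitness[i] - average) for i in range(n)]
    total = sum(new_x)  # renormalize to guard against floating-point drift
    return [v / total for v in new_x]


def simulate(x0, payoff, dt=0.01, steps=20000):
    """Iterate the replicator update from the initial population shares x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x


# Hypothetical anti-coordination payoffs: each strategy does better when rare.
PAYOFF = [[0.0, 2.0],
          [1.0, 0.0]]

shares = simulate([0.1, 0.9], PAYOFF)
print(shares)
```

For this particular matrix the two fitness values are equal when the first strategy holds a two-thirds share, so the population mix settles near that interior point rather than at a pure strategy.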

preprint2011arXiv

Heterogeneous Learning in Zero-Sum Stochastic Games with Incomplete Information

Learning algorithms are essential for the applications of game theory in a networking environment. In dynamic and decentralized settings where the traffic, topology and channel states may vary over time and the communication between agents is impractical, it is important to formulate and study games of incomplete information and fully distributed learning algorithms that require each agent to have only a minimal amount of information regarding the remaining agents. In this paper, we address this major challenge and introduce heterogeneous learning schemes in which each agent adopts a distinct learning pattern in the context of games with incomplete information. We use stochastic approximation techniques to show that the heterogeneous learning schemes can be studied in terms of their deterministic ordinary differential equation (ODE) counterparts. Depending on the learning rates of the players, these ODEs could be different from the standard replicator dynamics, (myopic) best response (BR) dynamics, logit dynamics, and fictitious play dynamics. We apply the results to a class of security games in which the attacker and the defender adopt different learning schemes due to differences in their rationality levels and the information they acquire.

preprint2011arXiv

Indices of Power in Optimal IDS Default Configuration: Theory and Examples

Intrusion Detection Systems (IDSs) are becoming essential to protecting modern information infrastructures. The effectiveness of an IDS is directly related to the computational resources at its disposal. However, sufficient resources are difficult to guarantee, especially with an increasing demand for network capacity and rapid proliferation of attacks. On the other hand, modern intrusions often come as sequences of attacks to reach some predefined goals. It is therefore critical to identify the best default IDS configuration to attain the highest possible overall protection within a given resource budget. This paper proposes a game-theory-based solution to the problem of optimal signature-based IDS configuration under resource constraints. We apply the concepts of indices of power, namely, Shapley value and Banzhaf-Coleman index, from cooperative game theory to quantify the influence or contribution of libraries in an IDS with respect to given attack graphs. Such valuations take into consideration the knowledge on common attack graphs and experienced system attacks and are used to configure an IDS optimally at its default state by solving a knapsack optimization problem.
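The two-stage idea in this abstract (value each library with a power index, then pick libraries under a budget) can be sketched as follows. Everything concrete here is an assumption made for illustration: the library names, the `ATTACKS` sets, the coverage-style value function `detected`, the integer costs, and the simplification that a library's knapsack worth is just its Shapley value. The paper's actual attack-graph model and valuation are not given in the abstract.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all player orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


def knapsack(items, budget):
    """0/1 knapsack by dynamic programming; items maps name -> (integer cost, worth)."""
    # best[b] holds (total worth, chosen set) achievable with total cost at most b.
    best = [(0.0, frozenset())] * (budget + 1)
    for name, (cost, worth) in items.items():
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + worth
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] | {name})
    return best[budget]


# Hypothetical data: each attack lists the libraries able to detect it.
LIBRARIES = ["web", "dns", "smtp"]
ATTACKS = [{"web"}, {"web", "dns"}, {"dns", "smtp"}, {"smtp"}]
COSTS = {"web": 3, "dns": 1, "smtp": 2}  # assumed resource cost per library


def detected(coalition):
    """Number of attacks that at least one enabled library can detect."""
    return sum(1 for attack in ATTACKS if attack & coalition)


phi = shapley_values(LIBRARIES, detected)
worth, chosen = knapsack({lib: (COSTS[lib], phi[lib]) for lib in LIBRARIES}, budget=3)
print(phi, worth, sorted(chosen))
```

Enumerating all orderings is only practical for a handful of libraries; for larger sets a sampled approximation of the Shapley value would be the usual substitute.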

preprint2011arXiv

Prices of Anarchy, Information, and Cooperation in Differential Games

The price of anarchy (PoA) has been widely used in static games to quantify the loss of efficiency due to noncooperation. Here, we extend this concept to a general differential games framework. In addition, we introduce the price of information (PoI) to characterize comparative game performances under different information structures, as well as the price of cooperation to capture the extent of benefit or loss a player accrues as a result of altruistic behavior. We further characterize PoA and PoI for a class of scalar linear quadratic differential games under open-loop and closed-loop feedback information structures. We also obtain some explicit bounds on these indices in a large population regime.

preprint2010arXiv

A Distributed Sequential Algorithm for Collaborative Intrusion Detection Networks

Collaborative intrusion detection networks are often used to gain better detection accuracy and cost efficiency as compared to a single host-based intrusion detection system (IDS). Through cooperation, it is possible for a local IDS to detect new attacks that may be known to other experienced acquaintances. In this paper, we present a sequential hypothesis testing method for feedback aggregation for each individual IDS in the network. Our simulation results corroborate our theoretical results and demonstrate the properties of cost efficiency and accuracy compared to other heuristic methods. The analytical result on the lower bound of the average number of acquaintances for consultation is essential for the design and configuration of IDSs in a collaborative environment.
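To make the notion of sequential hypothesis testing over acquaintance feedback concrete, here is a minimal sketch of Wald's classical sequential probability ratio test (SPRT) on binary reports. It is a standard textbook test, not necessarily the exact procedure of the paper. The function name `sprt`, the assumption that all acquaintances share the same false-positive rate `p0` and detection rate `p1`, and the example numbers are all hypothetical.

```python
import math


def sprt(observations, p0, p1, alpha, beta):
    """Wald's SPRT for Bernoulli reports.

    H0: P(report = 1) = p0 (no attack); H1: P(report = 1) = p1 (attack).
    alpha and beta are the target false-alarm and miss probabilities.
    Returns (decision, number of reports consumed).
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this threshold
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this threshold
    llr = 0.0                              # running log-likelihood ratio
    n = 0
    for n, report in enumerate(observations, start=1):
        if report:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", n


# Hypothetical feedback stream: 1 = acquaintance flags the traffic as an attack.
decision, consulted = sprt([1, 1, 1, 0, 1], p0=0.2, p1=0.8, alpha=0.05, beta=0.05)
print(decision, consulted)
```

The test stops as soon as the evidence crosses either threshold, which is why the number of acquaintances consulted varies from case to case instead of being fixed in advance.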

preprint2010arXiv

Dynamic Interference Minimization Routing Game for On-Demand Cognitive Pilot Channel

In this paper, we introduce a distributed dynamic routing algorithm for secondary users (SUs) to minimize their interference with the primary users (PUs) in multi-hop cognitive radio (CR) networks. We use the medial axis with a relaxation factor as a reference path which is contingent on the states of the PUs. Along the axis, we construct a hierarchical structure for multiple sources to reach cognitive pilot channel (CPC) base stations. We use a temporal and spatial dynamic non-cooperative game to model the interactions among SUs as well as their influences from PUs in the multi-hop structure of the network. Multi-stage fictitious play learning is used for distributed routing in multi-hop CR networks. We obtain a set of mixed (behavioral) Nash equilibrium strategies of the dynamic game in closed form by backward induction. The proposed algorithm minimizes the overall interference and the average packet delay along the routing path from SU nodes to CPC base stations in an optimal and distributed manner.

59 published item(s)

Decentralized No-Regret Frequency-Time Scheduling for FMCW Radar Interference Avoidance

Integrated Cyber-Physical Resiliency for Power Grids under IoT-Enabled Dynamic Botnet Attacks

A Rolling Horizon Game Considering Network Effect in Cluster Forming for Dynamic Resilient Multiagent Systems

An Introduction of System-Scientific Approaches to Cognitive Security

QoS Based Contract Design for Profit Maximization in IoT-Enabled Data Markets

A Pursuit-Evasion Differential Game with Strategic Information Acquisition

Accountability and Insurance in IoT Supply Chain

ADVERT: An Adaptive and Data-Driven Attention Enhancement Mechanism for Phishing Prevention

Autonomous and Resilient Control for Optimal LEO Satellite Constellation Coverage Against Space Threats

Bayesian Promised Persuasion: Dynamic Forward-Looking Multiagent Delegation with Informational Burning

Controlling Fake News by Tagging: A Branching Process Analysis

Multi-Agent Learning for Resilient Distributed Control Systems

On Poisoned Wardrop Equilibrium in Congestion Games

RADAMS: Resilient and Adaptive Alert and Attention Management Strategy against Informational Denial-of-Service (IDoS) Attacks

Reinforcement Learning for Linear Quadratic Control is Vulnerable Under Cost Manipulation

The Inverse Problem of Linear-Quadratic Differential Games: When is a Control Strategies Profile Nash?

Convergence of Bayesian Nash Equilibrium in Infinite Bayesian Games under Discretization

Feedback Capacity of Parallel ACGN Channels and Kalman Filter: Power Allocation with Feedback

On the Equilibrium Elicitation of Markov Games Through Information Design

Relativistic Control: Feedback Control of Relativistic Dynamics

Relativistic Rocket Control (Relativistic Space-Travel Flight Control): Feedback Control of Relativistic Dynamics Propelled by Ejecting Mass

Self-Triggered Markov Decision Processes

The Spectral-Domain $\mathcal{W}_2$ Wasserstein Distance for Elliptical Processes and the Spectral-Domain Gelbrich Bound

A Receding-Horizon MDP Approach for Performance Evaluation of Moving Target Defense in Networks

Deceptive Kernel Function on Observations of Discrete POMDP

Differentially Private Collaborative Intrusion Detection Systems For VANETs

Distributed Stabilization of Two Interdependent Markov Jump Linear Systems with Partial Information

Implementability of Honest Multi-Agent Sequential Decision-Making with Dynamic Population

Infinite-Horizon Linear-Quadratic-Gaussian Control with Costly Measurements

Manipulating Reinforcement Learning: Poisoning Attacks on Cost Signals

Modeling and Assessment of IoT Supply Chain Security Risks: The Role of Structural and Parametric Uncertainties

Optimal Control of Joint Multi-Virus Infection and Information Spreading

Optimal Two-Sided Market Mechanism Design for Large-Scale Data Sharing and Trading in Massive IoT Networks

QoE Based Revenue Maximizing Dynamic Resource Allocation and Pricing for Fog-Enabled Mission-Critical IoT Applications

Security of Distributed Machine Learning: A Game-Theoretic Approach to Design Secure DSVM

Cyber Insurance

PhD Forum: Enabling Autonomic IoT for Smart Urban Services

A Game-Theoretic Framework for Resilient and Distributed Generation Control of Renewable Energies in Microgrids

A Stackelberg Game Perspective on the Conflict Between Machine Learning and Data Obfuscation

Coding Schemes for Securing Cyber-Physical Systems Against Stealthy Data Injection Attacks

Dynamic Privacy For Distributed Machine Learning Over Network

Interdependent Network Formation Games

Resilient and Decentralized Control of Multi-level Cooperative Mobile Networks to Maintain Connectivity under Adversarial Environment

Two-Party Privacy Games: How Users Perturb When Learners Preempt

Deception by Design: Evidence-Based Signaling Games for Network Defense

Evolutionary Poisson Games for Controlling Large Population Behaviors

Flip the Cloud: Cyber-Physical Signaling Games in the Presence of Advanced Persistent Threats

Protection and Deception: Discovering Game Theory and Cyber Literacy through a Novel Board Game Experience

Approximate dynamic programming using fluid and diffusion approximations with applications to power management

Risk-Sensitive Mean Field Games

SODEXO: A System Framework for Deployment and Exploitation of Deceptive Honeybots in Social Networks

A Constrained Evolutionary Gaussian Multiple Access Channel Game

Enabling Differentiated Services Using Generalized Power Control Model in Optical Networks

Evolutionary Games for Multiple Access Control

Heterogeneous Learning in Zero-Sum Stochastic Games with Incomplete Information

Indices of Power in Optimal IDS Default Configuration: Theory and Examples

Prices of Anarchy, Information, and Cooperation in Differential Games

A Distributed Sequential Algorithm for Collaborative Intrusion Detection Networks

Dynamic Interference Minimization Routing Game for On-Demand Cognitive Pilot Channel