Researcher profile

Yuanzhang Xiao

Yuanzhang Xiao contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
22 works
0 followers
8 topics
4 close collaborators

Published work

22 published items

preprint · 2026 · arXiv

Integrating Large Language Models into Recommendation via Mutual Augmentation and Adaptive Aggregation

Conventional recommendation methods have achieved notable advancements by harnessing collaborative or sequential information from user behavior. Recently, large language models (LLMs) have gained prominence for their capabilities in understanding and reasoning over textual semantics, and have found utility in various domains, including recommendation. Conventional recommendation methods and LLMs each have their strengths and weaknesses. While conventional methods excel at mining collaborative information and modeling sequential behavior, they struggle with data sparsity and the long-tail problem. LLMs, on the other hand, are proficient at utilizing rich textual contexts but face challenges in mining collaborative or sequential information. Despite their individual successes, there is a significant gap in leveraging their combined potential to enhance recommendation performance. In this paper, we introduce a general and model-agnostic framework known as Large language model with mutual augmentation and adaptive aggregation for Recommendation (Llama4Rec). Llama4Rec synergistically combines conventional and LLM-based recommendation models. Llama4Rec proposes data augmentation and prompt augmentation strategies tailored to enhance the conventional model and LLM respectively. An adaptive aggregation module is adopted to combine the predictions of both kinds of models to refine the final recommendation results. Empirical studies on three real-world datasets validate the superiority of Llama4Rec, demonstrating its consistent outperformance of baseline methods and significant improvements in recommendation performance.

preprint · 2022 · arXiv

HySAGE: A Hybrid Static and Adaptive Graph Embedding Network for Context-Drifting Recommendations

The recent popularity of edge devices and the Artificial Intelligence of Things (AIoT) has driven a new wave of contextual recommendations, such as location-based Point of Interest (PoI) recommendations and computing resource-aware mobile app recommendations. In many such recommendation scenarios, contexts drift over time. For example, in mobile game recommendation, contextual features such as the location, battery level, and storage level of a mobile device drift frequently. However, most existing graph-based collaborative filtering methods are designed under the assumption of static features. They would therefore require frequent retraining and/or yield graphical models that burgeon in size, impeding their suitability for context-drifting recommendations. In this work, we propose a tailor-made Hybrid Static and Adaptive Graph Embedding (HySAGE) network for context-drifting recommendations. Our key idea is to disentangle the relatively static user-item interaction from the rapidly drifting contextual features. Specifically, our proposed HySAGE network learns a relatively static graph embedding from user-item interaction and an adaptive embedding from drifting contextual features. These embeddings are incorporated into an interest network to generate the user interest in a given context. We adopt an interactive attention module to learn the interactions among static graph embeddings, adaptive contextual embeddings, and user interest, helping to achieve a better final representation. Extensive experiments on real-world datasets demonstrate that HySAGE significantly improves the performance of the existing state-of-the-art recommendation algorithms.

preprint · 2022 · arXiv

Personalized Federated Recommendation via Joint Representation Learning, User Clustering, and Model Adaptation

Federated recommendation applies federated learning techniques in recommendation systems to help protect user privacy by exchanging models instead of raw user data between user devices and the central server. Due to the heterogeneity in users' attributes and local data, attaining personalized models is critical to improving federated recommendation performance. In this paper, we propose a Graph Neural Network based Personalized Federated Recommendation (PerFedRec) framework via joint representation learning, user clustering, and model adaptation. Specifically, we construct a collaborative graph and incorporate attribute information to jointly learn the representation through a federated GNN. Based on these learned representations, we cluster users into different user groups and learn personalized models for each cluster. Then each user learns a personalized model by combining the global federated model, the cluster-level federated model, and the user's fine-tuned local model. To alleviate the heavy communication burden, we intelligently select a few representative users (instead of randomly picked users) from each cluster to participate in training. Experiments on real-world datasets show that our proposed method achieves superior performance over existing methods.

preprint · 2022 · arXiv

Towards Communication Efficient and Fair Federated Personalized Sequential Recommendation

Federated recommendation leverages federated learning (FL) techniques to make privacy-preserving recommendations. Despite the recent success of federated recommender systems, several vital challenges remain to be addressed: (i) the majority of federated recommendation models consider only model performance and privacy preservation, while ignoring the optimization of the communication process; (ii) most federated recommenders are designed for heterogeneous systems, causing unfairness problems during the federation process; (iii) personalization techniques have been less explored in many federated recommender systems. In this paper, we propose a Communication-efficient and Fair Federated personalized Sequential Recommendation algorithm (CF-FedSR) to tackle these challenges. CF-FedSR introduces a communication-efficient scheme that employs adaptive client selection and clustering-based sampling to accelerate the training process. A fairness-aware model aggregation algorithm is proposed that adaptively captures the data and performance imbalance among different clients to address the unfairness problems. The personalization module assists clients in making personalized recommendations and boosts the recommendation performance via local fine-tuning and model adaptation. Extensive experimental results show the effectiveness and efficiency of our proposed method.

preprint · 2015 · arXiv

Distributed Interference Management Policies for Heterogeneous Small Cell Networks

We study the problem of interference management in large-scale small cell networks, where each user equipment (UE) needs to determine in a distributed manner when and at what power level it should transmit to its serving small cell base station (SBS) such that a given network performance criterion is maximized subject to minimum quality of service (QoS) requirements by the UEs. We first propose a distributed algorithm for the UE-SBS pairs to find a subset of weakly interfering UE-SBS pairs, namely the maximal independent sets (MISs) of the interference graph in logarithmic time (with respect to the number of UEs). Then we propose a novel problem formulation which enables UE-SBS pairs to determine the optimal fractions of time occupied by each MIS in a distributed manner. We analytically bound the performance of our distributed policy in terms of the competitive ratio with respect to the optimal network performance, which is obtained in a centralized manner with NP (non-deterministic polynomial time) complexity. Remarkably, the competitive ratio is independent of the network size, which guarantees scalability in terms of performance for arbitrarily large networks. Through simulations, we show that our proposed policies achieve significant performance improvements (from 150% to 700%) over the existing policies.
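To make the notion of a maximal independent set (MIS) of an interference graph concrete, here is a minimal sketch. It uses a plain sequential greedy scan, which is an assumption made for illustration only and is not the distributed logarithmic-time algorithm the abstract describes. The function name `maximal_independent_set` and the four-node example graph are hypothetical.

```python
def maximal_independent_set(adjacency):
    """Greedy MIS: keep a node only if none of its neighbors was kept.

    `adjacency` maps each node to the set of nodes it interferes with.
    This is a sequential illustration, not the paper's distributed method.
    """
    chosen = set()
    for node in sorted(adjacency):
        # A node joins the set only if it conflicts with no chosen node.
        if not (adjacency[node] & chosen):
            chosen.add(node)
    return chosen


# Hypothetical interference graph over four UE-SBS pairs arranged in a chain.
graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
mis = maximal_independent_set(graph)
```

Pairs in the returned set do not interfere with one another, and no further pair can be added without creating a conflict, which is what makes the set maximal.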

preprint · 2015 · arXiv

Efficient Interference Management Policies for Femtocell Networks

Managing interference in a network of macrocells underlaid with femtocells presents an important, yet challenging problem. A majority of spatial (frequency/time) reuse based approaches partition the users based on coloring the interference graph, which is shown to be suboptimal. Some spatial time reuse based approaches schedule the maximal independent sets (MISs) in a cyclic, (weighted) round-robin fashion, which is inefficient for delay-sensitive applications. Our proposed policies schedule the MISs in a non-cyclic fashion and aim to optimize any given network performance criterion for delay-sensitive applications while fulfilling the minimum throughput requirements of the users. Importantly, we do not take the interference graph as given, as existing works do; we propose an optimal construction of the interference graph. We prove that under certain conditions, the proposed policy achieves the optimal network performance. For large networks, we propose a low-complexity algorithm for computing the proposed policy. We show that the computed policy achieves a constant competitive ratio (with respect to the optimal network performance), which is independent of the network size, under a wide range of deployment scenarios. The policy can be implemented in a decentralized manner by the users. Compared to the existing policies, our proposed policies can achieve an improvement of up to 130% in large-scale deployments.

preprint · 2014 · arXiv

Energy-Efficient Nonstationary Spectrum Sharing

We develop a novel design framework for energy-efficient spectrum sharing among autonomous users who aim to minimize their energy consumption subject to minimum throughput requirements. Most existing works propose stationary spectrum sharing policies, in which users transmit at fixed power levels. Since users transmit simultaneously under stationary policies, to fulfill minimum throughput requirements, they need to transmit at high power levels to overcome interference. To improve energy efficiency, we construct nonstationary spectrum sharing policies, in which the users transmit at time-varying power levels. Specifically, we focus on TDMA (time-division multiple access) policies in which one user transmits at each time (but not in a round-robin fashion). The proposed policy can be implemented by each user running a low-complexity algorithm in a decentralized manner. It achieves high energy efficiency even when the users have erroneous and binary feedback about their interference levels. Moreover, it can adapt to the dynamic entry and exit of users. The proposed policy is also deviation-proof, namely autonomous users will find it in their self-interest to follow it. Compared to existing policies, the proposed policy can achieve an energy saving of up to 90% when the number of users is high.

preprint · 2014 · arXiv

Foresighted Demand Side Management

We consider a smart grid with an independent system operator (ISO) and distributed aggregators who have energy storage and purchase energy from the ISO to serve their customers. All the entities in the system are foresighted: each aggregator seeks to minimize its own long-term payments for energy purchase and operational costs of energy storage by deciding how much energy to buy from the ISO, and the ISO seeks to minimize the long-term total cost of the system (e.g. energy generation costs and the aggregators' costs) by dispatching the energy production among the generators. The decision making of the entities is complicated for two reasons. First, the information is decentralized: the ISO does not know the aggregators' states (i.e. their energy consumption requests from customers and the amount of energy in their storage), and each aggregator does not know the other aggregators' states or the ISO's state (i.e. the energy generation costs and the status of the transmission lines). Second, the coupling among the aggregators is unknown to them. Specifically, each aggregator's energy purchase affects the price, and hence the payments of the other aggregators. However, none of them knows how its decision influences the price because the price is determined by the ISO based on its state. We propose a design framework in which the ISO provides each aggregator with a conjectured future price, and each aggregator distributively minimizes its own long-term cost based on its conjectured price as well as its local information. The proposed framework can achieve the social optimum despite being decentralized and involving complex coupling among the various entities.

preprint · 2014 · arXiv

Incentive Design in Peer Review: Rating and Repeated Endogenous Matching

Peer review (e.g., grading assignments in Massive Open Online Courses (MOOCs), academic paper review) is an effective and scalable method to evaluate the products (e.g., assignments, papers) of a large number of agents when the number of dedicated reviewing experts (e.g., teaching assistants, editors) is limited. Peer review poses two key challenges: 1) identifying the reviewers' intrinsic capabilities (i.e., adverse selection) and 2) incentivizing the reviewers to exert high effort (i.e., moral hazard). Some works in mechanism design address pure adverse selection using one-shot matching rules, and pure moral hazard was addressed in repeated games with exogenously given and fixed matching rules. However, in peer review systems exhibiting both adverse selection and moral hazard, one-shot or exogenous matching rules do not link agents' current behavior with future matches and future payoffs, and as we prove, will induce myopic behavior (i.e., exerting the lowest effort) resulting in the lowest review quality. In this paper, we propose for the first time a solution that simultaneously solves adverse selection and moral hazard. Our solution exploits the repeated interactions of agents, utilizes ratings to summarize agents' past review quality, and designs matching rules that endogenously depend on agents' ratings. Our proposed matching rules are easy to implement and require no knowledge about agents' private information (e.g., their benefit and cost functions). Yet, they are effective in guiding the system to an equilibrium where the agents are incentivized to exert high effort and receive ratings that precisely reflect their review quality. Using several illustrative examples, we quantify the significant performance gains obtained by our proposed mechanism as compared to existing one-shot or exogenous matching rules.

preprint · 2014 · arXiv

Non-stationary Resource Allocation Policies for Delay-constrained Video Streaming: Application to Video over Internet-of-Things-enabled Networks

Due to the high bandwidth requirements and stringent delay constraints of multi-user wireless video transmission applications, ensuring that all video senders have sufficient transmission opportunities to use before their delay deadlines expire is a longstanding research problem. We propose a novel solution that addresses this problem without assuming detailed packet-level knowledge, which is unavailable at resource allocation time. Instead, we translate the transmission delay deadlines of each sender's video packets into a monotonically-decreasing weight distribution within the considered time horizon. Higher weights are assigned to the slots that have higher probability for deadline-abiding delivery. Given the sets of weights of the senders' video streams, we propose the low-complexity Delay-Aware Resource Allocation (DARA) approach to compute the optimal slot allocation policy that maximizes the deadline-abiding delivery of all senders. A unique characteristic of the DARA approach is that it yields a non-stationary slot allocation policy that depends on the allocation of previous slots. We prove that the DARA approach is optimal for weight distributions that are exponentially decreasing in time. We further implement our framework for real-time video streaming in wireless personal area networks that are gaining significant traction within the new Internet-of-Things (IoT) paradigm. For multiple surveillance videos encoded with H.264/AVC and streamed via the 6tisch framework that simulates the IoT-oriented IEEE 802.15.4e TSCH medium access control, our solution is shown to be the only one that ensures all video bitstreams are delivered with acceptable quality in a deadline-abiding manner.

preprint · 2013 · arXiv

Demand Side Management in Smart Grids using a Repeated Game Framework

Demand side management (DSM) is a key solution for reducing the peak-time power consumption in smart grids. To provide incentives for consumers to shift their consumption to off-peak times, the utility company charges consumers differential pricing for using power at different times of the day. Consumers take into account these differential prices when deciding when and how much power to consume daily. Importantly, while consumers enjoy lower billing costs when shifting their power usage to off-peak times, they also incur discomfort costs due to the altering of their power consumption patterns. Existing works propose stationary strategies for the myopic consumers to minimize their short-term billing and discomfort costs. In contrast, we model the interaction emerging among self-interested, foresighted consumers as a repeated energy scheduling game and prove that the stationary strategies are suboptimal in terms of long-term total billing and discomfort costs. Subsequently, we propose a novel framework for determining optimal nonstationary DSM strategies, in which consumers can choose different daily power consumption patterns depending on their preferences, routines, and needs. As a direct consequence of the nonstationary DSM policy, different subsets of consumers are allowed to use power in peak times at a low price. The subset of consumers that are selected daily to have their joint discomfort and billing costs minimized is determined based on the consumers' power consumption preferences as well as on the past history of which consumers have shifted their usage previously. Importantly, we show that the proposed strategies are incentive-compatible. Simulations confirm that, given the same peak-to-average ratio, the proposed strategy can reduce the total cost (billing and discomfort costs) by up to 50% compared to existing DSM strategies.

preprint · 2013 · arXiv

Designing Efficient Resource Sharing For Impatient Players Using Limited Monitoring

The problem of efficient sharing of a resource is nearly ubiquitous. Except for pure public goods, each agent's use creates a negative externality; often the negative externality is so strong that efficient sharing is impossible in the short run. We show that, paradoxically, the impossibility of efficient sharing in the short run enhances the possibility of efficient sharing in the long run, even if outcomes depend stochastically on actions, monitoring is limited and users are not patient. We base our analysis on the familiar framework of repeated games with imperfect public monitoring, but we extend the framework to view the monitoring structure as chosen by a designer who balances the benefits and costs of more accurate observations and reports. Our conclusions are much stronger than in the usual folk theorems: we do not require a rich signal structure or patient users and provide an explicit online construction of equilibrium strategies.

preprint · 2013 · arXiv

Incentive Design for Direct Load Control Programs

We study the problem of optimal incentive design for voluntary participation of electricity customers in a Direct Load Scheduling (DLS) program, a new form of Direct Load Control (DLC) based on a three-way communication protocol between customers, embedded controls in flexible appliances, and the central entity in charge of the program. Participation decisions are made in real time on an event-driven basis, with every customer that needs to use a flexible appliance considering whether to join the program given current incentives. Customers perceive different levels of risk in handing control of their devices' consumption schedules to an operator, and these risk levels are privately known. The operator maximizes the expected profit of operating the DLS program by posting the right participation incentives for different appliance types in a publicly available and dynamically updated table. Customers then face the dynamic decision-making problem of whether or not to take the incentives and participate. We define an optimization framework to determine the profit-maximizing incentives for the operator. In doing so, we also investigate the utility that the operator expects to gain from recruiting different types of devices. These utilities also provide an upper bound on the benefits that can be attained from any type of demand response program.

preprint · 2013 · arXiv

Optimal Distributed Resource Allocation for Decode-and-Forward Relay Networks

This paper presents a distributed resource allocation algorithm to jointly optimize the power allocation, channel allocation and relay selection for decode-and-forward (DF) relay networks with a large number of sources, relays, and destinations. The well-known dual decomposition technique cannot directly be applied to resolve this problem, because the achievable data rate of DF relaying is not strictly concave, and thus the local resource allocation subproblem may have non-unique solutions. We resolve this non-strict concavity problem by using the idea of the proximal point method, which adds quadratic terms to make the objective function strictly concave. However, the proximal solution adds an extra layer of iterations over typical duality based approaches, which can significantly slow down the speed of convergence. To address this key weakness, we devise a fast algorithm without the need for this additional layer of iterations, which converges to the optimal solution. Our algorithm only needs local information exchange, and can easily adapt to variations of network size and topology. We prove that our distributed resource allocation algorithm converges to the optimal solution. A channel resource adjustment method is further developed to provide more channel resources to the bottleneck links and realize traffic load balance. Numerical results are provided to illustrate the benefits of our algorithm.
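The proximal point idea mentioned in the abstract can be illustrated on a toy problem. The sketch below is an assumption-laden example, not the paper's algorithm: it maximizes the concave but not strictly concave function f(x) = min(x, 1), whose maximizers form the whole interval x >= 1. Each iteration maximizes f(x) - (x - x_k)^2 / (2c), which is strictly concave and therefore has a unique solution, here available in closed form. The names `proximal_step` and `c` are hypothetical.

```python
def proximal_step(x_k, c):
    """Unique maximizer of min(x, 1) - (x - x_k)**2 / (2 * c).

    Below the kink at x = 1 the slope of min(x, 1) is 1, so the step is c,
    capped so that the iterate does not overshoot the kink. At or above the
    kink the slope is 0 and the iterate stays where it is.
    """
    return x_k + min(c, max(0.0, 1.0 - x_k))


x = 0.0   # hypothetical starting point
c = 0.5   # hypothetical proximal parameter
trajectory = [x]
for _ in range(5):
    x = proximal_step(x, c)
    trajectory.append(x)
```

The iterates move toward the set of maximizers and then stop changing, even though the original objective alone does not single out one maximizer.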

preprint · 2013 · arXiv

Optimal Foresighted Multi-User Wireless Video

Recent years have seen an explosion in wireless video communication systems. Optimization in such systems is crucial, but most existing methods intended to optimize the performance of multi-user wireless video transmission are inefficient. Some works (e.g. Network Utility Maximization (NUM)) are myopic: they choose actions to maximize instantaneous video quality while ignoring the future impact of these actions. Such myopic solutions are known to be inferior to foresighted solutions that optimize the long-term video quality. Alternatively, foresighted solutions such as rate-distortion optimized packet scheduling focus on single-user wireless video transmission, while ignoring the resource allocation among the users. In this paper, we propose an optimal solution for performing joint foresighted resource allocation and packet scheduling among multiple users transmitting video over a shared wireless network. A key challenge in developing foresighted solutions for multiple video users is that the users' decisions are coupled. To decouple the users' decisions, we adopt a novel dual decomposition approach, which differs from the conventional optimization solutions such as NUM, and determines foresighted policies. Specifically, we propose an informationally-decentralized algorithm in which the network manager updates resource "prices" (i.e. the dual variables associated with the resource constraints), and the users make individual video packet scheduling decisions based on these prices. Because a priori knowledge of the system dynamics is almost never available at run-time, the proposed solution can learn online, concurrently with performing the foresighted optimization. Simulation results show 7 dB and 3 dB improvements in Peak Signal-to-Noise Ratio (PSNR) over myopic solutions and existing foresighted solutions, respectively.
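The price-update mechanism can be sketched with a textbook, static dual decomposition example. This is only an illustration of how a manager's price decouples the users; it is the conventional myopic setting, not the foresighted algorithm proposed in the paper. The logarithmic utilities, the weights, the capacity, and the function names are all assumptions.

```python
def user_demand(weight, price):
    # Each user independently maximizes weight * log(x) - price * x,
    # which gives the closed-form demand x = weight / price.
    return weight / price


def price_iteration(weights, capacity, price=2.0, step=0.1, rounds=200):
    """Manager raises the price when demand exceeds capacity, lowers it otherwise."""
    for _ in range(rounds):
        demands = [user_demand(w, price) for w in weights]
        excess = sum(demands) - capacity
        # Subgradient step on the dual variable, kept strictly positive.
        price = max(1e-6, price + step * excess)
    return price, [user_demand(w, price) for w in weights]


# Hypothetical instance: two users with weights 1 and 3 sharing 4 resource units.
price, allocation = price_iteration([1.0, 3.0], capacity=4.0)
```

In this toy instance the price settles where total demand equals capacity, and each user only ever needs to know the current price and its own weight.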

preprint · 2013 · arXiv

Socially-Optimal Design of Service Exchange Platforms with Imperfect Monitoring

In service exchange platforms, anonymous users exchange services with each other: clients request services and are matched to servers who provide services. Because providing good-quality services requires effort, in any single interaction a server will have no incentive to exert effort and will shirk. We show that if current servers will later become clients and want good-quality services, shirking can be eliminated by rating protocols, which maintain ratings for each user, prescribe behavior in each client-server interaction, and update ratings based on whether observed/reported behavior conforms with prescribed behavior. The rating protocols proposed are the first to achieve social optimum even when observation/reporting is imperfect (quality is incorrectly assessed/reported or reports are lost). The proposed protocols are remarkably simple, requiring only binary ratings and three possible prescribed behaviors. Key to the efficacy of the proposed protocols is that they are nonstationary, and tailor prescriptions to both current and past rating distributions.

preprint · 2012 · arXiv

Designing Information Revelation and Intervention with an Application to Flow Control

There are many familiar situations in which a manager seeks to design a system in which users share a resource, but outcomes depend on the information held and actions taken by users. If communication is possible, the manager can ask users to report their private information and then, using this information, instruct them on what actions they should take. If the users are compliant, this reduces the manager's optimization problem to a well-studied problem of optimal control. However, if the users are self-interested and not compliant, the problem is much more complicated: when asked to report their private information, the users might lie; upon receiving instructions, the users might disobey. Here we ask whether the manager can design the system to get around both of these difficulties. To do so, the manager must provide for the users the incentives to report truthfully and to follow the instructions, despite the fact that the users are self-interested. For a class of environments that includes many resource allocation games in communication networks, we provide tools for the manager to design an efficient system. In addition to reports and recommendations, the design we employ allows the manager to intervene in the system after the users take actions. In an abstracted environment, we find conditions under which the manager can achieve the same outcome it could if users were compliant, and conditions under which it does not. We then apply our framework and results to design a flow control management system.

preprint · 2012 · arXiv

Dynamic Spectrum Sharing Among Repeatedly Interacting Selfish Users With Imperfect Monitoring

We develop a novel design framework for dynamic distributed spectrum sharing among secondary users (SUs) who adjust their power levels to compete for spectrum opportunities while satisfying the interference temperature (IT) constraints imposed by primary users. The considered interaction among the SUs is characterized by the following three features. First, since the SUs are decentralized, they are selfish and aim to maximize their own long-term payoffs from utilizing the network rather than obeying the prescribed allocation of a centralized controller. Second, the SUs interact with each other repeatedly and they can coexist in the system for a long time. Third, the SUs have limited and imperfect monitoring ability: they only observe whether the IT constraints are violated, and their observation is imperfect due to the erroneous measurements. To capture these features, we model the interaction of the SUs as a repeated game with imperfect monitoring. We first characterize the set of Pareto optimal payoffs that can be achieved by deviation-proof spectrum sharing policies, which are policies that the selfish users find it in their interest to comply with. Next, for any given payoff in this set, we show how to construct a deviation-proof policy to achieve it. The constructed deviation-proof policy is amenable to distributed implementation, and allows users to transmit in a time-division multiple-access (TDMA) fashion. In the presence of strong multi-user interference, our policy outperforms existing spectrum sharing policies that dictate users to transmit at constant power levels simultaneously. Moreover, our policy can achieve Pareto optimality even when the SUs have limited and imperfect monitoring ability, as opposed to existing solutions based on repeated games, which require perfect monitoring abilities.

preprint · 2012 · arXiv

Pricing and Intervention in Slotted-Aloha: Technical Report

In many wireless communication networks a common channel is shared by multiple users who must compete to gain access to it. The operation of the network by self-interested and strategic users usually leads to the overuse of the channel resources and to substantial inefficiencies. Hence, incentive schemes are needed to overcome the inefficiencies of non-cooperative equilibrium. In this work we consider a slotted-Aloha-like random access protocol and two incentive schemes: pricing and intervention. We provide criteria for the protocol designer to choose between the two schemes and to design the best policy for the selected scheme, depending on the system parameters. Our results show that intervention can achieve the maximum efficiency in the perfect monitoring scenario. In the imperfect monitoring scenario, by contrast, the performance of the system depends on the information held by the different entities. In some cases there exists a threshold on the number of users: below the threshold, intervention outperforms pricing, whereas above the threshold, pricing outperforms intervention.

preprint · 2012 · arXiv

Technology Choices and Pricing Policies in Public and Private Wireless Networks

This paper studies the provision of a wireless network by a monopolistic provider who may be either benevolent (seeking to maximize social welfare) or selfish (seeking to maximize provider profit). The paper addresses questions that do not seem to have been studied before in the engineering literature on wireless networks: Under what circumstances is it feasible for a provider, either benevolent or selfish, to operate a network in such a way as to cover costs? How is the optimal behavior of a benevolent provider different from the optimal behavior of a selfish provider, and how does this difference affect social welfare? And, most importantly, how does the medium access control (MAC) technology influence the answers to these questions? To address these questions, we build a general model, and provide analysis and simulations for simplified but typical scenarios; the focus in these scenarios is on the contrast between the outcomes obtained under carrier-sensing multiple access (CSMA) and outcomes obtained under time-division multiple access (TDMA). Simulation results demonstrate that differences in MAC technology can have a significant effect on social welfare, on provider profit, and even on the (financial) feasibility of a wireless network.

preprint · 2011 · arXiv

Intervention in Power Control Games With Selfish Users

We study the power control problem in wireless ad hoc networks with selfish users. Without incentive schemes, selfish users tend to transmit at their maximum power levels, causing significant interference to each other. In this paper, we study a class of incentive schemes based on intervention to induce selfish users to transmit at desired power levels. An intervention scheme can be implemented by introducing an intervention device that can monitor the power levels of users and then transmit power to cause interference to users. We mainly consider first-order intervention rules based on individual transmit powers. We derive conditions on design parameters and the intervention capability to achieve a desired outcome as a (unique) Nash equilibrium and propose a dynamic adjustment process that the designer can use to guide users and the intervention device to the desired outcome. The effect of using intervention rules based on aggregate receive power is also analyzed. Our results show that, with perfect monitoring, intervention schemes can be designed to achieve any positive power profile while using interference from the intervention device only as a threat. We also analyze the case of imperfect monitoring and show that a performance loss can occur. Lastly, simulation results are presented to illustrate the performance improvement from using intervention rules and to compare the performance of different intervention rules.

preprint · 2011 · arXiv

Repeated Games With Intervention: Theory and Applications in Communications

In communication systems where users share common resources, users' selfish behavior usually results in suboptimal resource utilization. There have been extensive works that model communication systems with selfish users as one-shot games and propose incentive schemes to achieve Pareto optimal action profiles as non-cooperative equilibria. However, in many communication systems, due to strong negative externalities among users, the sets of feasible payoffs in one-shot games are nonconvex. Thus, it is possible to expand the set of feasible payoffs by having users choose convex combinations of different payoffs. In this paper, we propose a repeated game model generalized by intervention. First, we use repeated games to convexify the set of feasible payoffs in one-shot games. Second, we combine conventional repeated games with intervention, originally proposed for one-shot games, to achieve a larger set of equilibrium payoffs and loosen requirements for users' patience to achieve it. We study the problem of maximizing a welfare function defined on users' equilibrium payoffs, subject to minimum payoff guarantees. Given the optimal equilibrium payoff, we derive the minimum intervention capability required and design corresponding equilibrium strategies. The proposed generalized repeated game model applies to various communication systems, such as power control and flow control.