Source author record

Moshe Tennenholtz

Moshe Tennenholtz appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Science and Game Theory; Artificial Intelligence; Machine Learning; Computation and Language; Information Retrieval; Cryptography and Security; econ.TH; Social and Information Networks; cs.CY; Data Structures and Algorithms; Human-Computer Interaction; math.CO; Multiagent Systems; Networking and Internet Architecture

Catalog footprint

What is connected

40 works

14 topics

4 close collaborators

preprint · 2026 · arXiv

Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling

AI agents negotiate and transact in natural language with unfamiliar counterparts: a buyer bot facing an unknown seller, or a procurement assistant negotiating with a supplier. In such interactions, the counterpart's LLM, prompts, control logic, and rule-based fallbacks are hidden, while each decision can have monetary consequences. We ask whether an agent can predict an unfamiliar counterpart's next decision from a few interactions. To avoid real-world logging confounds, we study this problem in controlled bargaining and negotiation games, formulating it as target-adaptive text-tabular prediction: each decision point is a table row combining structured game state, offer history, and dialogue, while $K$ previous games of the same target agent, i.e., the counterpart being modeled, are provided in the prompt as labeled adaptation examples. Our model is built on a tabular foundation model that represents rows using game-state features and LLM-based text representations, and adds LLM-as-Observer as an additional representation: a small frozen LLM reads the decision-time state and dialogue; its answer is discarded, and its hidden state becomes a decision-oriented feature, making the LLM an encoder rather than a direct few-shot predictor. Training on 13 frontier-LLM agents and testing on 91 held-out scaffolded agents, the full model outperforms direct LLM-as-Predictor prompting and game+text features baselines. Within this tabular model, Observer features contribute beyond the other feature schemes: at $K=16$, they improve response-prediction AUC by about 4 points across both tasks and reduce bargaining offer-prediction error by 14%. These results show that formulating counterpart prediction as a target-adaptive text-tabular task enables effective adaptation, and that hidden LLM representations expose decision-relevant signals that direct prompting does not surface.

preprint · 2022 · arXiv

Budget-Constrained Reinforcement of Ranked Objects

Commercial entries, such as hotels, are ranked according to score by a search engine or recommendation system, and the score of each can be improved upon by making a targeted investment, e.g., advertising. We study the problem of how a principal, who owns or supports a set of entries, can optimally allocate a budget to maximize their ranking. Representing the set of ranked scores as a probability distribution over scores, we treat this question as a game between distributions. We show that, in the general case, the best ranking is achieved by equalizing the scores of several disjoint score ranges. We show that there is a unique optimal reinforcement strategy, and provide an efficient algorithm implementing it.

preprint · 2022 · arXiv

Long-term Data Sharing under Exclusivity Attacks

The quality of learning generally improves with the scale and diversity of data. Companies and institutions can therefore benefit from building models over shared data. Many cloud and blockchain platforms, as well as government initiatives, are interested in providing this type of service. These cooperative efforts face a challenge, which we call "exclusivity attacks". A firm can share distorted data, so that it learns the best model fit, but is also able to mislead others. We study protocols for long-term interactions and their vulnerability to these attacks, in particular for regression and clustering tasks. We conclude that the choice of protocol, as well as the number of Sybil identities an attacker may control, is material to vulnerability.

preprint · 2022 · arXiv

Pareto-Improving Data-Sharing

We study the effects of data sharing between firms on prices, profits, and consumer welfare. Although indiscriminate sharing of consumer data decreases firm profits due to the subsequent increase in competition, selective sharing can be beneficial. We show that there are data-sharing mechanisms that are strictly Pareto-improving, simultaneously increasing firm profits and consumer welfare. Within the class of Pareto-improving mechanisms, we identify one that maximizes firm profits and one that maximizes consumer welfare.

preprint · 2022 · arXiv

Predicting Decisions in Language Based Persuasion Games

Sender-receiver interactions, and specifically persuasion games, are widely researched in economic modeling and artificial intelligence. However, in the classic persuasion games setting, the messages sent from the expert to the decision-maker (DM) are abstract or well-structured signals rather than natural language messages. This paper addresses the use of natural language in persuasion games. For this purpose, we conduct an online repeated interaction experiment. At each trial of the interaction, an informed expert aims to sell an uninformed decision-maker a vacation in a hotel, by sending her a review that describes the hotel. While the expert is exposed to several scored reviews, the decision-maker observes only the single review sent by the expert, and her payoff in case she chooses to take the hotel is a random draw from the review score distribution available to the expert only. We also compare the behavioral patterns in this experiment to the equivalent patterns in similar experiments where the communication is based on the numerical values of the reviews rather than the reviews' text, and observe substantial differences which can be explained through an equilibrium analysis of the game. We consider a number of modeling approaches for our verbal communication setup, differing from each other in the model type (deep neural network vs. linear classifier), the type of features used by the model (textual, behavioral or both) and the source of the textual features (DNN-based vs. hand-crafted). Our results demonstrate that given a prefix of the interaction sequence, our models can predict the future decisions of the decision-maker, particularly when a sequential modeling approach and hand-crafted textual features are applied. Further analysis of the hand-crafted textual features allows us to make initial observations about the aspects of text that drive decision making in our setup.

preprint · 2021 · arXiv

Designing an Automatic Agent for Repeated Language based Persuasion Games

Persuasion games are fundamental in economics and AI research and serve as the basis for important applications. However, work on this setup assumes communication with stylized messages that do not consist of rich human language. In this paper we consider a repeated sender (expert) -- receiver (decision maker) game, where the sender is fully informed about the state of the world and aims to persuade the receiver to accept a deal by sending one of several possible natural language reviews. We design an automatic expert that plays this repeated game, aiming to achieve the maximal payoff. Our expert is implemented within the Monte Carlo Tree Search (MCTS) algorithm, with deep learning models that exploit behavioral and linguistic signals in order to predict the next action of the decision maker, and the future payoff of the expert given the state of the game and a candidate review. We demonstrate the superiority of our expert over strong baselines, its adaptability to different decision makers, and that its selected reviews are nicely adapted to the proposed deal.

preprint · 2021 · arXiv

Protecting the Protected Group: Circumventing Harmful Fairness

Machine Learning (ML) algorithms shape our lives. Banks use them to determine if we are good borrowers; IT companies delegate them recruitment decisions; police apply ML for crime-prediction, and judges base their verdicts on ML. However, real-world examples show that such automated decisions tend to discriminate against protected groups. This potential discrimination generated huge hype both in the media and in the research community. Quite a few formal notions of fairness were proposed, which take the form of constraints a "fair" algorithm must satisfy. We focus on scenarios where fairness is imposed on a self-interested party (e.g., a bank that maximizes its revenue). We find that the disadvantaged protected group can be worse off after imposing a fairness constraint. We introduce a family of Welfare-Equalizing fairness constraints that equalize per-capita welfare of protected groups, and include Demographic Parity and Equal Opportunity as particular cases. In this family, we characterize conditions under which the fairness constraint helps the disadvantaged group. We also characterize the structure of the optimal Welfare-Equalizing classifier for the self-interested party, and provide an algorithm to compute it. Overall, our Welfare-Equalizing fairness approach provides a unified framework for discussing fairness in classification in the presence of a self-interested party.

preprint · 2020 · arXiv

Fiduciary Bandits

Recommendation systems often face exploration-exploitation tradeoffs: the system can only learn about the desirability of new options by recommending them to some user. Such systems can thus be modeled as multi-armed bandit settings; however, users are self-interested and cannot be made to follow recommendations. We ask whether exploration can nevertheless be performed in a way that scrupulously respects agents' interests, i.e., by a system that acts as a fiduciary. More formally, we introduce a model in which a recommendation system faces an exploration-exploitation tradeoff under the constraint that it can never recommend any action that it knows yields lower reward in expectation than an agent would achieve if it acted alone. Our main contribution is a positive result: an asymptotically optimal, incentive compatible, and ex-ante individually rational recommendation algorithm.

preprint · 2020 · arXiv

Incentive-Compatible Selection Mechanisms for Forests

Given a directed forest-graph, a probabilistic selection mechanism is a probability distribution over the vertex set. A selection mechanism is incentive-compatible (IC), if the probability assigned to a vertex does not change when we alter its outgoing edge (or even remove it). The quality of a selection mechanism is the worst-case ratio between the expected progeny under the mechanism's distribution and the maximal progeny in the forest. In this paper we prove an upper bound of 4/5 and a lower bound of $1/\ln 16 \approx 0.36$ for the quality of any IC selection mechanism. The lower bound is achieved by two novel mechanisms and is a significant improvement to the results of Babichenko et al. (WWW '18). The first, simpler mechanism, has the nice feature of generating distributions which are fair (i.e., monotone and proportional). The downside of this mechanism is that it is not exact (i.e., the probabilities might sum up to less than 1). Our second, more involved mechanism, is exact but not fair. We also prove an impossibility for an IC mechanism that is both exact and fair and has a positive quality.

preprint · 2020 · arXiv

Learning under Invariable Bayesian Safety

A recent body of work addresses safety constraints in explore-and-exploit systems. Such constraints arise where, for example, exploration is carried out by individuals whose welfare should be balanced with overall welfare. In this paper, we adopt a model inspired by recent work on a bandit-like setting for recommendations. We contribute to this line of literature by introducing a safety constraint that should be respected in every round and determines that the expected value in each round is above a given threshold. Due to our modeling, the safe explore-and-exploit policy deserves careful planning, or otherwise, it will lead to sub-optimal welfare. We devise an asymptotically optimal algorithm for the setting and analyze its instance-dependent convergence rate.

preprint · 2020 · arXiv

Multi-Issue Social Learning

We consider social learning where agents can only observe part of the population (modeled as neighbors on an undirected graph), face many decision problems, and arrival order of the agents is unknown. The central question we pose is whether there is a natural observability graph that prevents the information cascade phenomenon. We introduce the "celebrities graph" and prove that indeed it allows for proper information aggregation in large populations even when the order in which agents decide is random and even when different issues are decided in different orders.

preprint · 2020 · arXiv

Predicting Strategic Behavior from Free Text

The connection between messaging and action is fundamental both to web applications, such as web search and sentiment analysis, and to economics. However, while prominent online applications exploit messaging in natural (human) language in order to predict non-strategic action selection, the economics literature focuses on the connection between structured stylized messaging to strategic decisions in games and multi-agent encounters. This paper aims to connect these two strands of research, which we consider highly timely and important due to the vast online textual communication on the web. Particularly, we introduce the following question: can free text expressed in natural language serve for the prediction of action selection in an economic context, modeled as a game? In order to initiate the research on this question, we introduce the study of an individual's action prediction in a one-shot game based on free text he/she provides, while being unaware of the game to be played. We approach the problem by attributing commonsensical personality attributes via crowd-sourcing to free texts written by individuals, and employing transductive learning to predict actions taken by these individuals in one-shot games based on these attributes. Our approach allows us to train a single classifier that can make predictions with respect to actions taken in multiple games. In experiments with three well-studied games, our algorithm compares favorably with strong alternative approaches. In ablation analysis, we demonstrate the importance of our modeling choices (the representation of the text with the commonsensical personality attributes and our classifier) to the predictive power of our model.

preprint · 2020 · arXiv

Privacy, Altruism, and Experience: Estimating the Perceived Value of Internet Data for Medical Uses

People increasingly turn to the Internet when they have a medical condition. The data they create during this process is a valuable source for medical research and for future health services. However, utilizing these data could come at a cost to user privacy. Thus, it is important to balance the perceived value that users assign to these data with the value of the services derived from them. Here we describe experiments where methods from Mechanism Design were used to elicit a truthful valuation from users for their Internet data and for services to screen people for medical conditions. In these experiments, 880 people from around the world were asked to participate in an auction to provide their data for uses differing in their contribution to the participant, to society, and in the disease they addressed. Some users were offered monetary compensation for their participation, while others were asked to pay to participate. Our findings show that 99% of people were willing to contribute their data in exchange for monetary compensation and an analysis of their data, while 53% were willing to pay to have their data analyzed. The average perceived value users assigned to their data was estimated at US$49. Their value to screen them for a specific cancer was US$22 while the value of this service offered to the general public was US$22. Participants requested higher compensation when notified that their data would be used to analyze a more severe condition. They were willing to pay more to have their data analyzed when the condition was more severe, when they had higher education or if they had recently experienced a serious medical condition.

preprint · 2020 · arXiv

Ranking-Incentivized Quality Preserving Content Modification

The Web is a canonical example of a competitive retrieval setting where many documents' authors consistently modify their documents to promote them in rankings. We present an automatic method for quality-preserving modification of document content -- i.e., maintaining content quality -- so that the document is ranked higher for a query by a non-disclosed ranking function whose rankings can be observed. The method replaces a passage in the document with some other passage. To select the two passages, we use a learning-to-rank approach with a bi-objective optimization criterion: rank promotion and content-quality maintenance. We used the approach as a bot in content-based ranking competitions. Analysis of the competitions demonstrates the merits of our approach with respect to human content modifications in terms of rank promotion, content-quality maintenance and relevance.

preprint · 2020 · arXiv

Representative Committees of Peers

A population of voters must elect representatives among themselves to decide on a sequence of possibly unforeseen binary issues. Voters care only about the final decision, not the elected representatives. The disutility of a voter is proportional to the fraction of issues on which his preferences disagree with the decision. While an issue-by-issue vote by all voters would maximize social welfare, we are interested in how well the preferences of the population can be approximated by a small committee. We show that a $k$-sortition (a random committee of $k$ voters with the majority vote within the committee) leads to an outcome within the factor $1+O(1/k)$ of the optimal social cost for any number of voters $n$, any number of issues $m$, and any preference profile. For a small number of issues $m$, the social cost can be made even closer to optimal by delegation procedures that weigh committee members according to their number of followers. However, for large $m$, we demonstrate that the $k$-sortition is the worst-case optimal rule within a broad family of committee-based rules that take into account metric information about the preference profile of the whole population.
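The k-sortition rule described in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's code: the function names, the 0/1 encoding of preferences, the tie-breaking toward 1, and the normalization of social cost as an average disagreement fraction are all assumptions made for illustration.

```python
import random

def majority(votes):
    """Majority of 0/1 votes; ties go to 1 (an illustrative assumption)."""
    return 1 if 2 * sum(votes) >= len(votes) else 0

def direct_vote(profile):
    """Issue-by-issue majority over every voter in the given profile."""
    issues = len(profile[0])
    return [majority([prefs[j] for prefs in profile]) for j in range(issues)]

def k_sortition(profile, k, rng):
    """Draw k voters uniformly at random and let them decide by majority."""
    committee = rng.sample(profile, k)
    return direct_vote(committee)

def social_cost(decisions, profile):
    """Average fraction of issues on which a voter disagrees with the outcome."""
    disagreements = sum(
        1 for prefs in profile for d, p in zip(decisions, prefs) if d != p
    )
    return disagreements / (len(profile) * len(decisions))

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical random profile: 101 voters, 5 binary issues.
    profile = [[rng.randint(0, 1) for _ in range(5)] for _ in range(101)]
    optimal = social_cost(direct_vote(profile), profile)
    for k in (1, 3, 9, 27):
        cost = social_cost(k_sortition(profile, k, rng), profile)
        print(k, round(cost / optimal, 3))
```

Averaging the printed ratio over many random committees gives an empirical view of how the committee outcome approaches the direct-vote optimum as k grows; a single draw is noisy.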

preprint · 2020 · arXiv

Studying Ranking-Incentivized Web Dynamics

The ranking incentives of many authors of Web pages play an important role in the Web dynamics. That is, authors who opt to have their pages highly ranked for queries of interest often respond to rankings for these queries by manipulating their pages; the goal is to improve the pages' future rankings. Various theoretical aspects of this dynamics have recently been studied using game theory. However, empirical analysis of the dynamics is highly constrained due to lack of publicly available datasets. We present an initial such dataset that is based on TREC's ClueWeb09 dataset. Specifically, we used the WayBack Machine of the Internet Archive to build a document collection that contains past snapshots of ClueWeb documents which are highly ranked by some initial search performed for ClueWeb queries. Temporal analysis of document changes in this dataset reveals that findings recently presented for small-scale controlled ranking competitions between documents' authors also hold for Web data. Specifically, documents' authors tend to mimic the content of documents that were highly ranked in the past, and this practice can result in improved ranking.

preprint · 2016 · arXiv

A Hydraulic Approach to Equilibria of Resource Selection Games

Drawing intuition from a (physical) hydraulic system, we present a novel framework, constructively showing the existence of a strong Nash equilibrium in resource selection games (i.e., asymmetric singleton congestion games) with nonatomic players, the coincidence of strong equilibria and Nash equilibria in such games, and the uniqueness of the cost of each given resource across all Nash equilibria. Our proofs allow for explicit calculation of Nash equilibrium and for explicit and direct calculation of the resulting (unique) costs of resources, and do not hinge on any fixed-point theorem, on the Minimax theorem or any equivalent result, on linear programming, or on the existence of a potential (though our analysis does provide powerful insights into the potential, via a natural concrete physical interpretation). A generalization of resource selection games, called resource selection games with I.D.-dependent weighting, is defined, and the results are extended to this family, showing the existence of strong equilibria, and showing that while resource costs are no longer unique across Nash equilibria in games of this family, they are nonetheless unique across all strong Nash equilibria, drawing a novel fundamental connection between group deviation and I.D.-congestion. A natural application of the resulting machinery to a large class of constraint-satisfaction problems is also described.
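The hydraulic picture in this abstract can be made concrete in a simplified special case. The sketch below is not the paper's construction: it assumes a symmetric game (every player may use every resource) with linear increasing costs c_r(x) = a_r * x + b_r, and finds the common "water level" by bisection. The function name and parameters are hypothetical.

```python
def equilibrium_level(slopes, offsets, mass, tol=1e-12):
    """Find the cost level at which the loads sum to the total player mass.

    Assumes cost of resource r at load x is slopes[r] * x + offsets[r],
    with every slope strictly positive, and that all players may use
    all resources (a simplifying assumption).
    """
    def load_at(level):
        # Load each resource carries if its cost is raised to `level`.
        return sum(max(0.0, (level - b) / a) for a, b in zip(slopes, offsets))

    lo = min(offsets)                       # total load is 0 at this level
    hi = max(offsets) + mass * max(slopes)  # total load is at least `mass` here
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if load_at(mid) < mass:
            lo = mid
        else:
            hi = mid
    level = (lo + hi) / 2
    loads = [max(0.0, (level - b) / a) for a, b in zip(slopes, offsets)]
    return level, loads

if __name__ == "__main__":
    level, loads = equilibrium_level([1.0, 1.0], [0.0, 1.0], mass=3.0)
    print(level, loads)
```

In this special case every used resource ends up at the same cost and every unused resource has an offset at or above that cost, so no player gains by switching. This mirrors the uniqueness of resource costs across equilibria stated in the abstract, although the general asymmetric setting treated in the paper is not covered here.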

preprint · 2016 · arXiv

An Axiomatic Approach to Routing

Information delivery in a network of agents is a key issue for large, complex systems that need to do so in a predictable, efficient manner. The delivery of information in such multi-agent systems is typically implemented through routing protocols that determine how information flows through the network. Different routing protocols exist, each with its own benefits, but it is generally unclear which properties can be successfully combined within a given algorithm. We approach this problem from the axiomatic point of view, i.e., we try to establish what are the properties we would seek to see in such a system, and examine the different properties which uniquely define common routing algorithms used today. We examine several desirable properties, such as robustness, which ensures adding nodes and edges does not change the routing in radical, unpredictable ways; and properties that depend on the operating environment, such as an "economic model", where nodes choose their paths based on the cost they are charged to pass information to the next node. We proceed to fully characterize minimal spanning tree, shortest path, and weakest link routing algorithms, showing a tight set of axioms for each.
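Two of the routing rules named in this abstract can be contrasted with one small search routine. This is a hedged sketch rather than the paper's formalism: it assumes non-negative edge weights, and it reads "weakest link" routing as choosing the path whose largest edge weight is as small as possible, which is an interpretation and may differ from the paper's definition. The function name and the example graph are hypothetical.

```python
import heapq

def best_path_cost(graph, source, target, combine):
    """Dijkstra-style search with a pluggable rule for extending a path.

    `graph` maps node -> {neighbor: non-negative weight}.
    `combine(cost_so_far, edge_weight)` returns the cost of the longer path
    and must never decrease the cost (true for both sum and max).
    """
    best = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = combine(cost, weight)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return float("inf")

if __name__ == "__main__":
    graph = {"s": {"a": 4, "b": 1}, "a": {"t": 4}, "b": {"t": 6}, "t": {}}
    # Shortest path adds edge weights along the path.
    print(best_path_cost(graph, "s", "t", lambda c, w: c + w))
    # Assumed weakest-link rule: a path is as costly as its worst edge.
    print(best_path_cost(graph, "s", "t", max))
```

On this example graph the two rules pick different routes: the additive rule prefers s-b-t (total 7), while the worst-edge rule prefers s-a-t (worst edge 4), which shows why the rules need separate characterizations.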

preprint · 2016 · arXiv

Dynamics of Evolving Social Groups

Exclusive social groups are ones in which the group members decide whether or not to admit a candidate to the group. Examples of exclusive social groups include academic departments and fraternal organizations. In the present paper we introduce an analytic framework for studying the dynamics of exclusive social groups. In our model, every group member is characterized by his opinion, which is represented as a point on the real line. The group evolves in discrete time steps through a voting process carried out by the group's members. Due to homophily, each member votes for the candidate who is more similar to him (i.e., closer to him on the line). An admission rule is then applied to determine which candidate, if any, is admitted. We consider several natural admission rules including majority and consensus. We ask: how do different admission rules affect the composition of the group in the long term? We study both growing groups (where new members join old ones) and fixed-size groups (where new members replace those who quit). Our analysis reveals intriguing phenomena and phase transitions, some of which are quite counterintuitive.
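The voting process in this abstract lends itself to a short simulation. The sketch below makes several assumptions that the abstract does not state: two candidates arrive per step, their opinions are drawn uniformly from [0, 1], members equidistant from both candidates abstain, and the group only grows. All function names are hypothetical.

```python
import random

def vote_counts(members, a, b):
    """Each member votes for the candidate closer to their own opinion."""
    votes_a = sum(1 for x in members if abs(x - a) < abs(x - b))
    votes_b = sum(1 for x in members if abs(x - b) < abs(x - a))
    return votes_a, votes_b

def majority_rule(members, a, b):
    """Admit the candidate with strictly more votes, if any."""
    votes_a, votes_b = vote_counts(members, a, b)
    if votes_a > votes_b:
        return a
    if votes_b > votes_a:
        return b
    return None

def consensus_rule(members, a, b):
    """Admit a candidate only when every member votes for them."""
    votes_a, votes_b = vote_counts(members, a, b)
    if votes_a == len(members):
        return a
    if votes_b == len(members):
        return b
    return None

def grow(members, rule, steps, rng):
    """Run the admission process for a growing group."""
    group = list(members)
    for _ in range(steps):
        a, b = rng.random(), rng.random()  # assumed uniform candidate pool
        winner = rule(group, a, b)
        if winner is not None:
            group.append(winner)
    return group

if __name__ == "__main__":
    rng = random.Random(1)
    start = [0.2, 0.5, 0.8]
    for rule in (majority_rule, consensus_rule):
        final = grow(start, rule, 1000, rng)
        print(rule.__name__, len(final), round(min(final), 3), round(max(final), 3))
```

Comparing the size and spread of the final group under the two rules gives a rough feel for how the admission rule shapes long-term composition, which is the question the abstract poses.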

preprint · 2015 · arXiv

A Mirage of Market Allocation

Can noncooperative behaviour of merchants lead to a market split that prima facie seems anticompetitive? We introduce a model in which service providers, with ISPs being the main example, aim at optimizing the number of customers using their services, while customers aim at choosing service providers with low customer load (high bandwidth per subscriber, for ISPs). Each service provider chooses between a variety of levels of service (latencies, for ISPs), and as long as it does not lose customers, aims at minimizing its level of service; the minimum level of service required to satisfy a customer varies across customers. We consider a two-stage competition: in the first stage, service providers select their levels of service; in the second stage, customers choose between service providers. In the two-stage game, we show that the competition among service providers possesses a unique Nash equilibrium, which is moreover super-strong; we also show that sequential better-response dynamics of service providers reach this equilibrium, with best-response dynamics doing so surprisingly fast. If service providers choose their levels of service according to this equilibrium, then the unique Nash equilibrium among customers in the second phase is a split of the market between the service providers, based on the customers' minimum acceptable quality of service; moreover, each service provider's chosen level of service is the lowest acceptable by the entirety of its market slice, seemingly making no attempt to attract other customers. Our results show that this prima facie market allocation (collusive split of the market) arises as the unique and highly robust outcome of noncooperative, even myopic, service-provider behaviour. These results are applicable to a wide variety of scenarios, from explaining phenomena observable in food markets, to shedding a surprising light on aspects of location theory.

preprint · 2015 · arXiv

Distributed Signaling Games

A recurring theme in recent computer science literature is that proper design of signaling schemes is a crucial aspect of effective mechanisms aiming to optimize social welfare or revenue. One of the research endeavors of this line of work is understanding the algorithmic and computational complexity of designing efficient signaling schemes. In reality, however, information is typically not held by a central authority, but is distributed among multiple sources (third-party "mediators"), a fact that dramatically changes the strategic and combinatorial nature of the signaling problem, making it a game between information providers, as opposed to a traditional mechanism design problem. In this paper we introduce distributed signaling games, while using display advertising as a canonical example for introducing this foundational framework. A distributed signaling game may be a pure coordination game (i.e., a distributed optimization task), or a non-cooperative game. In the context of pure coordination games, we show a wide gap between the computational complexity of the centralized and distributed signaling problems. On the other hand, we show that if the information structure of each mediator is assumed to be "local", then there is an efficient algorithm that finds a near-optimal ($5$-approximation) distributed signaling scheme. In the context of non-cooperative games, the outcome generated by the mediators' signals may have different value to each (due to the auctioneer's desire to align the incentives of the mediators with his own by relative compensations). We design a mechanism for this problem via a novel application of Shapley's value, and show that it possesses some interesting properties, in particular, it always admits a pure Nash equilibrium, and it never decreases the revenue of the auctioneer.

preprint · 2015 · arXiv

Economic Recommendation Systems

In the on-line Explore and Exploit literature, central to Machine Learning, a central planner is faced with a set of alternatives, each yielding some unknown reward. The planner's goal is to learn the optimal alternative as soon as possible, via experimentation. A typical assumption in this model is that the planner has full control over the experiment design and implementation. When experiments are implemented by a society of self-motivated agents, the planner can only recommend experimentation but has no power to enforce it. Kremer et al. (JPE, 2014) introduce the first study of explore and exploit schemes that account for agents' incentives. In their model it is implicitly assumed that agents neither see nor communicate with each other. Their main result is a characterization of an optimal explore and exploit scheme. In this work we extend Kremer et al. (JPE, 2014) by adding a layer of a social network according to which agents can observe each other. It turns out that when observability is factored in, the scheme proposed by Kremer et al. (JPE, 2014) is no longer incentive compatible. In our main result we provide a tight bound on how many other agents each agent can observe while still allowing an incentive-compatible algorithm and an asymptotically optimal outcome. More technically, for a setting with $N$ agents where the number of nodes with degree greater than $N^{\alpha}$ is bounded by $N^{\beta}$ and $2\alpha+\beta < 1$, we construct an incentive-compatible, asymptotically optimal mechanism. The bound $2\alpha+\beta < 1$ is shown to be tight.

preprint · 2015 · arXiv

Mechanism Design with Strategic Mediators

We consider the problem of designing mechanisms that interact with strategic agents through strategic intermediaries (or mediators), and investigate the cost to society due to the mediators' strategic behavior. Selfish agents with private information are each associated with exactly one strategic mediator, and can interact with the mechanism exclusively through that mediator. Each mediator aims to optimize the combined utility of his agents, while the mechanism aims to optimize the combined utility of all agents. We focus on the problem of facility location on a metric induced by a publicly known tree. With non-strategic mediators, there is a dominant strategy mechanism that is optimal. We show that when both agents and mediators act strategically, there is no dominant strategy mechanism that achieves any approximation. We, thus, slightly relax the incentive constraints, and define the notion of a two-sided incentive compatible mechanism. We show that the $3$-competitive deterministic mechanism suggested by Procaccia and Tennenholtz (2013) and Dekel et al. (2010) for lines extends naturally to trees, and is still $3$-competitive as well as two-sided incentive compatible. This is essentially the best possible. We then show that by allowing randomization one can construct a $2$-competitive randomized mechanism that is two-sided incentive compatible, and this is also essentially tight. This result also closes a gap left in the work of Procaccia and Tennenholtz (2013) and Lu et al. (2009) for the simpler problem of designing strategy-proof mechanisms for weighted agents with no mediators on a line, while extending to the more general model of trees. We also investigate a further generalization of the above setting where there are multiple levels of mediators.

preprint · 2014 · arXiv

Chasing Ghosts: Competing with Stateful Policies

We consider sequential decision making in a setting where regret is measured with respect to a set of stateful reference policies, and feedback is limited to observing the rewards of the actions performed (the so-called "bandit" setting). If either the reference policies are stateless rather than stateful, or the feedback includes the rewards of all actions (the so-called "expert" setting), previous work shows that the optimal regret grows like $\Theta(\sqrt{T})$ in terms of the number of decision rounds $T$. The difficulty in our setting is that the decision maker unavoidably loses track of the internal states of the reference policies, and thus cannot reliably attribute rewards observed in a certain round to any of the reference policies. In fact, in this setting it is impossible for the algorithm to estimate which policy gives the highest (or even approximately highest) total reward. Nevertheless, we design an algorithm that achieves expected regret that is sublinear in $T$, of the form $O(T/\log^{1/4} T)$. Our algorithm is based on a certain local repetition lemma that may be of independent interest. We also show that no algorithm can guarantee expected regret better than $O(T/\log^{3/2} T)$.

preprint2013arXiv

Equilibrium in Labor Markets with Few Firms

We study competition between firms in labor markets, following a combinatorial model suggested by Kelso and Crawford [1982]. In this model, each firm is trying to recruit workers by offering a higher salary than its competitors, and its production function defines the utility generated from any actual set of recruited workers. We define two natural classes of production functions for firms, where the first one is based on additive capacities (weights), and the second on the influence of workers in a social network. We then analyze the existence of pure subgame perfect equilibrium (PSPE) in the labor market and its properties. While neither class holds the gross substitutes condition, we show that in both classes the existence of PSPE is guaranteed under certain restrictions, and in particular when there are only two competing firms. As a corollary, there exists a Walrasian equilibrium in a corresponding combinatorial auction, where bidders' valuation functions belong to these classes. While a PSPE may not exist when there are more than two firms, we perform an empirical study of equilibrium outcomes for the case of weight-based games with three firms, which extend our analytical results. We then show that stability can in some cases be extended to coalitional stability, and study the distribution of profit between firms and their workers in weight-based games.

preprint2013arXiv

On Stable Multi-Agent Behavior in Face of Uncertainty

A stable joint plan should guarantee the achievement of a designer's goal in a multi-agent environment, while ensuring that deviations from the prescribed plan would be detected. We present a computational framework where stable joint plans can be studied, as well as several basic results about the representation, verification and synthesis of stable joint plans.

preprint2013arXiv

Pay or Play

We introduce the class of pay or play games, which captures scenarios in which each decision maker is faced with a choice between two actions: one with a fixed payoff and another with a payoff dependent on others' selected actions. This is, arguably, the simplest setting that models selection among certain and uncertain outcomes in a multi-agent system. We study the properties of equilibria in such games from both a game-theoretic perspective and a computational perspective. Our main positive result establishes the existence of a semi-strong equilibrium in every such game. We show that although simple, pay or play games contain a large variety of well-studied environments, e.g., vaccination games. We discuss the interesting implications of our results for these environments.

preprint2012arXiv

Collusion in Unrepeated, First-Price Auctions with an Uncertain Number of Participants

We consider the question of whether collusion among bidders (a "bidding ring") can be supported in equilibrium of unrepeated first-price auctions. Unlike previous work on the topic such as that by McAfee and McMillan [1992] and Marshall and Marx [2007], we do not assume that non-colluding agents have perfect knowledge about the number of colluding agents whose bids are suppressed by the bidding ring, and indeed even allow for the existence of multiple cartels. Furthermore, while we treat the association of bidders with bidding rings as exogenous, we allow bidders to make strategic decisions about whether to join bidding rings when invited. We identify a bidding ring protocol that results in an efficient allocation in Bayes-Nash equilibrium, under which non-colluding agents bid straightforwardly, and colluding agents join bidding rings when invited and truthfully declare their valuations to the ring center. We show that bidding rings benefit ring centers and all agents, both members and non-members of bidding rings, at the auctioneer's expense. The techniques we introduce in this paper may also be useful for reasoning about other problems in which agents have asymmetric information about a setting.

preprint2012arXiv

Mechanism Design with Execution Uncertainty

We introduce the notion of fault tolerant mechanism design, which extends the standard game theoretic framework of mechanism design to allow for uncertainty about execution. Specifically, we define the problem of task allocation in which the private information of the agents is not only their costs to attempt the tasks, but also their probabilities of failure. For several different instances of this setting we present technical results, including positive ones in the form of mechanisms that are incentive compatible, individually rational and efficient, and negative ones in the form of impossibility theorems.

preprint2012arXiv

On the Value of Correlation

Correlated equilibrium (Aumann, 1974) generalizes Nash equilibrium to allow correlation devices. Aumann showed an example of a game, and of a correlated equilibrium in this game, in which the agents' surplus (expected sum of payoffs) is greater than their surplus in all mixed-strategy equilibria. Following the idea initiated by the price of anarchy literature (Koutsoupias & Papadimitriou, 1999; Papadimitriou, 2001) this suggests the study of two major measures for the value of correlation in a game with non-negative payoffs: 1. The ratio between the maximal surplus obtained in a correlated equilibrium to the maximal surplus obtained in a mixed-strategy equilibrium. We refer to this ratio as the mediation value. 2. The ratio between the maximal surplus to the maximal surplus obtained in a correlated equilibrium. We refer to this ratio as the enforcement value. In this work we initiate the study of the mediation and enforcement values, providing several general results on the value of correlation as captured by these concepts. We also present a set of results for the more specialized case of congestion games (Rosenthal, 1973), a class of games that received a lot of attention in the recent literature.
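
As a rough illustration of the surplus comparison described above, the following sketch checks a correlated equilibrium in a 2x2 chicken-style game and compares its surplus with that of the symmetric mixed-strategy equilibrium. The payoff numbers, the action labels `C`/`D`, and all function names are assumptions chosen for illustration; they are not taken from the paper. The correlated distribution used here is not claimed to be surplus-maximizing, so the resulting ratio is only a lower bound on this game's mediation value.

```python
from fractions import Fraction as F

ACTIONS = ("C", "D")
# (row payoff, column payoff) for each action profile; illustrative numbers only.
PAYOFF = {
    ("C", "C"): (6, 6), ("C", "D"): (2, 7),
    ("D", "C"): (7, 2), ("D", "D"): (0, 0),
}

def surplus(dist):
    """Expected sum of payoffs under a distribution over action profiles."""
    return sum(p * sum(PAYOFF[profile]) for profile, p in dist.items())

def is_correlated_equilibrium(dist):
    """Check that no player gains by deviating from any recommended action."""
    for player in (0, 1):
        for rec in ACTIONS:
            for dev in ACTIONS:
                gain = F(0)
                for other in ACTIONS:
                    told = (rec, other) if player == 0 else (other, rec)
                    alt = (dev, other) if player == 0 else (other, dev)
                    gain += dist.get(told, F(0)) * (
                        PAYOFF[alt][player] - PAYOFF[told][player]
                    )
                if gain > 0:
                    return False
    return True

# Symmetric mixed Nash equilibrium: each player plays C with probability 2/3.
mix = {"C": F(2, 3), "D": F(1, 3)}
mixed_ne = {(a, b): mix[a] * mix[b] for a in ACTIONS for b in ACTIONS}

# Correlation device: uniform over the three profiles other than (D, D).
correlated = {("C", "C"): F(1, 3), ("C", "D"): F(1, 3), ("D", "C"): F(1, 3)}

# Surplus ratio between this correlated equilibrium and the mixed equilibrium.
ratio = surplus(correlated) / surplus(mixed_ne)
```

In this toy game the two pure equilibria each yield surplus 9, below the mixed equilibrium's 28/3, so `ratio` compares the chosen correlated equilibrium against the best Nash surplus.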

preprint2012arXiv

Reputation Systems: An Axiomatic Approach

Reasoning about agent preferences on a set of alternatives, and the aggregation of such preferences into some social ranking is a fundamental issue in reasoning about uncertainty and multi-agent systems. When the set of agents and the set of alternatives coincide, we get the so-called reputation systems setting. Famous types of reputation systems include page ranking in the context of search engines and traders ranking in the context of e-commerce. In this paper we present the first axiomatic study of reputation systems. We present three basic postulates that the desired/aggregated social ranking should satisfy and prove an impossibility theorem showing that no appropriate social ranking, satisfying all requirements, exists. Then we show that by relaxing any of these requirements an appropriate social ranking can be found. We first study reputation systems with (only) positive feedbacks. This setting refers to systems where agents' votes are interpreted as indications for the importance of other agents, as is the case in page ranking. Following this, we discuss the case of negative feedbacks, a most common situation in e-commerce settings, where traders may complain about the behavior of others. Finally, we discuss the case where both positive and negative feedbacks are available.

preprint2012arXiv

Robust Learning Equilibrium

We introduce robust learning equilibrium. The idea of learning equilibrium is that learning algorithms in multi-agent systems should themselves be in equilibrium rather than only lead to equilibrium. That is, learning equilibrium is immune to strategic deviations: Every agent is better off using its prescribed learning algorithm, if all other agents follow their algorithms, regardless of the unknown state of the environment. However, a learning equilibrium may not be immune to non-strategic mistakes. For example, if for a certain period of time there is a failure in the monitoring devices (e.g., the correct input does not reach the agents), then it may not be in equilibrium to follow the algorithm after the devices are corrected. A robust learning equilibrium is immune also to such non-strategic mistakes. The existence of (robust) learning equilibrium is especially challenging when the monitoring devices are 'weak'. That is, the information available to each agent at each stage is limited. We initiate a study of robust learning equilibrium with general monitoring structure and apply it to the context of auctions. We prove the existence of robust learning equilibrium in repeated first-price auctions, and discuss its properties.

preprint2012arXiv

Sequential Information Elicitation in Multi-Agent Systems

We introduce the study of sequential information elicitation in strategic multi-agent systems. In an information elicitation setup a center attempts to compute the value of a function based on private information (a.k.a. secrets) accessible to a set of agents. We consider the classical multi-party computation setup where each agent is interested in knowing the result of the function. However, in our setting each agent is strategic, and since acquiring information is costly, an agent may be tempted not to spend the effort of obtaining the information, free-riding on other agents' computations. A mechanism which elicits agents' secrets and performs the desired computation defines a game. A mechanism is 'appropriate' if there exists an equilibrium in which it is able to elicit (sufficiently many) agents' secrets and perform the computation, for all possible secret vectors. We characterize a general efficient procedure for determining an appropriate mechanism, if such a mechanism exists. Moreover, we also address the existence problem, providing a polynomial algorithm for verifying the existence of an appropriate mechanism.

preprint2012arXiv

Signaling Schemes for Revenue Maximization

Signaling is an important topic in the study of asymmetric information in economic settings. In particular, the transparency of information available to a seller in an auction setting is a question of major interest. We introduce the study of signaling when conducting a second price auction of a probabilistic good whose actual instantiation is known to the auctioneer but not to the bidders. This framework can be used to model impressions selling in display advertising. We study the problem of computing a signaling scheme that maximizes the auctioneer's revenue in a Bayesian setting. While the general case is proved to be computationally hard, several cases of interest are shown to be polynomially solvable. In addition, we establish a tight bound on the minimum number of signals required to implement an optimal signaling scheme and show that at least half of the maximum social welfare can be preserved within such a scheme.
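
To make the effect of a signaling scheme on second-price revenue concrete, here is a minimal sketch under strong simplifying assumptions that are not taken from the paper: valuations are publicly known rather than Bayesian, signals are deterministic (a partition of the possible instantiations), and each bidder bids its posterior expected value. The function name `second_price_revenue` and the example data are hypothetical.

```python
from fractions import Fraction as F

def second_price_revenue(prior, values, partition):
    """Expected second-price revenue when the auctioneer reveals only
    which block of the partition contains the realized instantiation."""
    revenue = F(0)
    for block in partition:
        mass = sum(prior[item] for item in block)
        if mass == 0:
            continue
        # Each bidder bids its expected value conditional on the signal.
        bids = sorted(
            (sum(prior[item] * v[item] for item in block) / mass for v in values),
            reverse=True,
        )
        revenue += mass * bids[1]  # winner pays the second-highest bid
    return revenue

# Two equally likely instantiations; each bidder wants a different one.
prior = {"A": F(1, 2), "B": F(1, 2)}
values = [{"A": 1, "B": 0}, {"A": 0, "B": 1}]

no_revelation = second_price_revenue(prior, values, [["A", "B"]])
full_revelation = second_price_revenue(prior, values, [["A"], ["B"]])
```

In this toy instance revealing nothing yields revenue 1/2 while full revelation yields 0, which illustrates why the choice of signaling scheme matters for the auctioneer.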

preprint2012arXiv

Signalling Competition and Social Welfare (Working Paper)

We consider an environment where sellers compete over buyers. All sellers are a-priori identical and strategically signal buyers about the product they sell. In a setting motivated by on-line advertising in display ad exchanges, where firms use second price auctions, a firm's strategy is a decision about its signaling scheme for a stream of goods (e.g. user impressions), and a buyer's strategy is a selection among the firms. In this setting, a single seller will typically provide partial information and consequently a product may be allocated inefficiently. Intuitively, competition among sellers may induce sellers to provide more information in order to attract buyers and thus increase efficiency. Surprisingly, we show that such a competition among firms may yield significant loss in consumers' social welfare with respect to the monopolistic setting. Although we also show that in some cases the competitive setting yields gain in social welfare, we provide a tight bound on that gain, which is shown to be small with respect to the above possible loss. Our model is tightly connected with the literature on bundling in auctions.

preprint2012arXiv

Solving Cooperative Reliability Games

Cooperative games model the allocation of profit from joint actions, following considerations such as stability and fairness. We propose the reliability extension of such games, where agents may fail to participate in the game. In the reliability extension, each agent only "survives" with a certain probability, and a coalition's value is the probability that its surviving members would be a winning coalition in the base game. We study prominent solution concepts in such games, showing how to approximate the Shapley value and how to compute the core in games with few agent types. We also show that applying the reliability extension may stabilize the game, making the core non-empty even when the base game has an empty core.
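
The definition of a coalition's value in the reliability extension can be sketched by brute-force enumeration of survivor sets. This is a minimal illustration that assumes agents survive independently; the weighted voting base game, the quota, and the names `reliability_value` and `wins` are hypothetical. The enumeration is exponential in coalition size and is not the approximation method of the paper.

```python
from fractions import Fraction as F
from itertools import combinations

def reliability_value(coalition, survive_prob, is_winning):
    """Probability that the surviving members of `coalition` win the base game,
    assuming each agent survives independently."""
    members = sorted(coalition)
    total = F(0)
    for size in range(len(members) + 1):
        for survivors in combinations(members, size):
            alive = set(survivors)
            prob = F(1)
            for m in members:
                prob *= survive_prob[m] if m in alive else 1 - survive_prob[m]
            if is_winning(alive):
                total += prob
    return total

# Hypothetical base game: weighted voting with quota 3.
WEIGHTS = {"a": 2, "b": 1, "c": 1}
QUOTA = 3

def wins(agents):
    return sum(WEIGHTS[m] for m in agents) >= QUOTA

survive = {m: F(1, 2) for m in WEIGHTS}

value_ab = reliability_value({"a", "b"}, survive, wins)
value_abc = reliability_value({"a", "b", "c"}, survive, wins)
```

Here `{a, b}` wins only if both members survive, while the grand coalition wins whenever `a` survives together with at least one of `b` or `c`.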

preprint2012arXiv

Stability Scores: Measuring Coalitional Stability

We introduce a measure for the level of stability against coalitional deviations, called stability scores, which generalizes widely used notions of stability in non-cooperative games. We use the proposed measure to compare various Nash equilibria in congestion games, and to quantify the effect of game parameters on coalitional stability. For our main results, we apply stability scores to analyze and compare the Generalized Second Price (GSP) and Vickrey-Clarke-Groves (VCG) ad auctions. We show that while a central result of the ad auctions literature is that the GSP and VCG auctions implement the same outcome in one of the equilibria of GSP, the GSP outcome is far more stable. Finally, a modified version of VCG is introduced, which is group strategy-proof, and thereby achieves the highest possible stability score.

preprint2011arXiv

Approximately Optimal Mechanism Design via Differential Privacy

In this paper we study the implementation challenge in an abstract interdependent values model and an arbitrary objective function. We design a mechanism that allows for approximate optimal implementation of insensitive objective functions in ex-post Nash equilibrium. If, furthermore, values are private then the same mechanism is strategy proof. We cast our results onto two specific models: pricing and facility location. The mechanism we design is optimal up to an additive factor of the order of magnitude of one over the square root of the number of agents and involves no utility transfers. Underlying our mechanism is a lottery between two auxiliary mechanisms: with high probability we actuate a mechanism that reduces players' influence on the choice of the social alternative, while choosing the optimal outcome with high probability. This is where the recent notion of differential privacy is employed. With the complementary probability we actuate a mechanism that is typically far from optimal but is incentive compatible. The joint mechanism inherits the desired properties from both.

preprint2011arXiv

Dueling Algorithms

We revisit classic algorithmic search and optimization problems from the perspective of competition. Rather than a single optimizer minimizing expected cost, we consider a zero-sum game in which an optimization problem is presented to two players, whose only goal is to outperform the opponent. Such games are typically exponentially large zero-sum games, but they often have a rich combinatorial structure. We provide general techniques by which such structure can be leveraged to find minmax-optimal and approximate minmax-optimal strategies. We give examples of ranking, hiring, compression, and binary search duels, among others. We give bounds on how often one can beat the classic optimization algorithms in such duels.

preprint2011arXiv

Mechanism design with uncertain inputs (to err is human, to forgive divine)

We consider a task of scheduling with a common deadline on a single machine. Every player reports to a scheduler the length of his job and the scheduler needs to finish as many jobs as possible by the deadline. For this simple problem, there is a truthful mechanism that achieves maximum welfare in dominant strategies. The new aspect of our work is that in our setting players are uncertain about their own job lengths, and hence are incapable of providing truthful reports (in the strict sense of the word). For a probabilistic model for uncertainty our main results are as follows. 1) Even with relatively little uncertainty, no mechanism can guarantee a constant fraction of the maximum welfare. 2) To remedy this situation, we introduce a new measure of economic efficiency, based on a notion of a fair share of a player, and design mechanisms that are $\Omega(1)$-fair. In addition to its intrinsic appeal, our notion of fairness implies good approximation of maximum welfare in several cases of interest. 3) In our mechanisms the machine is sometimes left idle even though there are jobs that want to use it. We show that this unfavorable aspect is unavoidable, unless one gives up other favorable aspects (e.g., give up $\Omega(1)$-fairness). We also consider a qualitative approach to uncertainty as an alternative to the probabilistic quantitative model. In the qualitative approach we break away from solution concepts such as dominant strategies (they are no longer well defined), and instead suggest an axiomatic approach, which amounts to listing desirable properties for mechanisms. We provide a mechanism that satisfies these properties.
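
For the baseline problem with exactly known job lengths, the welfare-maximizing allocation (finishing as many jobs as possible by the deadline) can be sketched with a shortest-job-first greedy rule. This covers only the allocation step under the assumption of accurate reports; it does not model incentives or the uncertainty studied in the paper, and the function name `max_jobs_by_deadline` is hypothetical.

```python
def max_jobs_by_deadline(lengths, deadline):
    """Return indices of a maximum-size set of jobs that fits before the
    deadline on a single machine, chosen shortest-first."""
    scheduled = []
    elapsed = 0
    # Process jobs in order of increasing reported length.
    for index, length in sorted(enumerate(lengths), key=lambda pair: pair[1]):
        if elapsed + length > deadline:
            break  # every remaining job is at least as long, so none fits
        elapsed += length
        scheduled.append(index)
    return scheduled

# Four jobs with reported lengths 4, 2, 3, 1 and a common deadline of 6.
chosen = max_jobs_by_deadline([4, 2, 3, 1], 6)
```

With these numbers the jobs of length 1, 2 and 3 are completed and the job of length 4 is left out.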

40 published item(s)

Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling

Budget-Constrained Reinforcement of Ranked Objects

Long-term Data Sharing under Exclusivity Attacks

Pareto-Improving Data-Sharing

Predicting Decisions in Language Based Persuasion Games

Designing an Automatic Agent for Repeated Language based Persuasion Games

Protecting the Protected Group: Circumventing Harmful Fairness

Fiduciary Bandits

Incentive-Compatible Selection Mechanisms for Forests

Learning under Invariable Bayesian Safety

Multi-Issue Social Learning

Predicting Strategic Behavior from Free Text

Privacy, Altruism, and Experience: Estimating the Perceived Value of Internet Data for Medical Uses

Ranking-Incentivized Quality Preserving Content Modification

Representative Committees of Peers

Studying Ranking-Incentivized Web Dynamics

A Hydraulic Approach to Equilibria of Resource Selection Games

An Axiomatic Approach to Routing

Dynamics of Evolving Social Groups

A Mirage of Market Allocation

Distributed Signaling Games

Economic Recommendation Systems

Mechanism Design with Strategic Mediators

Chasing Ghosts: Competing with Stateful Policies

Equilibrium in Labor Markets with Few Firms

On Stable Multi-Agent Behavior in Face of Uncertainty

Pay or Play

Collusion in Unrepeated, First-Price Auctions with an Uncertain Number of Participants

Mechanism Design with Execution Uncertainty

On the Value of Correlation

Reputation Systems: An Axiomatic Approach

Robust Learning Equilibrium

Sequential Information Elicitation in Multi-Agent Systems

Signaling Schemes for Revenue Maximization

Signalling Competition and Social Welfare (Working Paper)

Solving Cooperative Reliability Games

Stability Scores: Measuring Coalitional Stability

Approximately Optimal Mechanism Design via Differential Privacy

Dueling Algorithms

Mechanism design with uncertain inputs (to err is human, to forgive divine)