Source author record

Ariel D. Procaccia

Ariel D. Procaccia appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Computer Science and Game Theory, Artificial Intelligence, Data Structures and Algorithms, Machine Learning, Cryptography and Security, math.PR

Catalog footprint: 17 works, 6 topics, 4 close collaborators

preprint, 2026, arXiv

Embeddings for Preferences, Not Semantics

Modern AI is opening the door to collective decision-making in which participants express their views as free-form text rather than voting on a fixed set of candidates. A natural idea is to embed these opinions in a vector space so that the substantial literature on facility location problems and fair clustering can be brought to bear. But standard text embeddings measure semantic similarity, whereas distances in facility location problems and fair clustering require what we call preferential similarity: a participant's agreement with a piece of text should be inversely related to their distance from it. Off-the-shelf embeddings inherit a coarse preference signal through a correlation between semantic and preferential similarity, but fail to capture preferences when the correlation breaks. We formalize this as an invariance problem: text embedding models encode both a preference-relevant signal (stance and values) and semantic nuisance (style and wording), and the two are observationally correlated, so a geometry that relies on nuisance can appear preference-correct even when it is not. We show that synthetic training data designed to break this correlation provably shifts the optimal scorer away from nuisance-dominated cosine and significantly improves preference prediction across 11 online deliberation datasets.

preprint, 2021, arXiv

District-Fair Participatory Budgeting

Participatory budgeting is a method used by city governments to select public projects to fund based on residents' votes. Many cities use participatory budgeting at a district level. Typically, a budget is divided among districts proportionally to their population, and each district holds an election over local projects and then uses its budget to fund the projects most preferred by its voters. However, district-level participatory budgeting can yield poor social welfare because it does not necessarily fund projects supported across multiple districts. On the other hand, decision making that only takes global social welfare into account can be unfair to districts: A social-welfare-maximizing solution might not fund any of the projects preferred by a district, despite the fact that its constituents pay taxes to the city. Thus, we study how to fairly maximize social welfare in a participatory budgeting setting with a single city-wide election. We propose a notion of fairness that guarantees each district at least as much welfare as it would have received in a district-level election. We show that, although optimizing social welfare subject to this notion of fairness is NP-hard, we can efficiently construct a lottery over welfare-optimal outcomes that is fair in expectation. Moreover, we show that, when we are allowed to slightly relax fairness, we can efficiently compute a fair solution that is welfare-maximizing, but which may overspend the budget.

preprint, 2020, arXiv

Learning and Planning in the Feature Deception Problem

Today's high-stakes adversarial interactions feature attackers who constantly breach the ever-improving security measures. Deception mitigates the defender's loss by misleading the attacker into making suboptimal decisions. In order to formally reason about deception, we introduce the feature deception problem (FDP), a domain-independent model, and present a learning and planning framework for finding the optimal deception strategy, taking into account the adversary's preferences, which are initially unknown to the defender. We make the following contributions. (1) We show that we can uniformly learn the adversary's preferences using data from a modest number of deception strategies. (2) We propose an approximation algorithm for finding the optimal deception strategy given the learned preferences and show that the problem is NP-hard. (3) We perform extensive experiments to validate our methods and results. In addition, we provide a case study of the credit bureau network to illustrate how FDP implements deception on a real-world problem.

preprint, 2020, arXiv

Loss Functions, Axioms, and Peer Review

It is common to see a handful of reviewers reject a highly novel paper, because they view, say, extensive experiments as far more important than novelty, whereas the community as a whole would have embraced the paper. More generally, the disparate mapping of criteria scores to final recommendations by different reviewers is a major source of inconsistency in peer review. In this paper we present a framework inspired by empirical risk minimization (ERM) for learning the community's aggregate mapping. The key challenge that arises is the specification of a loss function for ERM. We consider the class of $L(p,q)$ loss functions, which is a matrix-extension of the standard class of $L_p$ losses on vectors; here the choice of the loss function amounts to choosing the hyperparameters $p, q \in [1,\infty]$. To deal with the absence of ground truth in our problem, we instead draw on computational social choice to identify desirable values of the hyperparameters $p$ and $q$. Specifically, we characterize $p=q=1$ as the only choice of these hyperparameters that satisfies three natural axiomatic properties. Finally, we implement and apply our approach to reviews from IJCAI 2017.
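The abstract names the $L(p,q)$ loss family without spelling out its formula. As a rough sketch, assuming the usual entrywise matrix extension (take the $L_p$ norm of each row of a residual matrix, then the $L_q$ norm of those row norms), the loss could be computed as below. The function names `lp_norm` and `lpq_loss`, and the row-then-column ordering, are assumptions made for illustration and are not taken from the paper.

```python
import math


def lp_norm(values, p):
    """L_p norm of a sequence of numbers, for p in [1, inf]."""
    magnitudes = [abs(v) for v in values]
    if not magnitudes:
        return 0.0
    if math.isinf(p):
        # The L_inf norm is the largest absolute entry.
        return float(max(magnitudes))
    return sum(m ** p for m in magnitudes) ** (1.0 / p)


def lpq_loss(residuals, p, q):
    """Assumed L(p,q) loss: L_q norm of the per-row L_p norms.

    `residuals` is a list of rows, for example one row per reviewer
    holding the gaps between predicted and given recommendations.
    """
    row_norms = [lp_norm(row, p) for row in residuals]
    return lp_norm(row_norms, q)


if __name__ == "__main__":
    residuals = [[1, -2], [3, 0]]
    # With p = q = 1 the loss is the sum of absolute residuals.
    print(lpq_loss(residuals, 1, 1))
```

Under this reading, $p=q=1$ reduces to the sum of absolute errors over all entries, which is the hyperparameter choice the abstract singles out.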

preprint, 2020, arXiv

The Phantom Steering Effect in Q&A Websites

Badges are commonly used in online platforms as incentives for promoting contributions. It is widely accepted that badges "steer" people's behavior toward increasing their rate of contributions before obtaining the badge. This paper provides a new probabilistic model of user behavior in the presence of badges. By applying the model to data from thousands of users on the Q&A site Stack Overflow, we find that steering is not as widely applicable as was previously understood. Rather, the majority of users remain apathetic toward badges, while still providing a substantial number of contributions to the site. An interesting statistical phenomenon, termed "Phantom Steering," accounts for the interaction data of these users and this may have contributed to some previous conclusions about steering. Our results suggest that a small population, approximately 20%, of users respond to the badge incentives. Moreover, we conduct a qualitative survey of the users on Stack Overflow which provides further evidence that the insights from the model reflect the true behavior of the community. We argue that while badges might contribute toward a suite of effective rewards in an online system, research into other aspects of reward systems such as Stack Overflow reputation points should become a focus of the community.

preprint, 2016, arXiv

An Algorithmic Framework for Strategic Fair Division

We study the paradigmatic fair division problem of allocating a divisible good among agents with heterogeneous preferences, commonly known as cake cutting. Classical cake cutting protocols are susceptible to manipulation. Do their strategic outcomes still guarantee fairness? To address this question we adopt a novel algorithmic approach, by designing a concrete computational framework for fair division, the class of Generalized Cut and Choose (GCC) protocols, and reasoning about the game-theoretic properties of algorithms that operate in this model. The class of GCC protocols includes the most important discrete cake cutting protocols, and turns out to be compatible with the study of fair division among strategic agents. In particular, GCC protocols are guaranteed to have approximate subgame perfect Nash equilibria, or even exact equilibria if the protocol's tie-breaking rule is flexible. We further observe that the (approximate) equilibria of proportional GCC protocols, which guarantee each of the $n$ agents a $1/n$-fraction of the cake, must be (approximately) proportional. Finally, we design a protocol in this framework with the property that its Nash equilibrium allocations coincide with the set of (contiguous) envy-free allocations.
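For orientation, the classical two-agent cut-and-choose protocol, the simplest relative of the protocols discussed above, can be sketched as follows. This is not the GCC framework itself. It assumes truthful play, a cake modelled as the interval [0, 1], and valuations given as piecewise-constant densities over equal-width segments; the helper names `value`, `half_point` and `cut_and_choose` are hypothetical.

```python
from fractions import Fraction


def value(density, a, b):
    """Value of the interval [a, b] under a piecewise-constant density.

    `density` lists one height per equal-width segment of [0, 1].
    """
    n = len(density)
    total = Fraction(0)
    for i, d in enumerate(density):
        lo, hi = Fraction(i, n), Fraction(i + 1, n)
        overlap = min(b, hi) - max(a, lo)
        if overlap > 0:
            total += Fraction(d) * overlap
    return total


def half_point(density):
    """Point x where [0, x] is worth exactly half the cake to the cutter."""
    n = len(density)
    target = value(density, Fraction(0), Fraction(1)) / 2
    acc = Fraction(0)
    for i, d in enumerate(density):
        seg = Fraction(d) / n
        if d > 0 and acc + seg >= target:
            # The halfway mark falls inside segment i.
            return Fraction(i, n) + (target - acc) / Fraction(d)
        acc += seg
    return Fraction(1)


def cut_and_choose(cutter, chooser):
    """Cutter halves the cake by its own valuation; chooser picks a piece."""
    x = half_point(cutter)
    left, right = (Fraction(0), x), (x, Fraction(1))
    if value(chooser, *left) >= value(chooser, *right):
        return {"cut": x, "chooser": left, "cutter": right}
    return {"cut": x, "chooser": right, "cutter": left}


if __name__ == "__main__":
    alloc = cut_and_choose(cutter=[3, 1, 0, 0], chooser=[1, 1, 1, 1])
    print(alloc)
```

With truthful agents each side receives at least half of the cake by its own valuation, which is the proportionality guarantee for $n = 2$. The abstract's question is what survives of such guarantees once agents misreport strategically.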

preprint, 2016, arXiv

Learning Cooperative Games

This paper explores a PAC (probably approximately correct) learning model in cooperative games. Specifically, we are given $m$ random samples of coalitions and their values, taken from some unknown cooperative game; can we predict the values of unseen coalitions? We study the PAC learnability of several well-known classes of cooperative games, such as network flow games, threshold task games, and induced subgraph games. We also establish a novel connection between PAC learnability and core stability: for games that are efficiently learnable, it is possible to find payoff divisions that are likely to be stable using a polynomial number of samples.

preprint, 2016, arXiv

Opting Into Optimal Matchings

We revisit the problem of designing optimal, individually rational matching mechanisms (in a general sense, allowing for cycles in directed graphs), where each player, who is associated with a subset of vertices, matches as many of his own vertices when he opts into the matching mechanism as when he opts out. We offer a new perspective on this problem by considering an arbitrary graph, but assuming that vertices are associated with players at random. Our main result asserts that, under certain conditions, any fixed optimal matching is likely to be individually rational up to lower-order terms. We also show that a simple and practical mechanism is (fully) individually rational, and likely to be optimal up to lower-order terms. We discuss the implications of our results for market design in general, and kidney exchange in particular.

preprint, 2016, arXiv

Small Representations of Big Kidney Exchange Graphs

Kidney exchanges are organized markets where patients swap willing but incompatible donors. In the last decade, kidney exchanges grew from small and regional to large and national, and soon, international. This growth results in more lives saved, but exacerbates the empirical hardness of the NP-complete problem of optimally matching patients to donors. State-of-the-art matching engines use integer programming techniques to clear fielded kidney exchanges, but these methods must be tailored to specific models and objective functions, and may fail to scale to larger exchanges. In this paper, we observe that if the kidney exchange compatibility graph can be encoded by a constant number of patient and donor attributes, the clearing problem is solvable in polynomial time. We give necessary and sufficient conditions for losslessly shrinking the representation of an arbitrary compatibility graph. Then, using real compatibility graphs from the UNOS nationwide kidney exchange, we show how many attributes are needed to encode real compatibility graphs. The experiments show that, indeed, small numbers of attributes suffice.

preprint, 2015, arXiv

Ignorance is Almost Bliss: Near-Optimal Stochastic Matching With Few Queries

The stochastic matching problem deals with finding a maximum matching in a graph whose edges are unknown but can be accessed via queries. This is a special case of stochastic $k$-set packing, where the problem is to find a maximum packing of sets, each of which exists with some probability. In this paper, we provide edge and set query algorithms for these two problems, respectively, that provably achieve some fraction of the omniscient optimal solution. Our main theoretical result for the stochastic matching (i.e., $2$-set packing) problem is the design of an adaptive algorithm that queries only a constant number of edges per vertex and achieves a $(1-ε)$ fraction of the omniscient optimal solution, for an arbitrarily small $ε>0$. Moreover, this adaptive algorithm performs the queries in only a constant number of rounds. We complement this result with a non-adaptive (i.e., one round of queries) algorithm that achieves a $(0.5 - ε)$ fraction of the omniscient optimum. We also extend both our results to stochastic $k$-set packing by designing an adaptive algorithm that achieves a $(\frac{2}{k} - ε)$ fraction of the omniscient optimal solution, again with only $O(1)$ queries per element. This guarantee is close to the best known polynomial-time approximation ratio of $\frac{3}{k+1} - ε$ for the deterministic $k$-set packing problem [Furer and Yu, 2013]. We empirically explore the application of (adaptations of) these algorithms to the kidney exchange problem, where patients with end-stage renal failure swap willing but incompatible donors. We show on both generated data and on real data from the first 169 match runs of the UNOS nationwide kidney exchange that even a very small number of non-adaptive edge queries per vertex results in large gains in expected successful matches.

preprint, 2015, arXiv

Influence in Classification via Cooperative Game Theory

A dataset has been classified by some unknown classifier into two types of points. What were the most important factors in determining the classification outcome? In this work, we employ an axiomatic approach in order to uniquely characterize an influence measure: a function that, given a set of classified points, outputs a value for each feature corresponding to its influence in determining the classification outcome. We show that our influence measure takes on an intuitive form when the unknown classifier is linear. Finally, we employ our influence measure in order to analyze the effects of user profiling on Google's online display advertising.

preprint, 2014, arXiv

Verifiably Truthful Mechanisms

It is typically expected that if a mechanism is truthful, then the agents would, indeed, truthfully report their private information. But why would an agent believe that the mechanism is truthful? We wish to design truthful mechanisms, whose truthfulness can be verified efficiently (in the computational sense). Our approach involves three steps: (i) specifying the structure of mechanisms, (ii) constructing a verification algorithm, and (iii) measuring the quality of verifiably truthful mechanisms. We demonstrate this approach using a case study: approximate mechanism design without money for facility location.

preprint, 2013, arXiv

A Smooth Transition from Powerlessness to Absolute Power

We study the phase transition of the coalitional manipulation problem for generalized scoring rules. Previously it has been shown that, under some conditions on the distribution of votes, if the number of manipulators is $o(\sqrt{n})$, where $n$ is the number of voters, then the probability that a random profile is manipulable by the coalition goes to zero as the number of voters goes to infinity, whereas if the number of manipulators is $ω(\sqrt{n})$, then the probability that a random profile is manipulable goes to one. Here we consider the critical window, where a coalition has size $c\sqrt{n}$, and we show that as $c$ goes from zero to infinity, the limiting probability that a random profile is manipulable goes from zero to one in a smooth fashion, i.e., there is a smooth phase transition between the two regimes. This result analytically validates recent empirical results, and suggests that deciding the coalitional manipulation problem may be of limited computational hardness in practice.

preprint, 2013, arXiv

Audit Games

Effective enforcement of laws and policies requires expending resources to prevent and detect offenders, as well as appropriate punishment schemes to deter violators. In particular, enforcement of privacy laws and policies in modern organizations that hold large volumes of personal information (e.g., hospitals, banks, and Web services providers) relies heavily on internal audit mechanisms. We study economic considerations in the design of these mechanisms, focusing in particular on effective resource allocation and appropriate punishment schemes. We present an audit game model that is a natural generalization of a standard security game model for resource allocation with an additional punishment parameter. Computing the Stackelberg equilibrium for this game is challenging because it involves solving an optimization problem with non-convex quadratic constraints. We present an additive FPTAS that efficiently computes a solution that is arbitrarily close to the optimal solution.

preprint, 2012, arXiv

A Maximum Likelihood Approach For Selecting Sets of Alternatives

We consider the problem of selecting a subset of alternatives given noisy evaluations of the relative strength of different alternatives. We wish to select a k-subset (for a given k) that provides a maximum likelihood estimate for one of several objectives, e.g., containing the strongest alternative. Although this problem is NP-hard, we show that when the noise level is sufficiently high, intuitive methods provide the optimal solution. We thus generalize classical results about singling out one alternative and identifying the hidden ranking of alternatives by strength. Extensive experiments show that our methods perform well in practical settings.

preprint, 2012, arXiv

Bayesian Vote Manipulation: Optimal Strategies and Impact on Welfare

Most analyses of manipulation of voting schemes have adopted two assumptions that greatly diminish their practical import. First, it is usually assumed that the manipulators have full knowledge of the votes of the nonmanipulating agents. Second, analysis tends to focus on the probability of manipulation rather than its impact on the social choice objective (e.g., social welfare). We relax both of these assumptions by analyzing optimal Bayesian manipulation strategies when the manipulators have only partial probabilistic information about nonmanipulator votes, and assessing the expected loss in social welfare (in the broad sense of the term). We present a general optimization framework for the derivation of optimal manipulation strategies given arbitrary voting rules and distributions over preferences. We theoretically and empirically analyze the optimal manipulability of some popular voting rules using distributions and real data sets that go well beyond the common, but unrealistic, impartial culture assumption. We also shed light on the stark difference between the loss in social welfare and the probability of manipulation by showing that even when manipulation is likely, impact to social welfare is slight (and often negligible).

preprint, 2010, arXiv

Mix and Match

Consider a matching problem on a graph where disjoint sets of vertices are privately owned by self-interested agents. An edge between a pair of vertices indicates compatibility and allows the vertices to match. We seek a mechanism to maximize the number of matches despite self-interest, with agents that each want to maximize the number of their own vertices that match. Each agent can choose to hide some of its vertices, and then privately match the hidden vertices with any of its own vertices that go unmatched by the mechanism. A prominent application of this model is to kidney exchange, where agents correspond to hospitals and vertices to donor-patient pairs. Here hospitals may game an exchange by holding back pairs and harm social welfare. In this paper we seek to design mechanisms that are strategyproof, in the sense that agents cannot benefit from hiding vertices, and approximately maximize efficiency, i.e., produce a matching that is close in cardinality to the maximum cardinality matching. Our main result is the design and analysis of the eponymous Mix-and-Match mechanism; we show that this randomized mechanism is strategyproof and provides a 2-approximation. Lower bounds establish that the mechanism is near optimal.
