Source author record

Yiling Chen

Yiling Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Computer Science and Game Theory; Artificial Intelligence; Human-Computer Interaction; Machine Learning; cs.CY; Multiagent Systems; q-fin.TR; Information Retrieval; Methodology; Social and Information Networks

Catalog footprint

What is connected

20 works

10 topics

4 close collaborators


preprint 2022 arXiv

Forecast Aggregation via Peer Prediction

Crowdsourcing enables the solicitation of forecasts on a variety of prediction tasks from distributed groups of people. How to aggregate the solicited forecasts, which may vary in quality, into an accurate final prediction remains a challenging yet critical question. Studies have found that weighing expert forecasts more in aggregation can improve the accuracy of the aggregated prediction. However, this approach usually requires access to the historical performance data of the forecasters, which are often not available. In this paper, we study the problem of aggregating forecasts without having historical performance data. We propose using peer prediction methods, a family of mechanisms initially designed to truthfully elicit private information in the absence of ground truth verification, to assess the expertise of forecasters, and then using this assessment to improve forecast aggregation. We evaluate our peer-prediction-aided aggregators on a diverse collection of 14 human forecast datasets. Compared with a variety of existing aggregators, our aggregators achieve a significant and consistent improvement on aggregation accuracy measured by the Brier score and the log score. Our results reveal the effectiveness of identifying experts to improve aggregation even without historical data.
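The two accuracy measures named in the abstract, and the idea of weighting presumed experts more heavily, can be illustrated with a minimal sketch. The forecasts and weights below are made-up numbers, and the weights are hand-picked rather than derived from any peer prediction mechanism; this is not the paper's aggregator.

```python
import math

def brier_score(forecast, outcome):
    """Squared error between a probability forecast and a binary outcome (lower is better)."""
    return (forecast - outcome) ** 2

def log_score(forecast, outcome, eps=1e-12):
    """Negative log-likelihood of the realized binary outcome (lower is better)."""
    p = min(max(forecast, eps), 1 - eps)  # clip to avoid log(0)
    return -math.log(p if outcome == 1 else 1 - p)

def weighted_mean(forecasts, weights):
    """Linear opinion pool: weighted average of probability forecasts."""
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, forecasts)) / total

# Hypothetical forecasts for an event that turns out to occur (outcome = 1).
forecasts = [0.9, 0.6, 0.2]
outcome = 1

uniform = weighted_mean(forecasts, [1, 1, 1])
# Assumption: the first forecaster has been assessed as more expert.
expert_weighted = weighted_mean(forecasts, [3, 1, 1])

print(brier_score(uniform, outcome), brier_score(expert_weighted, outcome))
print(log_score(uniform, outcome), log_score(expert_weighted, outcome))
```

In this toy case the expert-weighted pool scores better on both measures only because the up-weighted forecaster happened to be right; the hard part, which the paper addresses, is choosing the weights without historical data.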

preprint 2022 arXiv

Randomized Wagering Mechanisms

Wagering mechanisms are one-shot betting mechanisms that elicit agents' predictions of an event. For deterministic wagering mechanisms, an existing impossibility result has shown incompatibility of some desirable theoretical properties. In particular, Pareto optimality (no profitable side bet before allocation) cannot be achieved together with weak incentive compatibility, weak budget balance and individual rationality. In this paper, we expand the design space of wagering mechanisms to allow randomization and ask whether there are randomized wagering mechanisms that can achieve all previously considered desirable properties, including Pareto optimality. We answer this question positively with two classes of randomized wagering mechanisms: i) one simple randomized lottery-type implementation of existing deterministic wagering mechanisms, and ii) another family of simple and randomized wagering mechanisms which we call surrogate wagering mechanisms, which are robust to noisy ground truth. This family of mechanisms builds on the idea of learning with noisy labels (Natarajan et al. 2013) as well as a recent extension of this idea to the information elicitation without verification setting (Liu and Chen 2018). We show that a broad family of randomized wagering mechanisms satisfy all desirable theoretical properties.

preprint 2020 arXiv

Computing Equilibria of Prediction Markets via Persuasion

We study the computation of equilibria in prediction markets in perhaps the most fundamental special case with two players and three trading opportunities. To do so, we show equivalence of prediction market equilibria with those of a simpler signaling game with commitment introduced by Kong and Schoenebeck (2018). We then extend their results by giving computationally efficient algorithms for additional parameter regimes. Our approach leverages a new connection between prediction markets and Bayesian persuasion, which also reveals interesting conceptual insights.

preprint 2020 arXiv

Mathematical Foundations for Social Computing

Social computing encompasses the mechanisms through which people interact with computational systems: crowdsourcing systems, ranking and recommendation systems, online prediction markets, citizen science projects, and collaboratively edited wikis, to name a few. These systems share the common feature that humans are active participants, making choices that determine the input to, and therefore the output of, the system. The output of these systems can be viewed as a joint computation between machine and human, and can be richer than what either could produce alone. The term social computing is often used as a synonym for several related areas, such as "human computation" and subsets of "collective intelligence"; we use it in its broadest sense to encompass all of these things. Social computing is blossoming into a rich research area of its own, with contributions from diverse disciplines including computer science, economics, and other social sciences. Yet a broad mathematical foundation for social computing is yet to be established, with a plethora of under-explored opportunities for mathematical research to impact social computing. As in other fields, there is great potential for mathematical work to influence and shape the future of social computing. However, we are far from having the systematic and principled understanding of the advantages, limitations, and potentials of social computing required to match the impact on applications that has occurred in other fields. In June 2015, we brought together roughly 25 experts in related fields to discuss the promise and challenges of establishing mathematical foundations for social computing. This document captures several of the key ideas discussed.

preprint 2020 arXiv

Replication Markets: Results, Lessons, Challenges and Opportunities in AI Replication

The last decade saw the emergence of systematic large-scale replication projects in the social and behavioral sciences (Camerer et al., 2016, 2018; Ebersole et al., 2016; Klein et al., 2014, 2018; Collaboration, 2015). These projects were driven by theoretical and conceptual concerns about a high fraction of "false positives" in scientific publications (Ioannidis, 2005) and a high prevalence of "questionable research practices" (Simmons, Nelson, and Simonsohn, 2011). Concerns about the credibility of research findings are not unique to the behavioral and social sciences; within Computer Science, Artificial Intelligence (AI) and Machine Learning (ML) are areas of particular concern (Lucic et al., 2018; Freire, Bonnet, and Shasha, 2012; Gundersen and Kjensmo, 2018; Henderson et al., 2018). Given the pioneering role of the behavioral and social sciences in the promotion of novel methodologies to improve the credibility of research, it is a promising approach to analyze the lessons learned from this field and adjust strategies for Computer Science, AI and ML. In this paper, we review approaches used in the behavioral and social sciences and in the DARPA SCORE project. We particularly focus on the role of human forecasting of replication outcomes, and how forecasting can leverage the information gained from relatively labor and resource-intensive replications. We will discuss opportunities and challenges of using these approaches to monitor and improve the credibility of research areas in Computer Science, AI, and ML.

preprint 2020 arXiv

Selling Information Through Consulting

We consider a monopoly information holder selling information to a budget-constrained decision maker, who may benefit from the seller's information. The decision maker has a utility function that depends on his action and an uncertain state of the world. The seller and the buyer each observe a private signal regarding the state of the world, which may be correlated with each other. The seller's goal is to sell her private information to the buyer and extract maximum possible revenue, subject to the buyer's budget constraints. We consider three different settings with increasing generality, i.e., the seller's signal and the buyer's signal can be independent, correlated, or follow a general distribution accessed through a black-box sampling oracle. For each setting, we design information selling mechanisms which are both optimal and simple in the sense that they can be naturally interpreted, have succinct representations, and can be efficiently computed. Notably, though the optimal mechanism exhibits slightly increasing complexity as the setting becomes more general, all our mechanisms share the same format of acting as a consultant who recommends the best action to the buyer but uses different and carefully designed payment rules for different settings. Each of our optimal mechanisms can be easily computed by solving a single polynomial-size linear program. This significantly simplifies exponential-size LPs solved by the Ellipsoid method in the previous work, which computes the optimal mechanisms in the same setting but without budget limit. Such simplification is enabled by our new characterizations of the optimal mechanism in the (more realistic) budget-constrained setting.

preprint 2020 arXiv

Strategyproof Facility Location Mechanisms with Richer Action Spaces

We study facility location problems where agents control multiple locations and when reporting their locations can choose to hide some locations (hiding), report some locations more than once (replication) and lie about their locations (manipulation). We fully characterize all facility location mechanisms that are anonymous, efficient, and strategyproof with respect to the richer strategic behavior for this setting. We also provide a characterization with respect to manipulation only. This is the first, to the best of our knowledge, characterization result for the strategyproof facility location mechanisms where each agent controls multiple locations.

preprint 2020 arXiv

Surrogate Scoring Rules

Strictly proper scoring rules (SPSR) are incentive compatible for eliciting information about random variables from strategic agents when the principal can reward agents after the realization of the random variables. They also quantify the quality of elicited information, with more accurate predictions receiving higher scores in expectation. In this paper, we extend such scoring rules to settings where a principal elicits private probabilistic beliefs but only has access to agents' reports. We name our solution Surrogate Scoring Rules (SSR). SSR build on a bias correction step and an error rate estimation procedure for a reference answer defined using agents' reports. We show that, with a single bit of information about the prior distribution of the random variables, SSR in a multi-task setting recover SPSR in expectation, as if having access to the ground truth. Therefore, a salient feature of SSR is that they quantify the quality of information despite the lack of ground truth, just as SPSR do for the setting with ground truth. As a by-product, SSR induce dominant truthfulness in reporting. Our method is verified both theoretically and empirically using data collected from real human forecasters.
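The bias correction step can be illustrated with the standard unbiased-estimator construction from the learning-with-noisy-labels literature. The sketch below assumes a binary event, a quadratic score, and error rates e0 and e1 of the noisy reference answer that are already known; the paper estimates these rates from agents' reports, which is not shown here. All function names are hypothetical.

```python
def brier_reward(p, y):
    """Quadratic strictly proper score for a binary event, oriented so higher is better."""
    return 1.0 - (p - y) ** 2

def surrogate_score(p, z, e0, e1, score=brier_reward):
    """Bias-corrected score of forecast p against a noisy binary reference z.

    Assumed noise model: e0 = P(z = 1 | y = 0), e1 = P(z = 0 | y = 1),
    with e0 + e1 < 1 so the reference is informative.
    """
    if not (0 <= e0 and 0 <= e1 and e0 + e1 < 1):
        raise ValueError("need e0, e1 >= 0 and e0 + e1 < 1")
    flip_z = e1 if z == 1 else e0        # error rate attached to class z
    flip_other = e0 if z == 1 else e1    # error rate attached to the other class
    corrected = (1 - flip_other) * score(p, z) - flip_z * score(p, 1 - z)
    return corrected / (1 - e0 - e1)

def expected_surrogate(p, y, e0, e1):
    """Expectation of the surrogate score over the noisy reference, given true outcome y."""
    prob_z1 = (1 - e1) if y == 1 else e0
    return (prob_z1 * surrogate_score(p, 1, e0, e1)
            + (1 - prob_z1) * surrogate_score(p, 0, e0, e1))

# In expectation the surrogate score matches the score against the true outcome.
print(expected_surrogate(0.7, 1, 0.2, 0.3), brier_reward(0.7, 1))
```

Because the correction is linear in the underlying score, the same construction applies to any other scoring function passed as `score`.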

preprint 2016 arXiv

Group buying with bundle discounts: computing efficient, stable and fair solutions

We model a market in which nonstrategic vendors sell items of different types and offer bundles at discounted prices triggered by demand volumes. Each buyer acts strategically in order to maximize her utility, given by the difference between product valuation and price paid. Buyers report their valuations in terms of reserve prices on sets of items, and might be willing to pay prices different than the market price in order to subsidize other buyers and to trigger discounts. The resulting price discrimination can be interpreted as a redistribution of the total discount. We consider a notion of stability that looks at unilateral deviations, and show that efficient allocations - the ones maximizing the social welfare - can be stabilized by prices that enjoy desirable properties of rationality and fairness. These dictate that buyers pay higher prices only to subsidize others who contribute to the activation of the desired discounts, and that they pay premiums over the discounted price proportionally to their surplus - the difference between their current utility and the utility of their best alternative. Therefore, the resulting price discrimination appears to be desirable to buyers. Building on this existence result, and letting N, M and c be the numbers of buyers, vendors and product types, we propose an O(N^2+NM^c) algorithm that, given an efficient allocation, computes prices that are rational and fair and that stabilize the market. The algorithm first determines the redistribution of the discount between groups of buyers with an equal product choice, and then computes single buyers' prices. Our results show that if a desirable form of price discrimination is implemented then social efficiency and stability can coexist in a market presenting subtle externalities, and computing individual prices from market prices is tractable.

preprint 2016 arXiv

Learning to Incentivize: Eliciting Effort via Output Agreement

In crowdsourcing, when there is a lack of verification for contributed answers, output agreement mechanisms are often used to incentivize participants to provide truthful answers when the correct answer is held by the majority. In this paper, we focus on using output agreement mechanisms to elicit effort, in addition to eliciting truthful answers, from a population of workers. We consider a setting where workers have heterogeneous cost of effort exertion and examine the data requester's problem of deciding the reward level in output agreement for optimal elicitation. In particular, when the requester knows the cost distribution, we derive the optimal reward level for output agreement mechanisms. This is achieved by first characterizing Bayesian Nash equilibria of output agreement mechanisms for a given reward level. When the requester does not know the cost distribution, we develop sequential mechanisms that combine learning the cost distribution with incentivizing effort exertion to approximately determine the optimal reward level.

preprint 2016 arXiv

Sequential Peer Prediction: Learning to Elicit Effort using Posted Prices

Peer prediction mechanisms are often adopted to elicit truthful contributions from crowd workers when no ground-truth verification is available. Recently, mechanisms of this type have been developed to incentivize effort exertion, in addition to truthful elicitation. In this paper, we study a sequential peer prediction problem where a data requester wants to dynamically determine the reward level to optimize the trade-off between the quality of information elicited from workers and the total expected payment. In this problem, workers have homogeneous expertise and heterogeneous cost for exerting effort, both unknown to the requester. We propose a sequential posted-price mechanism to dynamically learn the optimal reward level from workers' contributions and to incentivize effort exertion and truthful reporting. We show that (1) in our mechanism, workers exerting effort according to a non-degenerate threshold policy and then reporting truthfully is an equilibrium that returns the highest utility for each worker, and (2) the regret of our learning mechanism w.r.t. offering the optimal reward (price) is upper bounded by $\tilde{O}(T^{3/4})$ where $T$ is the learning horizon. We further show the power of our learning approach when the reports of workers do not necessarily follow the game-theoretic equilibrium.

preprint 2015 arXiv

Low-Cost Learning via Active Data Procurement

We design mechanisms for online procurement of data held by strategic agents for machine learning tasks. The challenge is to use past data to actively price future data and give learning guarantees even when an agent's cost for revealing her data may depend arbitrarily on the data itself. We achieve this goal by showing how to convert a large class of no-regret algorithms into online posted-price and learning mechanisms. Our results in a sense parallel classic sample complexity guarantees, but with the key resource being money rather than quantity of data: With a budget constraint $B$, we give robust risk (predictive error) bounds on the order of $1/\sqrt{B}$. Because we use an active approach, we can often guarantee to do significantly better by leveraging correlations between costs and data. Our algorithms and analysis go through a model of no-regret learning with $T$ arriving pairs (cost, data) and a budget constraint of $B$. Our regret bounds for this model are on the order of $T/\sqrt{B}$ and we give lower bounds on the same order.

preprint 2014 arXiv

Capturing Variation and Uncertainty in Human Judgment

The well-studied problem of statistical rank aggregation has been applied to comparing sports teams, information retrieval, and most recently to data generated by human judgment. Such human-generated rankings may be substantially different from traditional statistical ranking data. In this work, we show that a recently proposed generalized random utility model reveals distinctive patterns in human judgment across three different domains, and provides a succinct representation of variance in both population preferences and imperfect perception. In contrast, we also show that classical statistical ranking models fail to capture important features from human-generated input. Our work motivates the use of more flexible ranking models for representing and describing the collective preferences or decision-making of human participants.

preprint 2014 arXiv

Elicitation for Aggregation

We study the problem of eliciting and aggregating probabilistic information from multiple agents. In order to successfully aggregate the predictions of agents, the principal needs to elicit some notion of confidence from agents, capturing how much experience or knowledge led to their predictions. To formalize this, we consider a principal who wishes to elicit predictions about a random variable from a group of Bayesian agents, each of whom has privately observed some independent samples of the random variable, and hopes to aggregate the predictions as if she had directly observed the samples of all agents. Leveraging techniques from Bayesian statistics, we represent confidence as the number of samples an agent has observed, which is quantified by a hyperparameter from a conjugate family of prior distributions. This then allows us to show that if the principal has access to a few samples, she can achieve her aggregation goal by eliciting predictions from agents using proper scoring rules. In particular, if she has access to one sample, she can successfully aggregate the agents' predictions if and only if every posterior predictive distribution corresponds to a unique value of the hyperparameter. Furthermore, this uniqueness holds for many common distributions of interest. When this uniqueness property does not hold, we construct a novel and intuitive mechanism where a principal with two samples can elicit and optimally aggregate the agents' predictions.
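The aggregation target described here, combining predictions as if the principal had seen every agent's samples, can be made concrete for one conjugate family. The sketch below assumes Bernoulli observations with a shared Beta(a, b) prior and assumes each agent's sample count is already known to the principal. It illustrates only the pooling arithmetic, not the paper's elicitation mechanism, and all names are hypothetical.

```python
def posterior_predictive(a, b, n, k):
    """P(next observation = 1) after k successes in n Bernoulli samples under a Beta(a, b) prior."""
    return (a + k) / (a + b + n)

def implied_successes(a, b, n, prediction):
    """Invert the posterior predictive: recover the success count from a prediction and sample count n."""
    return prediction * (a + b + n) - a

def pooled_prediction(a, b, reports):
    """Aggregate (prediction, n) reports as if all samples had been observed directly."""
    total_n = sum(n for _, n in reports)
    total_k = sum(implied_successes(a, b, n, p) for p, n in reports)
    return posterior_predictive(a, b, total_n, total_k)

# Two hypothetical agents under a uniform Beta(1, 1) prior.
a, b = 1.0, 1.0
agent_1 = (posterior_predictive(a, b, 4, 3), 4)    # 3 successes in 4 samples
agent_2 = (posterior_predictive(a, b, 10, 4), 10)  # 4 successes in 10 samples

pooled = pooled_prediction(a, b, [agent_1, agent_2])
print(pooled)  # equals the prediction from 7 successes in 14 pooled samples
```

A plain average of the two predictions would ignore that the second agent saw more data, which is why the confidence term matters for aggregation.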

preprint 2014 arXiv

Privacy Games

The problem of analyzing the effect of privacy concerns on the behavior of selfish utility-maximizing agents has received much attention lately. Privacy concerns are often modeled by altering the utility functions of agents to consider also their privacy loss. Such privacy aware agents prefer to take a randomized strategy even in very simple games in which non-privacy aware agents play pure strategies. In some cases, the behavior of privacy aware agents follows the framework of Randomized Response, a well-known mechanism that preserves differential privacy. Our work is aimed at better understanding the behavior of agents in settings where their privacy concerns are explicitly given. We consider a toy setting where agent A, in an attempt to discover the secret type of agent B, offers B a gift that one type of B agent likes and the other type dislikes. As opposed to previous works, B's incentive to keep her type a secret isn't the result of "hardwiring" B's utility function to consider privacy, but rather takes the form of a payment between B and A. We investigate three different types of payment functions and analyze B's behavior in each of the resulting games. As we show, under some payments, B's behavior is very different than the behavior of agents with hardwired privacy concerns and might even be deterministic. Under a different payment we show that B's BNE strategy does fall into the framework of Randomized Response.
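For readers unfamiliar with Randomized Response, a minimal sketch of the classic binary version follows: the agent reports her true bit with probability p_truth and the flipped bit otherwise. This is the textbook mechanism, not the equilibrium strategy derived in the paper; the function names and the value 0.75 are illustrative assumptions.

```python
import math
import random

def randomized_response(true_bit, p_truth, rng=random):
    """Report the true bit with probability p_truth, otherwise report its complement."""
    return true_bit if rng.random() < p_truth else 1 - true_bit

def privacy_epsilon(p_truth):
    """Differential privacy parameter of binary randomized response, for 0.5 < p_truth < 1."""
    if not 0.5 < p_truth < 1:
        raise ValueError("p_truth must lie strictly between 0.5 and 1")
    return math.log(p_truth / (1 - p_truth))

# Simulate many reports from an agent whose secret type is 1.
rng = random.Random(0)
reports = [randomized_response(1, 0.75, rng) for _ in range(10_000)]
truthful_fraction = sum(reports) / len(reports)

print(truthful_fraction)      # close to 0.75
print(privacy_epsilon(0.75))  # ln(3)
```

The closer p_truth is to 0.5, the smaller epsilon becomes and the less any single report reveals about the agent's type.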

preprint 2012 arXiv

A Utility Framework for Bounded-Loss Market Makers

We introduce a class of utility-based market makers that always accept orders at their risk-neutral prices. We derive necessary and sufficient conditions for such market makers to have bounded loss. We prove that hyperbolic absolute risk aversion utility market makers are equivalent to weighted pseudospherical scoring rule market makers. In particular, Hanson's logarithmic scoring rule market maker corresponds to a negative exponential utility market maker in our framework. We describe a third equivalent formulation based on maintaining a cost function that seems most natural for implementation purposes, and we illustrate how to translate among the three equivalent formulations. We examine the tradeoff between the market's liquidity and the market maker's worst-case loss. For a fixed bound on worst-case loss, some market makers exhibit greater liquidity near uniform prices and some exhibit greater liquidity near extreme prices, but no market maker can exhibit uniformly greater liquidity in all regimes. For a fixed minimum liquidity level, we give the lower bound of market maker's worst-case loss under some regularity conditions.

preprint 2012 arXiv

Designing Informative Securities

We create a formal framework for the design of informative securities in prediction markets. These securities allow a market organizer to infer the likelihood of events of interest as well as if he knew all of the traders' private signals. We consider the design of markets that are always informative, markets that are informative for a particular signal structure of the participants, and informative markets constructed from a restricted selection of securities. We find that to achieve informativeness, it can be necessary to allow participants to express information that may not be directly of interest to the market organizer, and that understanding the participants' signal structure is important for designing informative prediction markets.

preprint 2012 arXiv

Truthful Mechanisms for Agents that Value Privacy

Recent work has constructed economic mechanisms that are both truthful and differentially private. In these mechanisms, privacy is treated separately from the truthfulness; it is not incorporated in players' utility functions (and doing so has been shown to lead to non-truthfulness in some cases). In this work, we propose a new, general way of modelling privacy in players' utility functions. Specifically, we only assume that if an outcome $o$ has the property that any report of player $i$ would have led to $o$ with approximately the same probability, then $o$ has small privacy cost to player $i$. We give three mechanisms that are truthful with respect to our modelling of privacy: for an election between two candidates, for a discrete version of the facility location problem, and for a general social choice problem with discrete utilities (via a VCG-like mechanism). As the number $n$ of players increases, the social welfare achieved by our mechanisms approaches optimal (as a fraction of $n$).

preprint 2010 arXiv

A New Understanding of Prediction Markets Via No-Regret Learning

We explore the striking mathematical connections that exist between market scoring rules, cost function based prediction markets, and no-regret learning. We show that any cost function based prediction market can be interpreted as an algorithm for the commonly studied problem of learning from expert advice by equating trades made in the market with losses observed by the learning algorithm. If the loss of the market organizer is bounded, this bound can be used to derive an O(sqrt(T)) regret bound for the corresponding learning algorithm. We then show that the class of markets with convex cost functions exactly corresponds to the class of Follow the Regularized Leader learning algorithms, with the choice of a cost function in the market corresponding to the choice of a regularizer in the learning problem. Finally, we show an equivalence between market scoring rules and prediction markets with convex cost functions. This implies that market scoring rules can also be interpreted naturally as Follow the Regularized Leader algorithms, and may be of independent interest. These connections provide new insight into how it is that commonly studied markets, such as the Logarithmic Market Scoring Rule, can aggregate opinions into accurate estimates of the likelihood of future events.
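As a concrete instance of a cost function based market, the Logarithmic Market Scoring Rule can be sketched in a few lines. It uses the standard cost function C(q) = b log(sum_i exp(q_i / b)), whose gradient gives the prices; those prices have the same softmax form as the exponential-weights distribution from expert-advice learning. The liquidity parameter b = 100 and the trade sizes are illustrative assumptions.

```python
import math

def lmsr_cost(q, b):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b)), computed stably."""
    m = max(q)
    return m + b * math.log(sum(math.exp((qi - m) / b) for qi in q))

def lmsr_prices(q, b):
    """Instantaneous prices: the gradient of C, a softmax of outstanding shares q / b."""
    m = max(q)
    weights = [math.exp((qi - m) / b) for qi in q]
    total = sum(weights)
    return [w / total for w in weights]

def trade_cost(q, delta, b):
    """Amount a trader pays to move outstanding shares from q to q + delta."""
    new_q = [qi + di for qi, di in zip(q, delta)]
    return lmsr_cost(new_q, b) - lmsr_cost(q, b)

def worst_case_loss(num_outcomes, b):
    """Market maker's maximum loss when starting from zero outstanding shares."""
    return b * math.log(num_outcomes)

b = 100.0
q = [0.0, 0.0]                         # two outcomes, no shares sold yet
print(lmsr_prices(q, b))               # uniform starting prices
cost = trade_cost(q, [10.0, 0.0], b)   # buy 10 shares of outcome 0
print(cost, lmsr_prices([10.0, 0.0], b))
print(worst_case_loss(2, b))
```

A larger b makes prices move more slowly per share traded, at the cost of a larger worst-case loss for the market maker.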

preprint 2010 arXiv

An Optimization-Based Framework for Automated Market-Making

Building on ideas from online convex optimization, we propose a general framework for the design of efficient securities markets over very large outcome spaces. The challenge here is computational. In a complete market, in which one security is offered for each outcome, the market institution cannot efficiently keep track of the transaction history or calculate security prices when the outcome space is large. The natural solution is to restrict the space of securities to be much smaller than the outcome space in such a way that securities can be priced efficiently. Recent research has focused on searching for spaces of securities that can be priced efficiently by existing mechanisms designed for complete markets. While there have been some successes, much of this research has led to hardness results. In this paper, we take a drastically different approach. We start with an arbitrary space of securities with bounded payoff, and establish a framework to design markets tailored to this space. We prove that any market satisfying a set of intuitive conditions must price securities via a convex potential function and that the space of reachable prices must be precisely the convex hull of the security payoffs. We then show how the convex potential function can be defined in terms of an optimization over the convex hull of the security payoffs. The optimal solution to the optimization problem gives the security prices. Using this framework, we provide an efficient market for predicting the landing location of an object on a sphere. In addition, we show that we can relax our "no-arbitrage" condition to design a new efficient market maker for pair betting, which is known to be #P-hard to price using existing mechanisms. This relaxation also allows the market maker to charge transaction fees so that the depth of the market can be dynamically increased as the number of trades increases.
