Source author record

Jennifer Wortman Vaughan

Jennifer Wortman Vaughan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Science and Game Theory Artificial Intelligence Machine Learning Human-Computer Interaction cs.CY Data Structures and Algorithms q-fin.TR Social and Information Networks

Catalog footprint

What is connected

18works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Assessing the Fairness of AI Systems: AI Practitioners' Processes, Challenges, and Needs for Support

Various tools and practices have been developed to support practitioners in identifying, assessing, and mitigating fairness-related harms caused by AI systems. However, prior research has highlighted gaps between the intended design of these tools and practices and their use within particular contexts, including gaps caused by the role that organizational factors play in shaping fairness work. In this paper, we investigate these gaps for one such practice: disaggregated evaluations of AI systems, intended to uncover performance disparities between demographic groups. By conducting semi-structured interviews and structured workshops with thirty-three AI practitioners from ten teams at three technology companies, we identify practitioners' processes, challenges, and needs for support when designing disaggregated evaluations. We find that practitioners face challenges when choosing performance metrics, identifying the most relevant direct stakeholders and demographic groups on which to focus, and collecting datasets with which to conduct disaggregated evaluations. More generally, we identify impacts on fairness work stemming from a lack of engagement with direct stakeholders or domain experts, business imperatives that prioritize customers over marginalized groups, and the drive to deploy AI systems at scale.

preprint2022arXiv

Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values

Machine learning (ML) interpretability techniques can reveal undesirable patterns in data that models exploit to make predictions--potentially causing harms once deployed. However, how to take action to address these patterns is not always clear. In a collaboration between ML and human-computer interaction researchers, physicians, and data scientists, we develop GAM Changer, the first interactive system to help domain experts and data scientists easily and responsibly edit Generalized Additive Models (GAMs) and fix problematic patterns. With novel interaction techniques, our tool puts interpretability into action--empowering users to analyze, validate, and align model behaviors with their knowledge and values. Physicians have started to use our tool to investigate and fix pneumonia and sepsis risk prediction models, and an evaluation with 7 data scientists working in diverse domains highlights that our tool is easy to use, meets their model editing needs, and fits into their current workflows. Built with modern web technologies, our tool runs locally in users' web browsers or computational notebooks, lowering the barrier to use. GAM Changer is available at the following public demo link: https://interpret.ml/gam-changer.

preprint2022arXiv

Interpretable Distribution Shift Detection using Optimal Transport

We propose a method to identify and characterize distribution shifts in classification datasets based on optimal transport. It allows the user to identify the extent to which each class is affected by the shift, and retrieves corresponding pairs of samples to provide insights on its nature. We illustrate its use on synthetic and natural shift examples. While the results we present are preliminary, we hope that this inspires future work on interpretable methods for analyzing distribution shifts.

preprint2022arXiv

REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research

Transparency around limitations can improve the scientific rigor of research, help ensure appropriate interpretation of research findings, and make research claims more credible. Despite these benefits, the machine learning (ML) research community lacks well-developed norms around disclosing and discussing limitations. To address this gap, we conduct an iterative design process with 30 ML and ML-adjacent researchers to develop and test REAL ML, a set of guided activities to help ML researchers recognize, explore, and articulate the limitations of their research. Using a three-stage interview and survey study, we identify ML researchers' perceptions of limitations, as well as the challenges they face when recognizing, exploring, and articulating limitations. We develop REAL ML to address some of these practical challenges, and highlight additional cultural challenges that will require broader shifts in community norms to address. We hope our study and REAL ML help move the ML research community toward more active and appropriate engagement with limitations.

preprint2022arXiv

Truthful Aggregation of Budget Proposals

We consider a participatory budgeting problem in which each voter submits a proposal for how to divide a single divisible resource (such as money or time) among several possible alternatives (such as public projects or activities) and these proposals must be aggregated into a single aggregate division. Under $\ell_1$ preferences -- for which a voter's disutility is given by the $\ell_1$ distance between the aggregate division and the division he or she most prefers -- the social welfare-maximizing mechanism, which minimizes the average $\ell_1$ distance between the outcome and each voter's proposal, is incentive compatible (Goel et al. 2016). However, it fails to satisfy the natural fairness notion of proportionality, placing too much weight on majority preferences. Leveraging a connection between market prices and the generalized median rules of Moulin (1980), we introduce the independent markets mechanism, which is both incentive compatible and proportional. We unify the social welfare-maximizing mechanism and the independent markets mechanism by defining a broad class of moving phantom mechanisms that includes both. We show that every moving phantom mechanism is incentive compatible. Finally, we characterize the social welfare-maximizing mechanism as the unique Pareto-optimal mechanism in this class, suggesting an inherent tradeoff between Pareto optimality and proportionality.

preprint2022arXiv

Understanding Machine Learning Practitioners' Data Documentation Perceptions, Needs, Challenges, and Desiderata

Data is central to the development and evaluation of machine learning (ML) models. However, the use of problematic or inappropriate datasets can result in harms when the resulting models are deployed. To encourage responsible AI practice through more deliberate reflection on datasets and transparency around the processes by which they are created, researchers and practitioners have begun to advocate for increased data documentation and have proposed several data documentation frameworks. However, there is little research on whether these data documentation frameworks meet the needs of ML practitioners, who both create and consume datasets. To address this gap, we set out to understand ML practitioners' data documentation perceptions, needs, challenges, and desiderata, with the goal of deriving design requirements that can inform future data documentation frameworks. We conducted a series of semi-structured interviews with 14 ML practitioners at a single large, international technology company. We had them answer a list of questions taken from datasheets for datasets (Gebru, 2021). Our findings show that current approaches to data documentation are largely ad hoc and myopic in nature. Participants expressed needs for data documentation frameworks to be adaptable to their contexts, integrated into their existing tools and workflows, and automated wherever possible. Despite the fact that data documentation frameworks are often motivated from the perspective of responsible AI, participants did not make the connection between the questions that they were asked to answer and their responsible AI implications. In addition, participants often had difficulties prioritizing the needs of dataset consumers and providing information that someone unfamiliar with their datasets might need to know. Based on these findings, we derive seven design requirements for future data documentation frameworks.

preprint2020arXiv

Mathematical Foundations for Social Computing

Social computing encompasses the mechanisms through which people interact with computational systems: crowdsourcing systems, ranking and recommendation systems, online prediction markets, citizen science projects, and collaboratively edited wikis, to name a few. These systems share the common feature that humans are active participants, making choices that determine the input to, and therefore the output of, the system. The output of these systems can be viewed as a joint computation between machine and human, and can be richer than what either could produce alone. The term social computing is often used as a synonym for several related areas, such as "human computation" and subsets of "collective intelligence"; we use it in its broadest sense to encompass all of these things. Social computing is blossoming into a rich research area of its own, with contributions from diverse disciplines including computer science, economics, and other social sciences. Yet a broad mathematical foundation for social computing is yet to be established, with a plethora of under-explored opportunities for mathematical research to impact social computing. As in other fields, there is great potential for mathematical work to influence and shape the future of social computing. However, we are far from having the systematic and principled understanding of the advantages, limitations, and potentials of social computing required to match the impact on applications that has occurred in other fields. In June 2015, we brought together roughly 25 experts in related fields to discuss the promise and challenges of establishing mathematical foundations for social computing. This document captures several of the key ideas discussed.

preprint2020arXiv

No-Regret and Incentive-Compatible Online Learning

We study online learning settings in which experts act strategically to maximize their influence on the learning algorithm's predictions by potentially misreporting their beliefs about a sequence of binary events. Our goal is twofold. First, we want the learning algorithm to be no-regret with respect to the best fixed expert in hindsight. Second, we want incentive compatibility, a guarantee that each expert's best strategy is to report his true beliefs about the realization of each event. To achieve this goal, we build on the literature on wagering mechanisms, a type of multi-agent scoring rule. We provide algorithms that achieve no regret and incentive compatibility for myopic experts for both the full and partial information settings. In experiments on datasets from FiveThirtyEight, our algorithms have regret comparable to classic no-regret algorithms, which are not incentive-compatible. Finally, we identify an incentive-compatible algorithm for forward-looking strategic agents that exhibits diminishing regret in practice.

preprint2016arXiv

The Possibilities and Limitations of Private Prediction Markets

We consider the design of private prediction markets, financial markets designed to elicit predictions about uncertain events without revealing too much information about market participants' actions or beliefs. Our goal is to design market mechanisms in which participants' trades or wagers influence the market's behavior in a way that leads to accurate predictions, yet no single participant has too much influence over what others are able to observe. We study the possibilities and limitations of such mechanisms using tools from differential privacy. We begin by designing a private one-shot wagering mechanism in which bettors specify a belief about the likelihood of a future event and a corresponding monetary wager. Wagers are redistributed among bettors in a way that more highly rewards those with accurate predictions. We provide a class of wagering mechanisms that are guaranteed to satisfy truthfulness, budget balance in expectation, and other desirable properties while additionally guaranteeing epsilon-joint differential privacy in the bettors' reported beliefs, and analyze the trade-off between the achievable level of privacy and the sensitivity of a bettor's payment to her own report. We then ask whether it is possible to obtain privacy in dynamic prediction markets, focusing our attention on the popular cost-function framework in which securities with payments linked to future events are bought and sold by an automated market maker. We show that under general conditions, it is impossible for such a market maker to simultaneously achieve bounded worst-case loss and epsilon-differential privacy without allowing the privacy guarantee to degrade extremely quickly as the number of trades grows, making such markets impractical in settings in which privacy is valued. We conclude by suggesting several avenues for potentially circumventing this lower bound.

preprint2015arXiv

Adaptive Contract Design for Crowdsourcing Markets: Bandit Algorithms for Repeated Principal-Agent Problems

Crowdsourcing markets have emerged as a popular platform for matching available workers with tasks to complete. The payment for a particular task is typically set by the task's requester, and may be adjusted based on the quality of the completed work, for example, through the use of "bonus" payments. In this paper, we study the requester's problem of dynamically adjusting quality-contingent payments for tasks. We consider a multi-round version of the well-known principal-agent model, whereby in each round a worker makes a strategic choice of the effort level which is not directly observable by the requester. In particular, our formulation significantly generalizes the budget-free online task pricing problems studied in prior work. We treat this problem as a multi-armed bandit problem, with each "arm" representing a potential contract. To cope with the large (and in fact, infinite) number of arms, we propose a new algorithm, AgnosticZooming, which discretizes the contract space into a finite number of regions, effectively treating each region as a single arm. This discretization is adaptively refined, so that more promising regions of the contract space are eventually discretized more finely. We analyze this algorithm, showing that it achieves regret sublinear in the time horizon and substantially improves over non-adaptive discretization (which is the only competing approach in the literature). Our results advance the state of art on several different topics: the theory of crowdsourcing markets, principal-agent problems, multi-armed bandits, and dynamic pricing.

preprint2015arXiv

Incentivizing High Quality Crowdwork

We study the causal effects of financial incentives on the quality of crowdwork. We focus on performance-based payments (PBPs), bonus payments awarded to workers for producing high quality work. We design and run randomized behavioral experiments on the popular crowdsourcing platform Amazon Mechanical Turk with the goal of understanding when, where, and why PBPs help, identifying properties of the payment, payment structure, and the task itself that make them most effective. We provide examples of tasks for which PBPs do improve quality. For such tasks, the effectiveness of PBPs is not too sensitive to the threshold for quality required to receive the bonus, while the magnitude of the bonus must be large enough to make the reward salient. We also present examples of tasks for which PBPs do not improve quality. Our results suggest that for PBPs to improve quality, the task must be effort-responsive: the task must allow workers to produce higher quality work by exerting more effort. We also give a simple method to determine if a task is effort-responsive a priori. Furthermore, our experiments suggest that all payments on Mechanical Turk are, to some degree, implicitly performance-based in that workers believe their work may be rejected if their performance is sufficiently poor. Finally, we propose a new model of worker behavior that extends the standard principal-agent model from economics to include a worker's subjective beliefs about his likelihood of being paid, and show that the predictions of this model are in line with our experimental findings. This model may be useful as a foundation for theoretical studies of incentives in crowdsourcing markets.

preprint2014arXiv

Market Making with Decreasing Utility for Information

We study information elicitation in cost-function-based combinatorial prediction markets when the market maker's utility for information decreases over time. In the sudden revelation setting, it is known that some piece of information will be revealed to traders, and the market maker wishes to prevent guaranteed profits for trading on the sure information. In the gradual decrease setting, the market maker's utility for (partial) information decreases continuously over time. We design adaptive cost functions for both settings which: (1) preserve the information previously gathered in the market; (2) eliminate (or diminish) rewards to traders for the publicly revealed information; (3) leave the reward structure unaffected for other information; and (4) maintain the market maker's worst-case loss. Our constructions utilize mixed Bregman divergence, which matches our notion of utility for information.

preprint2013arXiv

Online Decision Making in Crowdsourcing Markets: Theoretical Challenges (Position Paper)

Over the past decade, crowdsourcing has emerged as a cheap and efficient method of obtaining solutions to simple tasks that are difficult for computers to solve but possible for humans. The popularity and promise of crowdsourcing markets has led to both empirical and theoretical research on the design of algorithms to optimize various aspects of these markets, such as the pricing and assignment of tasks. Much of the existing theoretical work on crowdsourcing markets has focused on problems that fall into the broad category of online decision making; task requesters or the crowdsourcing platform itself make repeated decisions about prices to set, workers to filter out, problems to assign to specific workers, or other things. Often these decisions are complex, requiring algorithms that learn about the distribution of available tasks or workers over time and take into account the strategic (or sometimes irrational) behavior of workers. As human computation grows into its own field, the time is ripe to address these challenges in a principled way. However, it appears very difficult to capture all pertinent aspects of crowdsourcing markets in a single coherent model. In this paper, we reflect on the modeling issues that inhibit theoretical research on online decision making for crowdsourcing, and identify some steps forward. This paper grew out of the authors' own frustration with these issues, and we hope it will encourage the community to attempt to understand, debate, and ultimately address them. The authors welcome feedback for future revisions of this paper.

preprint2012arXiv

Censored Exploration and the Dark Pool Problem

We introduce and analyze a natural algorithm for multi-venue exploration from censored data, which is motivated by the Dark Pool Problem of modern quantitative finance. We prove that our algorithm converges in polynomial time to a near-optimal allocation policy; prior results for similar problems in stochastic inventory control guaranteed only asymptotic convergence and examined variants in which each venue could be treated independently. Our analysis bears a strong resemblance to that of efficient exploration/ exploitation schemes in the reinforcement learning literature. We describe an extensive experimental evaluation of our algorithm on the Dark Pool Problem using real trading data.

preprint2012arXiv

Designing Informative Securities

We create a formal framework for the design of informative securities in prediction markets. These securities allow a market organizer to infer the likelihood of events of interest as well as if he knew all of the traders' private signals. We consider the design of markets that are always informative, markets that are informative for a particular signal structure of the participants, and informative markets constructed from a restricted selection of securities. We find that to achieve informativeness, it can be necessary to allow participants to express information that may not be directly of interest to the market organizer, and that understanding the participants' signal structure is important for designing informative prediction markets.

preprint2010arXiv

A New Understanding of Prediction Markets Via No-Regret Learning

We explore the striking mathematical connections that exist between market scoring rules, cost function based prediction markets, and no-regret learning. We show that any cost function based prediction market can be interpreted as an algorithm for the commonly studied problem of learning from expert advice by equating trades made in the market with losses observed by the learning algorithm. If the loss of the market organizer is bounded, this bound can be used to derive an O(sqrt(T)) regret bound for the corresponding learning algorithm. We then show that the class of markets with convex cost functions exactly corresponds to the class of Follow the Regularized Leader learning algorithms, with the choice of a cost function in the market corresponding to the choice of a regularizer in the learning problem. Finally, we show an equivalence between market scoring rules and prediction markets with convex cost functions. This implies that market scoring rules can also be interpreted naturally as Follow the Regularized Leader algorithms, and may be of independent interest. These connections provide new insight into how it is that commonly studied markets, such as the Logarithmic Market Scoring Rule, can aggregate opinions into accurate estimates of the likelihood of future events.

preprint2010arXiv

An Optimization-Based Framework for Automated Market-Making

Building on ideas from online convex optimization, we propose a general framework for the design of efficient securities markets over very large outcome spaces. The challenge here is computational. In a complete market, in which one security is offered for each outcome, the market institution can not efficiently keep track of the transaction history or calculate security prices when the outcome space is large. The natural solution is to restrict the space of securities to be much smaller than the outcome space in such a way that securities can be priced efficiently. Recent research has focused on searching for spaces of securities that can be priced efficiently by existing mechanisms designed for complete markets. While there have been some successes, much of this research has led to hardness results. In this paper, we take a drastically different approach. We start with an arbitrary space of securities with bounded payoff, and establish a framework to design markets tailored to this space. We prove that any market satisfying a set of intuitive conditions must price securities via a convex potential function and that the space of reachable prices must be precisely the convex hull of the security payoffs. We then show how the convex potential function can be defined in terms of an optimization over the convex hull of the security payoffs. The optimal solution to the optimization problem gives the security prices. Using this framework, we provide an efficient market for predicting the landing location of an object on a sphere. In addition, we show that we can relax our "no-arbitrage" condition to design a new efficient market maker for pair betting, which is known to be #P-hard to price using existing mechanisms. This relaxation also allows the market maker to charge transaction fees so that the depth of the market can be dynamically increased as the number of trades increases.

preprint2010arXiv

Evolution with Drifting Targets

We consider the question of the stability of evolutionary algorithms to gradual changes, or drift, in the target concept. We define an algorithm to be resistant to drift if, for some inverse polynomial drift rate in the target function, it converges to accuracy 1 -- ε, with polynomial resources, and then stays within that accuracy indefinitely, except with probability ε, at any one time. We show that every evolution algorithm, in the sense of Valiant (2007; 2009), can be converted using the Correlational Query technique of Feldman (2008), into such a drift resistant algorithm. For certain evolutionary algorithms, such as for Boolean conjunctions, we give bounds on the rates of drift that they can resist. We develop some new evolution algorithms that are resistant to significant drift. In particular, we give an algorithm for evolving linear separators over the spherically symmetric distribution that is resistant to a drift rate of O(ε/n), and another algorithm over the more general product normal distributions that resists a smaller drift rate. The above translation result can be also interpreted as one on the robustness of the notion of evolvability itself under changes of definition. As a second result in that direction we show that every evolution algorithm can be converted to a quasi-monotonic one that can evolve from any starting point without the performance ever dipping significantly below that of the starting point. This permits the somewhat unnatural feature of arbitrary performance degradations to be removed from several known robustness translations.

Jennifer Wortman Vaughan

What is connected

Connect this record

See the researcher in context

Building this map preview

18 published item(s)

Assessing the Fairness of AI Systems: AI Practitioners' Processes, Challenges, and Needs for Support

Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values

Interpretable Distribution Shift Detection using Optimal Transport

REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research

Truthful Aggregation of Budget Proposals

Understanding Machine Learning Practitioners' Data Documentation Perceptions, Needs, Challenges, and Desiderata

Mathematical Foundations for Social Computing

No-Regret and Incentive-Compatible Online Learning

The Possibilities and Limitations of Private Prediction Markets

Adaptive Contract Design for Crowdsourcing Markets: Bandit Algorithms for Repeated Principal-Agent Problems

Incentivizing High Quality Crowdwork

Market Making with Decreasing Utility for Information

Online Decision Making in Crowdsourcing Markets: Theoretical Challenges (Position Paper)

Censored Exploration and the Dark Pool Problem

Designing Informative Securities

A New Understanding of Prediction Markets Via No-Regret Learning

An Optimization-Based Framework for Automated Market-Making

Evolution with Drifting Targets