Source author record

Eric Horvitz

Eric Horvitz appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Artificial Intelligence; cs.CY; Machine Learning; Human-Computer Interaction; Information Retrieval; Computation and Language; Cryptography and Security; Social and Information Networks; Digital Libraries; physics.soc-ph; Computational Engineering, Finance, and Science; Computer Vision; eess.SY; Multimedia; Neurons and Cognition; Software Engineering; Systems and Control

Catalog footprint

What is connected

30 works

17 topics

4 close collaborators

preprint | 2026 | arXiv

Can Revealed Preferences Clarify LLM Alignment and Steering?

LLMs are increasingly used to make or support high-stakes decisions under uncertainty, where alignment depends not only on factual accuracy but on how models weigh tradeoffs between different outcomes. We present an empirical pipeline for estimating the implied preferences that an LLM's observed choices optimize: we elicit the model's probability distribution over unknowns along with the choice it would make for the decision task and then fit a discrete choice model to recover the cost function that best rationalizes the model's decisions. We show how this revealed-preference description allows rigorous evaluation of whether models behave in a consistently goal-directed way, whether they can verbalize a description of their objectives which matches their revealed decision policy, and whether prompting can reliably steer those policies to implement a user-specified cost function. We apply this evaluation across four medical diagnosis domains and multiple frontier and open-source models. We find that while many models have a nontrivial degree of internal coherence, they also have significant weaknesses in faithfully reporting or adopting preferences in response to user direction.

preprint | 2022 | arXiv

A Search Engine for Discovery of Scientific Challenges and Directions

Keeping track of scientific challenges, advances and emerging directions is a fundamental part of research. However, researchers face a flood of papers that hinders discovery of important knowledge. In biomedicine, this directly impacts human lives. To address this problem, we present a novel task of extraction and search of scientific challenges and directions, to facilitate rapid knowledge discovery. We construct and release an expert-annotated corpus of texts sampled from full-length papers, labeled with novel semantic categories that generalize across many types of challenges and directions. We focus on a large corpus of interdisciplinary work relating to the COVID-19 pandemic, ranging from biomedicine to areas such as AI and economics. We apply a model trained on our data to identify challenges and directions across the corpus and build a dedicated search engine. In experiments with 19 researchers and clinicians using our system, we outperform a popular scientific search engine in assisting knowledge discovery. Finally, we show that models trained on our resource generalize to the wider biomedical domain and to AI papers, highlighting its broad utility. We make our data, model and search engine publicly available. https://challenges.apps.allenai.org/

preprint | 2022 | arXiv

Bursting Scientific Filter Bubbles: Boosting Innovation via Novel Author Discovery

Isolated silos of scientific research and the growing challenge of information overload limit awareness across the literature and hinder innovation. Algorithmic curation and recommendation, which often prioritize relevance, can further reinforce these informational "filter bubbles." In response, we describe Bridger, a system for facilitating discovery of scholars and their work. We construct a faceted representation of authors with information gleaned from their papers and inferred author personas, and use it to develop an approach that locates commonalities and contrasts between scientists to balance relevance and novelty. In studies with computer science researchers, this approach helps users discover authors considered useful for generating novel research directions. We also demonstrate an approach for displaying information about authors, boosting the ability to understand the work of new, unfamiliar scholars. Our analysis reveals that Bridger connects authors who have different citation profiles and publish in different venues, raising the prospect of bridging diverse scientific communities.

preprint | 2022 | arXiv

Imagined versus Remembered Stories: Quantifying Differences in Narrative Flow

Lifelong experiences and learned knowledge lead to shared expectations about how common situations tend to unfold. Such knowledge of narrative event flow enables people to weave together a story. However, comparable computational tools to evaluate the flow of events in narratives are limited. We quantify the differences between autobiographical and imagined stories by introducing sequentiality, a measure of narrative flow of events, drawing probabilistic inferences from a cutting-edge large language model (GPT-3). Sequentiality captures the flow of a narrative by comparing the probability of a sentence with and without its preceding story context. We applied our measure to study thousands of diary-like stories, collected from crowdworkers about either a recent remembered experience or an imagined story on the same topic. The results show that imagined stories have higher sequentiality than autobiographical stories and that the sequentiality of autobiographical stories increases when the memories are retold several months later. In pursuit of deeper understandings of how sequentiality measures the flow of narratives, we explore proportions of major and minor events in story sentences, as annotated by crowdworkers. We find that lower sequentiality is associated with higher proportions of major events. The methods and results highlight opportunities to use cutting-edge computational analyses, such as sequentiality, on large corpora of matched imagined and autobiographical stories to investigate the influences of memory and reasoning on language generation processes.
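As a rough illustration of the measure described above, the sketch below scores a sentence by how much more likely it becomes once the preceding story context is supplied, normalized by sentence length. The function names and the per-token normalization are assumptions made for this example, and the log-probabilities are taken as given inputs rather than queried from any particular language model.

```python
def sentence_sequentiality(logp_without_context, logp_with_context, num_tokens):
    """Per-token gain in log-probability from adding the preceding story context.

    logp_without_context: log p(sentence | topic only)         (assumed input)
    logp_with_context:    log p(sentence | topic + preceding)  (assumed input)
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return (logp_with_context - logp_without_context) / num_tokens


def story_sequentiality(sentences):
    """Average the per-sentence scores over (logp_without, logp_with, num_tokens) triples."""
    scores = [sentence_sequentiality(a, b, n) for a, b, n in sentences]
    return sum(scores) / len(scores)
```

Under this sketch, a sentence that the context makes much more predictable gets a high score, and a sentence whose probability is unchanged by the context scores zero.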

preprint | 2022 | arXiv

Who Goes First? Influences of Human-AI Workflow on Decision Making in Clinical Imaging

Details of the designs and mechanisms in support of human-AI collaboration must be considered in the real-world fielding of AI technologies. A critical aspect of interaction design for AI-assisted human decision making is the set of policies about the display and sequencing of AI inferences within larger decision-making workflows. We have a poor understanding of the influences of making AI inferences available before versus after human review of a diagnostic task at hand. We explore the effects of providing AI assistance at the start of a diagnostic session in radiology versus after the radiologist has made a provisional decision. We conducted a user study where 19 veterinary radiologists identified radiographic findings present in patients' X-ray images, with the aid of an AI tool. We employed two workflow configurations to analyze (i) anchoring effects, (ii) human-AI team diagnostic performance and agreement, (iii) time spent and confidence in decision making, and (iv) perceived usefulness of the AI. We found that participants who are asked to register provisional responses in advance of reviewing AI inferences are less likely to agree with the AI regardless of whether the advice is accurate and, in instances of disagreement with the AI, are less likely to seek the second opinion of a colleague. These participants also reported the AI advice to be less useful. Surprisingly, requiring provisional decisions on cases in advance of the display of AI inferences did not lengthen the time participants spent on the task. The study provides generalizable and actionable insights for the deployment of clinical AI tools in human-in-the-loop systems and introduces a methodology for studying alternative designs for human-AI collaboration. We make our experimental platform available as open source to facilitate future research on the influence of alternate designs on human-AI workflows.

preprint | 2021 | arXiv

Formation of Social Ties Influences Food Choice: A Campus-Wide Longitudinal Study

Nutrition is a key determinant of long-term health, and social influence has long been theorized to be a key determinant of nutrition. It has been difficult to quantify the postulated role of social influence on nutrition using traditional methods such as surveys, due to the typically small scale and short duration of studies. To overcome these limitations, we leverage a novel source of data: logs of 38 million food purchases made over an 8-year period on the Ecole Polytechnique Federale de Lausanne (EPFL) university campus, linked to anonymized individuals via the smartcards used to make on-campus purchases. In a longitudinal observational study, we ask: How is a person's food choice affected by eating with someone else whose own food choice is healthy vs. unhealthy? To estimate causal effects from the passively observed log data, we control confounds in a matched quasi-experimental design: we identify focal users who at first do not have any regular eating partners but then start eating with a fixed partner regularly, and we match focal users into comparison pairs such that paired users are nearly identical with respect to covariates measured before acquiring the partner, where the two focal users' new eating partners diverge in the healthiness of their respective food choice. A difference-in-differences analysis of the paired data yields clear evidence of social influence: focal users acquiring a healthy-eating partner change their habits significantly more toward healthy foods than focal users acquiring an unhealthy-eating partner. We further identify foods whose purchase frequency is impacted significantly by the eating partner's healthiness of food choice. Beyond the main results, the work demonstrates the utility of passively sensed food purchase logs for deriving insights, with the potential of informing the design of public health interventions and food offerings.
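The difference-in-differences comparison mentioned above can be sketched in a few lines. This is only an illustration of the general estimator: the function and argument names are hypothetical, the inputs stand for some per-user healthiness score before and after acquiring an eating partner, and the matching step described in the abstract is not reproduced.

```python
from statistics import mean


def diff_in_diff(healthy_pre, healthy_post, unhealthy_pre, unhealthy_post):
    """Change in the healthy-partner group minus change in the unhealthy-partner group.

    Each argument is a list of per-user scores (hypothetical units).
    """
    change_healthy = mean(healthy_post) - mean(healthy_pre)
    change_unhealthy = mean(unhealthy_post) - mean(unhealthy_pre)
    return change_healthy - change_unhealthy
```

A positive value would indicate that users who acquired a healthy-eating partner shifted further toward healthy purchases than their matched counterparts.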

preprint | 2021 | arXiv

Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork

AI practitioners typically strive to develop the most accurate systems, making an implicit assumption that the AI system will function autonomously. However, in practice, AI systems often are used to provide advice to people in domains ranging from criminal justice and finance to healthcare. In such AI-advised decision making, humans and machines form a team, where the human is responsible for making final decisions. But is the most accurate AI the best teammate? We argue "No" -- predictable performance may be worth a slight sacrifice in AI accuracy. Instead, we argue that AI systems should be trained in a human-centered manner, directly optimized for team performance. We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves. To optimize the team performance for this setting we maximize the team's expected utility, expressed in terms of the quality of the final decision, cost of verifying, and individual accuracies of people and machines. Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accurate AI may not lead to the highest team performance and show the benefit of modeling teamwork during training through improvements in expected team utility across datasets, considering parameters such as human skill and the cost of mistakes. We discuss the shortcoming of current optimization approaches beyond well-studied loss functions such as log-loss, and encourage future work on AI optimization problems motivated by human-AI collaboration.
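To make the accept-or-solve setting concrete, here is a minimal sketch of an expected team utility for a single decision. The functional form, the parameter names, and the choice to charge the effort cost only when the human solves the task are assumptions for illustration, not the formulation used in the paper.

```python
def expected_team_utility(p_accept, ai_accuracy, human_accuracy,
                          reward, mistake_cost, solve_cost):
    """Expected utility when the human accepts the AI with probability p_accept.

    All parameters are hypothetical: reward for a correct final decision,
    mistake_cost for a wrong one, solve_cost for the human's own effort.
    """
    # Utility if the human simply accepts the AI recommendation.
    accept = ai_accuracy * reward - (1 - ai_accuracy) * mistake_cost
    # Utility if the human solves the task themselves and pays the effort cost.
    solve = human_accuracy * reward - (1 - human_accuracy) * mistake_cost - solve_cost
    return p_accept * accept + (1 - p_accept) * solve
```

Varying `p_accept` with the AI's accuracy on different regions of the input space is one way to see why a slightly less accurate but more predictable model could yield higher team utility.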

preprint | 2021 | arXiv

Population-Scale Study of Human Needs During the COVID-19 Pandemic: Analysis and Implications

Most work to date on mitigating the COVID-19 pandemic is focused urgently on biomedicine and epidemiology. Yet, pandemic-related policy decisions cannot be made on health information alone. Decisions need to consider the broader impacts on people and their needs. Quantifying human needs across the population is challenging as it requires high geo-temporal granularity, high coverage across the population, and appropriate adjustment for seasonal and other external effects. Here, we propose a computational methodology, building on Maslow's hierarchy of needs, that can capture a holistic view of relative changes in needs following the pandemic through a difference-in-differences approach that corrects for seasonality and volume variations. We apply this approach to characterize changes in human needs across physiological, socioeconomic, and psychological realms in the US, based on more than 35 billion search interactions spanning over 36,000 ZIP codes over a period of 14 months. The analyses reveal that the expression of basic human needs has increased exponentially while higher-level aspirations declined during the pandemic in comparison to the pre-pandemic period. In exploring the timing and variations in statewide policies, we find that the durations of shelter-in-place mandates have influenced social and emotional needs significantly. We demonstrate that potential barriers to addressing critical needs, such as support for unemployment and domestic violence, can be identified through web search interactions. Our approach and results suggest that population-scale monitoring of shifts in human needs can inform policies and recovery efforts for current and anticipated needs.

preprint | 2020 | arXiv

AMP: Authentication of Media via Provenance

Advances in graphics and machine learning have led to the general availability of easy-to-use tools for modifying and synthesizing media. The proliferation of these tools threatens to cast doubt on the veracity of all media. One approach to thwarting the flow of fake media is to detect modified or synthesized media through machine learning methods. While detection may help in the short term, we believe that it is destined to fail as the quality of fake media generation continues to improve. Soon, neither humans nor algorithms will be able to reliably distinguish fake versus real content. Thus, pipelines for assuring the source and integrity of media will be required, and increasingly relied upon. We propose AMP, a system that ensures the authentication of media via certifying provenance. AMP creates one or more publisher-signed manifests for a media instance uploaded by a content provider. These manifests are stored in a database allowing fast lookup from applications such as browsers. For reference, the manifests are also registered and signed by a permissioned ledger, implemented using the Confidential Consortium Framework (CCF). CCF employs both software and hardware techniques to ensure the integrity and transparency of all registered manifests. AMP, through its use of CCF, enables a consortium of media providers to govern the service while making all its operations auditable. The authenticity of the media can be communicated to the user via visual elements in the browser, indicating that an AMP manifest has been successfully located and verified.
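The idea of a signed manifest that binds a publisher to a media file can be illustrated with the Python standard library. This sketch is not the AMP manifest format: the field names are invented for the example, and an HMAC with a shared key stands in for the publisher's public-key signature purely to keep the code self-contained.

```python
import hashlib
import hmac
import json


def make_manifest(media_bytes, publisher, key):
    """Build a toy manifest: media hash plus a keyed tag over the manifest body."""
    body = {"publisher": publisher, "sha256": hashlib.sha256(media_bytes).hexdigest()}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    # Assumption: HMAC stands in for a real digital signature scheme.
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"body": body, "signature": tag}


def verify_manifest(media_bytes, manifest, key):
    """Check the tag over the body, then check the media against the recorded hash."""
    payload = json.dumps(manifest["body"], sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False
    return hashlib.sha256(media_bytes).hexdigest() == manifest["body"]["sha256"]
```

A real deployment would use asymmetric signatures so that anyone can verify a manifest without holding the publisher's secret, and would add the ledger registration step described in the abstract.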

preprint | 2020 | arXiv

An Empirical Analysis of Backward Compatibility in Machine Learning Systems

In many applications of machine learning (ML), updates are performed with the goal of enhancing model performance. However, current practices for updating models rely solely on isolated, aggregate performance analyses, overlooking important dependencies, expectations, and needs in real-world deployments. We consider how updates, intended to improve ML models, can introduce new errors that can significantly affect downstream systems and users. For example, updates in models used in cloud-based classification services, such as image recognition, can cause unexpected erroneous behavior in systems that make calls to the services. Prior work has shown the importance of "backward compatibility" for maintaining human trust. We study challenges with backward compatibility across different ML architectures and datasets, focusing on common settings including data shifts with structured noise and ML employed in inferential pipelines. Our results show that (i) compatibility issues arise even without data shift due to optimization stochasticity, (ii) training on large-scale noisy datasets often results in significant decreases in backward compatibility even when model accuracy increases, and (iii) distributions of incompatible points align with noise bias, motivating the need for compatibility aware de-noising and robustness methods.

preprint | 2020 | arXiv

From Data to Knowledge to Action: A Global Enabler for the 21st Century

A confluence of advances in the computer and mathematical sciences has unleashed unprecedented capabilities for enabling true evidence-based decision making. These capabilities are making possible the large-scale capture of data and the transformation of that data into insights and recommendations in support of decisions about challenging problems in science, society, and government. Key advances include jumps in the availability of rich streams of data, precipitous drops in the cost of storing and retrieving massive amounts of data, exponential increases in computing power and memory, and jumps in the prowess of methods for performing machine learning and reasoning. These advances have come together to create an inflection point in our ability to harness large amounts of data for generating insights and guiding decision making. The shift of commerce, science, education, art, and entertainment to the web makes available unprecedented quantities of structured and unstructured databases about human activities - much of it available to anyone who wishes to mine it for insights. In the sciences, new evidential paradigms and sensing technologies are making available great quantities of data, via use of fundamentally new kinds of low-cost sensors (e.g., genomic microarrays) or through viewers that provide unprecedented scope and resolution. The data pose a huge opportunity for data-centric analyses. To date, we have only scratched the surface of the potential for learning from these large-scale data sets. Opportunities abound for tapping our new capabilities more broadly to provide insights to decision makers and to enhance the quality of their actions and policies.

preprint | 2020 | arXiv

Learning to Complement Humans

A rising vision for AI in the open world centers on the development of systems that can complement humans for perceptual, diagnostic, and reasoning tasks. To date, systems aimed at complementing the skills of people have employed models trained to be as accurate as possible in isolation. We demonstrate how an end-to-end learning strategy can be harnessed to optimize the combined performance of human-machine teams by considering the distinct abilities of people and machines. The goal is to focus machine learning on problem instances that are difficult for humans, while recognizing instances that are difficult for the machine and seeking human input on them. We demonstrate in two real-world domains (scientific discovery and medical diagnosis) that human-machine teams built via these methods outperform the individual performance of machines and people. We then analyze conditions under which this complementarity is strongest, and which training methods amplify it. Taken together, our work provides the first systematic investigation of how machine learning systems can be trained to complement human reasoning.

preprint | 2020 | arXiv

PACT: Privacy Sensitive Protocols and Mechanisms for Mobile Contact Tracing

The global health threat from COVID-19 has been controlled in a number of instances by large-scale testing and contact tracing efforts. We created this document to suggest three functionalities on how we might best harness computing technologies to support the goals of public health organizations in minimizing morbidity and mortality associated with the spread of COVID-19, while protecting the civil liberties of individuals. In particular, this work advocates for a third-party free approach to assisted mobile contact tracing, because such an approach mitigates the security and privacy risks of requiring a trusted third party. We also explicitly consider the inferential risks involved in any contact tracing system, where any alert to a user could itself give rise to de-anonymizing information. More generally, we hope to participate in bringing together colleagues in industry, academia, and civil society to discuss and converge on ideas around a critical issue rising with attempts to mitigate the COVID-19 pandemic.

preprint | 2020 | arXiv

SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search

The COVID-19 pandemic has sparked unprecedented mobilization of scientists, generating a deluge of papers that makes it hard for researchers to keep track and explore new directions. Search engines are designed for targeted queries, not for discovery of connections across a corpus. In this paper, we present SciSight, a system for exploratory search of COVID-19 research integrating two key capabilities: first, exploring associations between biomedical facets automatically extracted from papers (e.g., genes, drugs, diseases, patient outcomes); second, combining textual and network information to search and visualize groups of researchers and their ties. SciSight has so far served over 15K users with over 42K page views and 13% returns.

preprint | 2020 | arXiv

SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions

Existing VQA datasets contain questions with varying levels of complexity. While the majority of questions in these datasets require perception for recognizing existence, properties, and spatial relationships of entities, a significant portion of questions pose challenges that correspond to reasoning tasks - tasks that can only be answered through a synthesis of perception and knowledge about the world, logic and/or reasoning. Analyzing performance across this distinction allows us to notice when existing VQA models have consistency issues; they answer the reasoning questions correctly but fail on associated low-level perception questions. For example, in Figure 1, models answer the complex reasoning question "Is the banana ripe enough to eat?" correctly, but fail on the associated perception question "Are the bananas mostly green or yellow?" indicating that the model likely answered the reasoning question correctly but for the wrong reason. We quantify the extent to which this phenomenon occurs by creating a new Reasoning split of the VQA dataset and collecting VQA-introspect, a new dataset which consists of 238K new perception questions which serve as sub questions corresponding to the set of perceptual tasks needed to effectively answer the complex reasoning questions in the Reasoning split. Our evaluation shows that state-of-the-art VQA models have comparable performance in answering perception and reasoning questions, but suffer from consistency problems. To address this shortcoming, we propose an approach called Sub-Question Importance-aware Network Tuning (SQuINT), which encourages the model to attend to the same parts of the image when answering the reasoning question and the perception sub question. We show that SQuINT improves model consistency by ~5%, also marginally improving performance on the Reasoning questions in VQA, while also displaying better attention maps.

preprint | 2016 | arXiv

Identifying Dogmatism in Social Media: Signals and Models

We explore linguistic and behavioral features of dogmatism in social media and construct statistical models that can identify dogmatic comments. Our model is based on a corpus of Reddit posts, collected across a diverse set of conversational topics and annotated via paid crowdsourcing. We operationalize key aspects of dogmatism described by existing psychology theories (such as over-confidence), finding they have predictive power. We also find evidence for new signals of dogmatism, such as the tendency of dogmatic posts to refrain from signaling cognitive processes. When we use our predictive model to analyze millions of other Reddit posts, we find evidence that suggests dogmatism is a deeper personality trait, present for dogmatic users across many different domains, and that users who engage on dogmatic comments tend to show increases in dogmatic posts themselves.

preprint | 2016 | arXiv

Identifying Unknown Unknowns in the Open World: Representations and Policies for Guided Exploration

Predictive models deployed in the real world may assign incorrect labels to instances with high confidence. Such errors or unknown unknowns are rooted in model incompleteness, and typically arise because of the mismatch between training data and the cases encountered at test time. As the models are blind to such errors, input from an oracle is needed to identify these failures. In this paper, we formulate and address the problem of informed discovery of unknown unknowns of any given predictive model where unknown unknowns occur due to systematic biases in the training data. We propose a model-agnostic methodology which uses feedback from an oracle to both identify unknown unknowns and to intelligently guide the discovery. We employ a two-phase approach which first organizes the data into multiple partitions based on the feature similarity of instances and the confidence scores assigned by the predictive model, and then utilizes an explore-exploit strategy for discovering unknown unknowns across these partitions. We demonstrate the efficacy of our framework by varying the underlying causes of unknown unknowns across various applications. To the best of our knowledge, this paper presents the first algorithmic approach to the problem of discovering unknown unknowns of predictive models.
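The second phase described above, an explore-exploit search across partitions with oracle feedback, can be sketched with a standard UCB1 bandit. UCB1 is swapped in here as a familiar stand-in and is not claimed to be the paper's policy; the partitioning step is assumed to have been done already, and all names are hypothetical.

```python
import math


def discover_unknown_unknowns(partitions, oracle, budget):
    """Spend `budget` oracle queries across partitions using UCB1.

    partitions: dict mapping partition name -> list of instances (assumed given).
    oracle: callable returning True when an instance is an unknown unknown.
    """
    remaining = {name: list(items) for name, items in partitions.items()}
    pulls = {name: 0 for name in remaining}
    hits = {name: 0 for name in remaining}
    found = []
    for _ in range(budget):
        candidates = [n for n in sorted(remaining) if remaining[n]]
        if not candidates:
            break  # every partition is exhausted
        untried = [n for n in candidates if pulls[n] == 0]
        if untried:
            arm = untried[0]  # query each partition once before using UCB scores
        else:
            total = sum(pulls.values())
            arm = max(candidates, key=lambda n: hits[n] / pulls[n]
                      + math.sqrt(2 * math.log(total) / pulls[n]))
        item = remaining[arm].pop(0)
        pulls[arm] += 1
        if oracle(item):
            hits[arm] += 1
            found.append(item)
    return found, pulls
```

With two partitions where only one contains unknown unknowns, most of the query budget drifts toward that partition while the other is still sampled occasionally.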

preprint | 2016 | arXiv

Influence of Pokémon Go on Physical Activity: Study and Implications

Physical activity helps people maintain a healthy weight and reduces the risk for several chronic diseases. Although this knowledge is widely recognized, adults and children in many countries around the world do not get recommended amounts of physical activity. While many interventions are found to be ineffective at increasing physical activity or reaching inactive populations, there have been anecdotal reports of increased physical activity due to novel mobile games that embed game play in the physical world. The most recent and salient example of such a game is Pokémon Go, which has reportedly reached tens of millions of users in the US and worldwide. We study the effect of Pokémon Go on physical activity through a combination of signals from large-scale corpora of wearable sensor data and search engine logs for 32 thousand users over a period of three months. Pokémon Go players are identified through search engine queries and activity is measured through accelerometry. We find that Pokémon Go leads to significant increases in physical activity over a period of 30 days, with particularly engaged users (i.e., those making multiple search queries for details about game usage) increasing their activity by 1473 steps a day on average, a more than 25% increase compared to their prior activity level ($p<10^{-15}$). In the short time span of the study, we estimate that Pokémon Go has added a total of 144 billion steps to US physical activity. Furthermore, Pokémon Go has been able to increase physical activity across men and women of all ages, weight status, and prior activity levels showing this form of game leads to increases in physical activity with significant implications for public health. We find that Pokémon Go is able to reach low activity populations while all four leading mobile health apps studied in this work largely draw from an already very active population.

preprint | 2016 | arXiv

Long-Term Trends in the Public Perception of Artificial Intelligence

Analyses of text corpora over time can reveal trends in beliefs, interest, and sentiment about a topic. We focus on views expressed about artificial intelligence (AI) in the New York Times over a 30-year period. General interest, awareness, and discussion about AI have waxed and waned since the field was founded in 1956. We present a set of measures that captures levels of engagement, measures of pessimism and optimism, the prevalence of specific hopes and concerns, and topics that are linked to discussions about AI over decades. We find that discussion of AI has increased sharply since 2009, and that these discussions have been consistently more optimistic than pessimistic. However, when we examine specific concerns, we find that worries of loss of control of AI, ethical concerns for AI, and the negative impact of AI on work have grown in recent years. We also find that hopes for AI in healthcare and education have increased over time.

preprint | 2016 | arXiv

On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems

We study the problem of troubleshooting machine learning systems that rely on analytical pipelines of distinct components. Understanding and fixing errors that arise in such integrative systems is difficult as failures can occur at multiple points in the execution workflow. Moreover, errors can propagate, become amplified or be suppressed, making blame assignment difficult. We propose a human-in-the-loop methodology which leverages human intellect for troubleshooting system failures. The approach simulates potential component fixes through human computation tasks and measures the expected improvements in the holistic behavior of the system. The method provides guidance to designers about how they can best improve the system. We demonstrate the effectiveness of the approach on an automated image captioning system that has been pressed into real-world use.

preprint | 2016 | arXiv

Toward a Science of Autonomy for Physical Systems: Healthcare

In Star Wars Episode V, we see Luke Skywalker being repaired by a surgical robot. In the context of the movie, this doesn't seem surprising or disturbing. After all, it is a long, long time ago, in a galaxy far, far away. It would never happen here. Or could it? Would we accept a robot as our doctor, our surgeon, or our in-home care specialist? Imagine walking into an operating room and no one was there. You are instructed to lie down on the operating table, and the OR system takes over. Would you feel comfortable with this possible future world?

preprint | 2015 | arXiv

Inferring and Learning from Neuronal Correspondences

We introduce and study methods for inferring and learning from correspondences among neurons. The approach enables alignment of data from distinct multiunit studies of nervous systems. We show that the methods for inferring correspondences combine data effectively from cross-animal studies to make joint inferences about behavioral decision making that are not possible with the data from a single animal. We focus on data collection, machine learning, and prediction in the representative and long-studied invertebrate nervous system of the European medicinal leech. Acknowledging the computational intractability of the general problem of identifying correspondences among neurons, we introduce efficient computational procedures for matching neurons across animals. The methods include techniques that adjust for missing cells or additional cells in the different data sets that may reflect biological or experimental variation. The methods highlight the value of harnessing inference and learning in new kinds of computational microscopes for multiunit neurobiological studies.

preprint 2015 arXiv

Information Gathering in Networks via Active Exploration

How should we gather information in a network, where each node's visibility is limited to its local neighborhood? This problem arises in numerous real-world applications, such as surveying and task routing in social networks, team formation in collaborative networks and experimental design with dependency constraints. Often the informativeness of a set of nodes can be quantified via a submodular utility function. Existing approaches for submodular optimization, however, require that the set of all nodes that can be selected is known ahead of time, which is often unrealistic. In contrast, we propose a novel model where we start our exploration from an initial node, and new nodes become visible and available for selection only once one of their neighbors has been chosen. We then present a general algorithm NetExp for this problem, and provide theoretical bounds on its performance dependent on structural properties of the underlying network. We evaluate our methodology on various simulated problem instances as well as on data collected from a social question answering system deployed within a large enterprise.
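
The exploration model in this abstract can be illustrated with a small sketch. This is not the NetExp algorithm from the paper; it is a plain greedy heuristic under stated assumptions: a toy adjacency map `graph`, a set-coverage utility (a standard example of a submodular function), and a hypothetical helper named `greedy_explore`. The point it shows is only the visibility constraint: a node can be selected only after one of its neighbors has been chosen.

```python
from typing import Dict, List, Set


def coverage(selected: List[str], covers: Dict[str, Set[int]]) -> int:
    """Submodular utility: number of distinct items covered by the selected nodes."""
    covered: Set[int] = set()
    for node in selected:
        covered |= covers[node]
    return len(covered)


def greedy_explore(
    graph: Dict[str, List[str]],
    covers: Dict[str, Set[int]],
    start: str,
    budget: int,
) -> List[str]:
    """Greedy selection where only neighbors of already-chosen nodes are visible.

    Illustrative heuristic only; not the algorithm analyzed in the paper.
    """
    selected = [start]
    visible = set(graph[start]) - {start}
    while len(selected) < budget and visible:
        base = coverage(selected, covers)
        # Pick the visible node with the largest marginal gain (ties broken by name).
        best = max(
            sorted(visible),
            key=lambda n: coverage(selected + [n], covers) - base,
        )
        selected.append(best)
        # Choosing a node reveals its neighbors.
        visible |= set(graph[best])
        visible -= set(selected)
    return selected


# Hypothetical toy instance: node "d" is the most informative,
# but it stays hidden until its only neighbor "b" has been selected.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
covers = {"a": {1}, "b": {1, 2}, "c": {3, 4}, "d": {5, 6, 7}}
order = greedy_explore(graph, covers, start="a", budget=3)
```

In this toy instance the greedy rule picks "c" before "b" because "c" has the larger immediate gain, so "d" never becomes visible within the budget. That is the kind of structural effect the abstract's performance bounds depend on.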

preprint 2015 arXiv

Learning to Hire Teams

Crowdsourcing and human computation have been employed in increasingly sophisticated projects that require the solution of a heterogeneous set of tasks. We explore the challenge of building or hiring an effective team, for performing tasks required for such projects on an ongoing basis, from an available pool of applicants or workers who have bid for the tasks. The recruiter needs to learn workers' skills and expertise by performing online tests and interviews, and would like to minimize the amount of budget or time spent in this process before committing to hiring the team. How can one optimally spend budget to learn the expertise of workers as part of recruiting a team? How can one exploit the similarities among tasks as well as underlying social ties or commonalities among the workers for faster learning? We tackle these decision-theoretic challenges by casting them as an instance of online learning for best action selection. We present algorithms with PAC bounds on the required budget to hire a near-optimal team with high confidence. Furthermore, we consider an embedding of the tasks and workers in an underlying graph that may arise from task similarities or social ties, and that can provide additional side-observations for faster learning. We then quantify the improvement in the bounds that we can achieve depending on the characteristic properties of this graph structure. We evaluate our methodology on simulated problem instances as well as on real-world crowdsourcing data collected from the oDesk platform. Our methodology and results present an interesting direction of research to tackle the challenges faced by a recruiter for contract-based crowdsourcing.
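
To give a feel for what a PAC-style budget bound looks like, here is a minimal sketch based on the standard Hoeffding inequality. It is not the bound derived in the paper. It assumes each test of a worker yields an independent score in [0, 1], and the function name `hoeffding_tests_needed` is hypothetical. It returns how many tests suffice to estimate one worker's mean skill to within `epsilon` with probability at least `1 - delta`.

```python
import math


def hoeffding_tests_needed(epsilon: float, delta: float) -> int:
    """Number of i.i.d. tests with scores in [0, 1] so that the sample mean is
    within epsilon of the true mean with probability at least 1 - delta.

    Follows from Hoeffding's inequality: P(|mean - mu| >= eps) <= 2 * exp(-2 * n * eps**2).
    """
    if epsilon <= 0 or not 0 < delta < 1:
        raise ValueError("need epsilon > 0 and 0 < delta < 1")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


# Example: accuracy 0.1 at 95% confidence for a single worker.
tests_per_worker = hoeffding_tests_needed(epsilon=0.1, delta=0.05)
```

The required number of tests grows with 1/epsilon squared but only logarithmically in 1/delta, which is why exploiting side-observations across similar tasks or workers, as the abstract describes, can matter for the total budget.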

preprint 2015 arXiv

Metareasoning for Planning Under Uncertainty

The conventional model for online planning under uncertainty assumes that an agent can stop and plan without incurring costs for the time spent planning. However, planning time is not free in most real-world settings. For example, an autonomous drone is subject to nature's forces, like gravity, even while it thinks, and must either pay a price for counteracting these forces to stay in place, or grapple with the state change caused by acquiescing to them. Policy optimization in these settings requires metareasoning, a process that trades off the cost of planning and the potential policy improvement that can be achieved. We formalize and analyze the metareasoning problem for Markov Decision Processes (MDPs). Our work subsumes previously studied special cases of metareasoning and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking. For reasons we discuss, optimal general metareasoning turns out to be impractical, motivating approximations. We present approximate metareasoning procedures which rely on special properties of the BRTDP planning algorithm and explore the effectiveness of our methods on a variety of problems.

preprint 2014 arXiv

A Utility-Theoretic Approach to Privacy in Online Services

Online offerings such as web search, news portals, and e-commerce applications face the challenge of providing high-quality service to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by introducing methods to personalize services based on special knowledge about users and their context. For example, a user's demographics, location, and past search and browsing may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy held by users, providers, and government agencies acting on behalf of citizens may limit access by services to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how we can efficiently find a provably near-optimal solution to the utility-privacy tradeoff. We evaluate our methodology on data drawn from a log of the search activity of volunteer participants. We separately assess users' preferences about privacy and utility via a large-scale survey, aimed at eliciting people's willingness to trade the sharing of personal data in return for gains in search efficiency. We show that a significant level of personalization can be achieved using a relatively small amount of information about users.

preprint 2014 arXiv

Events and Controversies: Influences of a Shocking News Event on Information Seeking

It has been suggested that online search and retrieval contributes to the intellectual isolation of users within their preexisting ideologies, where people's prior views are strengthened and alternative viewpoints are infrequently encountered. This so-called "filter bubble" phenomenon has been called out as especially detrimental when it comes to dialog among people on controversial, emotionally charged topics, such as the labeling of genetically modified food, the right to bear arms, the death penalty, and online privacy. We seek to identify and study information-seeking behavior and access to alternative versus reinforcing viewpoints following shocking, emotional, and large-scale news events. As a case study, we analyze search and browsing on gun control/rights, a strongly polarizing topic for both citizens and leaders of the United States. We study the period of time preceding and following a mass shooting to understand how its occurrence, follow-on discussions, and debate may have been linked to changes in the patterns of searching and browsing. We employ information-theoretic measures to quantify the diversity of Web domains of interest to users and understand the browsing patterns of users. We use these measures to characterize the influence of news events on these web search and browsing patterns.
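
One common information-theoretic measure of diversity is Shannon entropy over the distribution of visited domains. The abstract does not say which measures the authors used, so the sketch below is an assumption for illustration only; the function name `domain_entropy` and the example domains are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable


def domain_entropy(domains: Iterable[str]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of visited domains.

    0.0 means every visit went to one domain; higher values mean more diverse browsing.
    """
    counts = Counter(domains)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum of p * log2(1 / p) over observed domains.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical browsing histories for two users.
narrow = ["a.example"] * 4
broad = ["a.example", "b.example", "c.example", "d.example"]
narrow_bits = domain_entropy(narrow)
broad_bits = domain_entropy(broad)
```

Comparing such a score for the same user before and after a news event is one simple way to ask whether browsing became more or less concentrated on a few sources.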

preprint 2014 arXiv

Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (1996)

This is the Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence, which was held in Portland, OR, August 1-4, 1996.

preprint 2014 arXiv

Stochastic Privacy

Online services such as web search and e-commerce applications typically rely on the collection of data about users, including details of their activities on the web. Such personal data is used to enhance the quality of service via personalization of content and to maximize revenues via better targeting of advertisements and deeper engagement of users on sites. To date, service providers have largely followed the approach of either requiring or requesting users' consent for opting in to share their data. Users may be willing to share private information in return for better quality of service or for incentives, or in return for assurances about the nature and extent of the logging of data. We introduce stochastic privacy, a new approach to privacy centering on a simple concept: A guarantee is provided to users about the upper bound on the probability that their personal data will be used. Such a probability, which we refer to as privacy risk, can be assessed by users as a preference or communicated as a policy by a service provider. Service providers can work to personalize and to optimize revenues in accordance with preferences about privacy risk. We present procedures, proofs, and an overall system for maximizing the quality of services, while respecting bounds on allowable or communicated privacy risk. We demonstrate the methodology with a case study and evaluation of the procedures applied to web search personalization. We show how we can achieve near-optimal utility of accessing information with provable guarantees on the probability of sharing data.
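
The core guarantee, an upper bound on the probability that a user's data is used, can be sketched in a few lines. This is not the paper's procedure, which also optimizes service quality; it assumes the simplest possible policy, in which each user is sampled independently with probability equal to that user's stated bound. The function name `sample_users` and the user identifiers are hypothetical.

```python
import random
from typing import Dict, List, Optional


def sample_users(
    risk_bounds: Dict[str, float],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Select users whose data will be used, one independent draw per user.

    Each user is selected with probability equal to their privacy-risk bound,
    so the chance that a user's data is used never exceeds that bound.
    Illustrative policy only; it ignores the utility of the selected set.
    """
    rng = rng or random.Random()
    chosen = []
    for user, bound in risk_bounds.items():
        if not 0.0 <= bound <= 1.0:
            raise ValueError(f"privacy risk for {user!r} must lie in [0, 1]")
        # rng.random() is uniform on [0, 1), so this succeeds with probability `bound`.
        if rng.random() < bound:
            chosen.append(user)
    return chosen


# Hypothetical bounds: "u_never" opts out entirely, "u_low" accepts a 10% risk.
bounds = {"u_never": 0.0, "u_low": 0.1, "u_always": 1.0}
rng = random.Random(0)
trials = [sample_users(bounds, rng) for _ in range(10_000)]
low_rate = sum("u_low" in t for t in trials) / len(trials)
```

A service that wanted better utility could sample informative users with higher probability and others with lower probability, as long as no user's selection probability exceeded their bound.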

preprint 2013 arXiv

From Cookies to Cooks: Insights on Dietary Patterns via Analysis of Web Usage Logs

Nutrition is a key factor in people's overall health. Hence, understanding the nature and dynamics of population-wide dietary preferences over time and space can be valuable in public health. To date, studies have leveraged small samples of participants via food intake logs or treatment data. We propose a complementary source of population data on nutrition obtained via Web logs. Our main contribution is a spatiotemporal analysis of population-wide dietary preferences through the lens of logs gathered by a widely distributed Web-browser add-on, using the access volume of recipes that users seek via search as a proxy for actual food consumption. We discover that variation in dietary preferences as expressed via recipe access has two main periodic components, one yearly and the other weekly, and that there exist characteristic regional differences in terms of diet within the United States. In a second study, we identify users who show evidence of having made an acute decision to lose weight. We characterize the shifts in interests that they express in their search queries and focus on changes in their recipe queries in particular. Last, we correlate nutritional time series obtained from recipe queries with time-aligned data on hospital admissions, aimed at understanding how behavioral data captured in Web logs might be harnessed to identify potential relationships between diet and acute health problems. In this preliminary study, we focus on patterns of sodium identified in recipes over time and patterns of admission for congestive heart failure, a chronic illness that can be exacerbated by increases in sodium intake.

30 published item(s)

Can Revealed Preferences Clarify LLM Alignment and Steering?

A Search Engine for Discovery of Scientific Challenges and Directions

Bursting Scientific Filter Bubbles: Boosting Innovation via Novel Author Discovery

Imagined versus Remembered Stories: Quantifying Differences in Narrative Flow

Who Goes First? Influences of Human-AI Workflow on Decision Making in Clinical Imaging

Formation of Social Ties Influences Food Choice: A Campus-Wide Longitudinal Study

Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork

Population-Scale Study of Human Needs During the COVID-19 Pandemic: Analysis and Implications

AMP: Authentication of Media via Provenance

An Empirical Analysis of Backward Compatibility in Machine Learning Systems

From Data to Knowledge to Action: A Global Enabler for the 21st Century

Learning to Complement Humans

PACT: Privacy Sensitive Protocols and Mechanisms for Mobile Contact Tracing

SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search

SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions

Identifying Dogmatism in Social Media: Signals and Models

Identifying Unknown Unknowns in the Open World: Representations and Policies for Guided Exploration

Influence of Pokémon Go on Physical Activity: Study and Implications

Long-Term Trends in the Public Perception of Artificial Intelligence

On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems

Toward a Science of Autonomy for Physical Systems: Healthcare

Inferring and Learning from Neuronal Correspondences

Information Gathering in Networks via Active Exploration

Learning to Hire Teams

Metareasoning for Planning Under Uncertainty

A Utility-Theoretic Approach to Privacy in Online Services

Events and Controversies: Influences of a Shocking News Event on Information Seeking

Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (1996)

Stochastic Privacy

From Cookies to Cooks: Insights on Dietary Patterns via Analysis of Web Usage Logs