Source author record

Susan Dumais

Susan Dumais appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computation and Language, Information Retrieval, Social and Information Networks

Catalog footprint

What is connected

5 works

3 topics

4 close collaborators

Preprint, 2020, arXiv

Characterizing Reading Time on Enterprise Emails

Email is an integral part of people's work and life, enabling them to perform activities such as communicating, searching, managing tasks and storing information. Modern email clients take a step forward and help improve users' productivity by automatically creating reminders, tasks or responses. The act of reading is arguably the only activity that is in common in most -- if not all -- of the interactions that users have with their emails. In this paper, we characterize how users read their enterprise emails, and reveal the various contextual factors that impact reading time. Our approach starts with a reading time analysis based on the reading events from a major email platform, followed by a user study to provide explanations for some discoveries. We identify multiple temporal and user contextual factors that are correlated with reading time. For instance, email reading time is correlated with user devices: on desktop reading time increases through the morning and peaks at noon but on mobile it increases through the evening till midnight. The reading time is also negatively correlated with the screen size. We have established the connection between user status and reading time: users spend more time reading emails when they have fewer meetings and busy hours during the day. In addition, we find that users also reread emails across devices. Among the cross-device reading events, 76% of reread emails are first visited on mobile and then on desktop. Overall, our study is the first to characterize enterprise email reading time on a very large scale. The findings provide insights to develop better metrics and user models for understanding and improving email interactions.

Preprint, 2020, arXiv

Detecting Fake News with Weak Social Supervision

Limited labeled data is becoming the largest bottleneck for supervised learning systems. This is especially the case for many real-world tasks where large scale annotated examples are either too expensive to acquire or unavailable due to privacy or data access constraints. Weak supervision has been shown to be a good means to mitigate the scarcity of annotated data by leveraging weak labels or injecting constraints from heuristic rules and/or external knowledge sources. Social media has little labeled data but possesses unique characteristics that make it suitable for generating weak supervision, resulting in a new type of weak supervision, i.e., weak social supervision. In this article, we illustrate how various aspects of social media can be used to generate weak social supervision. Specifically, we use the recent research on fake news detection as the use case, where social engagements are abundant but annotated examples are scarce, to show that weak social supervision is effective when little labeled data is available. This article opens the door for learning with weak social supervision for other emerging tasks.

Preprint, 2020, arXiv

Learning with Weak Supervision for Email Intent Detection

Email remains one of the most frequently used means of online communication. People spend a significant amount of time every day on emails to exchange information, manage tasks and schedule events. Previous work has studied different ways for improving email productivity by prioritizing emails, suggesting automatic replies or identifying intents to recommend appropriate actions. The problem has been mostly posed as a supervised learning problem where models of different complexities were proposed to classify an email message into a predefined taxonomy of intents or classes. The need for labeled data has always been one of the largest bottlenecks in training supervised models. This is especially the case for many real-world tasks, such as email intent classification, where large scale annotated examples are either hard to acquire or unavailable due to privacy or data access constraints. Email users often take actions in response to intents expressed in an email (e.g., setting up a meeting in response to an email with a scheduling request). Such actions can be inferred from user interaction logs. In this paper, we propose to leverage user actions as a source of weak supervision, in addition to a limited set of annotated examples, to detect intents in emails. We develop an end-to-end robust deep neural network model for email intent identification that leverages both clean annotated data and noisy weak supervision along with a self-paced learning mechanism. Extensive experiments on three different intent detection tasks show that our approach can effectively leverage the weakly supervised data to improve intent detection in emails.

Preprint, 2012, arXiv

Web-Based Question Answering: A Decision-Making Perspective

We describe an investigation of the use of probabilistic models and cost-benefit analyses to guide resource-intensive procedures used by a Web-based question answering system. We first provide an overview of research on question-answering systems. Then, we present details on AskMSR, a prototype web-based question answering system. We discuss Bayesian analyses of the quality of answers generated by the system and show how we can endow the system with the ability to make decisions about the number of queries issued to a search engine, given the cost of queries and the expected value of query results in refining an ultimate answer. Finally, we review the results of a set of experiments.

Preprint, 2011, arXiv

Mark My Words! Linguistic Style Accommodation in Social Media

The psycholinguistic theory of communication accommodation accounts for the general observation that participants in conversations tend to converge to one another's communicative behavior: they coordinate in a variety of dimensions including choice of words, syntax, utterance length, pitch and gestures. In its almost forty years of existence, this theory has been empirically supported exclusively through small-scale or controlled laboratory studies. Here we address this phenomenon in the context of Twitter conversations. Undoubtedly, this setting is unlike any other in which accommodation was observed and, thus, challenging to the theory. Its novelty comes not only from its size, but also from the non real-time nature of conversations, from the 140 character length restriction, from the wide variety of social relation types, and from a design that was initially not geared towards conversation at all. Given such constraints, it is not clear a priori whether accommodation is robust enough to occur given the constraints of this new environment. To investigate this, we develop a probabilistic framework that can model accommodation and measure its effects. We apply it to a large Twitter conversational dataset specifically developed for this task. This is the first time the hypothesis of linguistic style accommodation has been examined (and verified) in a large scale, real world setting. Furthermore, when investigating concepts such as stylistic influence and symmetry of accommodation, we discover a complexity of the phenomenon which was never observed before. We also explore the potential relation between stylistic influence and network features commonly associated with social status.
