Source author record

Kathleen McKeown

Kathleen McKeown appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computation and Language Artificial Intelligence cs.CY Information Retrieval

Catalog footprint

What is connected

10works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

LLMs as Science Journalists: Supporting Early-stage Researchers in Communicating Their Science to the Public

The scientific community needs tools that help early-stage researchers effectively communicate their findings and innovations to the public. Although existing general-purpose Large Language Models (LLMs) can assist in this endeavor, they are not optimally aligned for it. To address this, we propose a framework for training LLMs to emulate the role of a science journalist that can be used by early-stage researchers to learn how to properly communicate their papers to the general public. We evaluate the usefulness of our trained LLM Journalists in leading conversations with both simulated and human researchers. %compared to the general-purpose ones. Our experiments indicate that LLMs trained using our framework ask more relevant questions that address the societal impact of research, prompting researchers to clarify and elaborate on their findings. In the user study, the majority of participants who interacted with our trained LLM Journalist appreciated it more than interacting with general-purpose LLMs.

preprint2022arXiv

Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization

Despite recent progress in abstractive summarization, systems still suffer from faithfulness errors. While prior work has proposed models that improve faithfulness, it is unclear whether the improvement comes from an increased level of extractiveness of the model outputs as one naive way to improve faithfulness is to make summarization models more extractive. In this work, we present a framework for evaluating the effective faithfulness of summarization systems, by generating a faithfulnessabstractiveness trade-off curve that serves as a control at different operating points on the abstractiveness spectrum. We then show that the Maximum Likelihood Estimation (MLE) baseline as well as a recently proposed method for improving faithfulness, are both worse than the control at the same level of abstractiveness. Finally, we learn a selector to identify the most faithful and abstractive summary for a given document, and show that this system can attain higher faithfulness scores in human evaluations while being more abstractive than the baseline system on two datasets. Moreover, we show that our system is able to achieve a better faithfulness-abstractiveness trade-off than the control at the same level of abstractiveness.

preprint2022arXiv

Read Top News First: A Document Reordering Approach for Multi-Document News Summarization

A common method for extractive multi-document news summarization is to re-formulate it as a single-document summarization problem by concatenating all documents as a single meta-document. However, this method neglects the relative importance of documents. We propose a simple approach to reorder the documents according to their relative importance before concatenating and summarizing them. The reordering makes the salient content easier to learn by the summarization model. Experiments show that our approach outperforms previous state-of-the-art methods with more complex architectures.

preprint2022arXiv

Seeded Hierarchical Clustering for Expert-Crafted Taxonomies

Practitioners from many disciplines (e.g., political science) use expert-crafted taxonomies to make sense of large, unlabeled corpora. In this work, we study Seeded Hierarchical Clustering (SHC): the task of automatically fitting unlabeled data to such taxonomies using only a small set of labeled examples. We propose HierSeed, a novel weakly supervised algorithm for this task that uses only a small set of labeled seed examples. It is both data and computationally efficient. HierSeed assigns documents to topics by weighing document density against topic hierarchical structure. It outperforms both unsupervised and supervised baselines for the SHC task on three real-world datasets.

preprint2021arXiv

A Unified Feature Representation for Lexical Connotations

Ideological attitudes and stance are often expressed through subtle meanings of words and phrases. Understanding these connotations is critical to recognizing the cultural and emotional perspectives of the speaker. In this paper, we use distant labeling to create a new lexical resource representing connotation aspects for nouns and adjectives. Our analysis shows that it aligns well with human judgments. Additionally, we present a method for creating lexical representations that captures connotations within the embedding space and show that using the embeddings provides a statistically significant improvement on the task of stance detection when data is limited.

preprint2021arXiv

Entity-level Factual Consistency of Abstractive Text Summarization

A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document. For example, state-of-the-art models trained on existing datasets exhibit entity hallucination, generating names of entities that are not present in the source document. We propose a set of new metrics to quantify the entity-level factual consistency of generated summaries and we show that the entity hallucination problem can be alleviated by simply filtering the training data. In addition, we propose a summary-worthy entity classification task to the training process as well as a joint entity and summary generation approach, which yield further improvements in entity level metrics.

preprint2021arXiv

Event-Driven News Stream Clustering using Entity-Aware Contextual Embeddings

We propose a method for online news stream clustering that is a variant of the non-parametric streaming K-means algorithm. Our model uses a combination of sparse and dense document representations, aggregates document-cluster similarity along these multiple representations and makes the clustering decision using a neural classifier. The weighted document-cluster similarity model is learned using a novel adaptation of the triplet loss into a linear classification objective. We show that the use of a suitable fine-tuning objective and external knowledge in pre-trained transformer models yields significant improvements in the effectiveness of contextual embeddings for clustering. Our model achieves a new state-of-the-art on a standard stream clustering dataset of English documents.

preprint2020arXiv

Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events

In this paper, we propose a neural architecture and a set of training methods for ordering events by predicting temporal relations. Our proposed models receive a pair of events within a span of text as input and they identify temporal relations (Before, After, Equal, Vague) between them. Given that a key challenge with this task is the scarcity of annotated data, our models rely on either pretrained representations (i.e. RoBERTa, BERT or ELMo), transfer and multi-task learning (by leveraging complementary datasets), and self-training techniques. Experiments on the MATRES dataset of English documents establish a new state-of-the-art on this task.

preprint2016arXiv

Real-Time Web Scale Event Summarization Using Sequential Decision Making

We present a system based on sequential decision making for the online summarization of massive document streams, such as those found on the web. Given an event of interest (e.g. "Boston marathon bombing"), our system is able to filter the stream for relevance and produce a series of short text updates describing the event as it unfolds over time. Unlike previous work, our approach is able to jointly model the relevance, comprehensiveness, novelty, and timeliness required by time-sensitive queries. We demonstrate a 28.3% improvement in summary F1 and a 43.8% improvement in time-sensitive F1 metrics.

preprint2016arXiv

Using Natural Language Processing and Qualitative Analysis to Intervene in Gang Violence: A Collaboration Between Social Work Researchers and Data Scientists

The U.S. has the highest rate of firearm-related deaths when compared to other industrialized countries. Violence particularly affects low-income, urban neighborhoods in cities like Chicago, which saw a 40% increase in firearm violence from 2014 to 2015 to more than 3,000 shooting victims. While recent studies have found that urban, gang-involved individuals curate a unique and complex communication style within and between social media platforms, organizations focused on reducing gang violence are struggling to keep up with the growing complexity of social media platforms and the sheer volume of data they present. In this paper, describe the Digital Urban Violence Analysis Approach (DUVVA), a collaborative qualitative analysis method used in a collaboration between data scientists and social work researchers to develop a suite of systems for decoding the high- stress language of urban, gang-involved youth. Our approach leverages principles of grounded theory when analyzing approximately 800 tweets posted by Chicago gang members and participation of youth from Chicago neighborhoods to create a language resource for natural language processing (NLP) methods. In uncovering the unique language and communication style, we developed automated tools with the potential to detect aggressive language on social media and aid individuals and groups in performing violence prevention and interruption.

Kathleen McKeown

What is connected

Connect this record

See the researcher in context

Building this map preview

10 published item(s)

LLMs as Science Journalists: Supporting Early-stage Researchers in Communicating Their Science to the Public

Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization

Read Top News First: A Document Reordering Approach for Multi-Document News Summarization

Seeded Hierarchical Clustering for Expert-Crafted Taxonomies

A Unified Feature Representation for Lexical Connotations

Entity-level Factual Consistency of Abstractive Text Summarization

Event-Driven News Stream Clustering using Entity-Aware Contextual Embeddings

Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events

Real-Time Web Scale Event Summarization Using Sequential Decision Making

Using Natural Language Processing and Qualitative Analysis to Intervene in Gang Violence: A Collaboration Between Social Work Researchers and Data Scientists