Source author record

Jeffrey Heer

Jeffrey Heer appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Human-Computer Interaction, Artificial Intelligence, Computation, Digital Libraries, Machine Learning, Programming Languages, stat.OT

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators

preprint · 2023 · arXiv

Cinematic Techniques in Narrative Visualization

The many genres of narrative visualization (e.g. data comics, data videos) each offer a unique set of affordances and constraints. To better understand a genre that we call cinematic visualizations (3D visualizations that make highly deliberate use of a camera to convey a narrative), we gathered 50 examples and analyzed their traditional cinematic aspects to identify the benefits and limitations of the form. While the cinematic visualization approach can violate traditional rules of visualization, we find that through careful control of the camera, cinematic visualizations enable immersion in data-driven, anthropocentric environments, and can naturally incorporate in-situ narrators, concrete scales, and visual analogies. Our analysis guides our design of a series of cinematic visualizations, created for NASA's Earth Science Communications team. We present one as a case study to convey design guidelines covering cinematography, lighting, set design, and sound, and discuss challenges in creating cinematic visualizations.

preprint · 2022 · arXiv

Fidyll: A Compiler for Cross-Format Data Stories & Explorable Explanations

Narrative visualization is a powerful communicative tool that can take on various formats such as interactive articles, slideshows, and data videos. These formats each have their strengths and weaknesses, but existing authoring tools only support one output target. We conducted a series of formative interviews with seven domain experts to understand needs and practices around cross-format data stories, and developed Fidyll, a cross-format compiler for authoring interactive data stories and explorable explanations. Our open-source tool can be used to rapidly create formats including static articles, low-motion articles, interactive articles, slideshows, and videos. We evaluate our system through a series of real-world usage scenarios, showing how it benefits authors in the domains of data journalism, scientific publishing, and nonprofit advocacy. We show how Fidyll provides expressive leverage by reducing the amount of non-narrative markup that authors need to write by 80-90% compared to Idyll, an existing markup language for authoring interactive articles.

preprint · 2022 · arXiv

Tisane: Authoring Statistical Models via Formal Reasoning from Conceptual and Data Relationships

Proper statistical modeling incorporates domain theory about how concepts relate and details of how data were measured. However, data analysts currently lack tool support for recording and reasoning about domain assumptions, data collection, and modeling choices in an integrated manner, leading to mistakes that can compromise scientific validity. For instance, generalized linear mixed-effects models (GLMMs) help answer complex research questions, but omitting random effects impairs the generalizability of results. To address this need, we present Tisane, a mixed-initiative system for authoring generalized linear models with and without mixed-effects. Tisane introduces a study design specification language for expressing and asking questions about relationships between variables. Tisane contributes an interactive compilation process that represents relationships in a graph, infers candidate statistical models, and asks follow-up questions to disambiguate user queries to construct a valid model. In case studies with three researchers, we find that Tisane helps them focus on their goals and assumptions while avoiding past mistakes.

preprint · 2020 · arXiv

CORAL: COde RepresentAtion Learning with Weakly-Supervised Transformers for Analyzing Data Analysis

Large scale analysis of source code, and in particular scientific source code, holds the promise of better understanding the data science process, identifying analytical best practices, and providing insights to the builders of scientific toolkits. However, large corpora have remained unanalyzed in depth, as descriptive labels are absent and require expert domain knowledge to generate. We propose a novel weakly supervised transformer-based architecture for computing joint representations of code from both abstract syntax trees and surrounding natural language comments. We then evaluate the model on a new classification task for labeling computational notebook cells as stages in the data analysis process from data import to wrangling, exploration, modeling, and evaluation. We show that our model, leveraging only easily-available weak supervision, achieves a 38% increase in accuracy over expert-supplied heuristics and outperforms a suite of baselines. Our model enables us to examine a set of 118,000 Jupyter Notebooks to uncover common data analysis patterns. Focusing on notebooks with relationships to academic articles, we conduct the largest ever study of scientific code and find that notebook composition correlates with the citation count of corresponding papers.

preprint · 2020 · arXiv

Gemini: A Grammar and Recommender System for Animated Transitions in Statistical Graphics

Animated transitions help viewers follow changes between related visualizations. Specifying effective animations demands significant effort: authors must select the elements and properties to animate, provide transition parameters, and coordinate the timing of stages. To facilitate this process, we present Gemini, a declarative grammar and recommendation system for animated transitions between single-view statistical graphics. Gemini specifications define transition "steps" in terms of high-level visual components (marks, axes, legends) and composition rules to synchronize and concatenate steps. With this grammar, Gemini can recommend animation designs to augment and accelerate designers' work. Gemini enumerates staged animation designs for given start and end states, and ranks those designs using a cost function informed by prior perceptual studies. To evaluate Gemini, we conduct both a formative study on Mechanical Turk to assess and tune our ranking function, and a summative study in which 8 experienced visualization developers implement animations in D3 that we then compare to Gemini's suggestions. We find that most designs (9/11) are exactly replicable in Gemini, with many (8/11) achievable via edits to suggestions, and that Gemini suggestions avoid multiple participant errors.

preprint · 2020 · arXiv

Paths Explored, Paths Omitted, Paths Obscured: Decision Points & Selective Reporting in End-to-End Data Analysis

Drawing reliable inferences from data involves many, sometimes arbitrary, decisions across phases of data collection, wrangling, and modeling. As different choices can lead to diverging conclusions, understanding how researchers make analytic decisions is important for supporting robust and replicable analysis. In this study, we pore over nine published research studies and conduct semi-structured interviews with their authors. We observe that researchers often base their decisions on methodological or theoretical concerns, but subject to constraints arising from the data, expertise, or perceived interpretability. We confirm that researchers may experiment with choices in search of desirable results, but also identify other reasons why researchers explore alternatives yet omit findings. In concert with our interviews, we also contribute visualizations for communicating decision processes throughout an analysis. Based on our results, we identify design opportunities for strengthening end-to-end analysis, for instance via tracking and meta-analysis of multiple decision paths.
