Source author record

Michael Correll

Michael Correll appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Human-Computer Interaction, Computation and Language, Machine Learning

Catalog footprint

What is connected

5 works

3 topics

4 close collaborators

Preprint, 2022, arXiv

OSCAR: A Semantic-based Data Binning Approach

Binning is applied to categorize data values or to see distributions of data. Existing binning algorithms often rely on statistical properties of data. However, there are semantic considerations for selecting appropriate binning schemes. Surveys, for instance, gather respondent data for demographic-related questions such as age, salary, number of employees, etc., that are bucketed into defined semantic categories. In this paper, we leverage common semantic categories from survey data and Tableau Public visualizations to identify a set of semantic binning categories. We employ these semantic binning categories in OSCAR: a method for automatically selecting bins based on the inferred semantic type of the field. We conducted a crowdsourced study with 120 participants to better understand user preferences for bins generated by OSCAR vs. binning provided in Tableau. We find that maps and histograms using binned values generated by OSCAR are preferred by users as compared to binning schemes based purely on the statistical properties of the data.

Preprint, 2022, arXiv

Recommendations for Visualization Recommendations: Exploring Preferences and Priorities in Public Health

The promise of visualization recommendation systems is that analysts will be automatically provided with relevant and high-quality visualizations that will reduce the work of manual exploration or chart creation. However, little research to date has focused on what analysts value in the design of visualization recommendations. We interviewed 18 analysts in the public health sector and explored how they made sense of a popular in-domain dataset in service of generating visualizations to recommend to others. We also explored how they interacted with a corpus of both automatically- and manually-generated visualization recommendations, with the goal of uncovering how the design values of these analysts are reflected in current visualization recommendation systems. We find that analysts champion simple charts with clear takeaways that are nonetheless connected with existing semantic information or domain hypotheses. We conclude by recommending that visualization recommendation designers explore ways of integrating context and expectation into their systems.

Preprint, 2021, arXiv

User Ex Machina: Simulation as a Design Probe in Human-in-the-Loop Text Analytics

Topic models are widely used analysis techniques for clustering documents and surfacing thematic elements of text corpora. These models remain challenging to optimize and often require a "human-in-the-loop" approach where domain experts use their knowledge to steer and adjust. However, the fragility, incompleteness, and opacity of these models mean even minor changes could induce large and potentially undesirable changes in the resulting model. In this paper we conduct a simulation-based analysis of human-centered interactions with topic models, with the objective of measuring the sensitivity of topic models to common classes of user actions. We find that user interactions have impacts that differ in magnitude but often negatively affect the quality of the resulting modelling in a way that can be difficult for the user to evaluate. We suggest the incorporation of sensitivity and "multiverse" analyses to topic model interfaces to surface and overcome these deficiencies.

Preprint, 2021, arXiv

Why Shouldn't All Charts Be Scatter Plots? Beyond Precision-Driven Visualizations

A central concept in information visualization research and practice is the notion of visual variable effectiveness, or the perceptual precision at which values are decoded given visual channels of encoding. Formative work from Cleveland & McGill has shown that position along a common axis is the most effective visual variable for comparing individual values. One natural conclusion is that any chart that is not a dot plot or scatterplot is deficient and should be avoided. In this paper we refute a caricature of this "scatterplots only" argument as a way to call for new perspectives on how information visualization is researched, taught, and evaluated.

Preprint, 2020, arXiv

What Do We Actually Learn from Evaluations in the "Heroic Era" of Visualization?

We often point to the relative increase in the amount and sophistication of evaluations of visualization systems versus the earliest days of the field as evidence that we are maturing as a field. I am not so convinced. In particular, I feel that evaluations of visualizations, as they are ordinarily performed in the field or asked for by reviewers, fail to tell us very much that is useful or transferable about visualization systems, regardless of the statistical rigor or ecological validity of the evaluation. Through a series of thought experiments, I show how our current conceptions of visualization evaluations can be incomplete, capricious, or useless for the goal of furthering the field, more in line with the "heroic age" of medical science than the rigorous evidence-based field we might aspire to be. I conclude by suggesting that our models for designing evaluations, and our priorities as a field, should be revisited.