Source author record

Enrico Bertini

Enrico Bertini appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Human-Computer Interaction Machine Learning Artificial Intelligence cs.CY

Catalog footprint

What is connected

14works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Cognitive Affordances in Visualization: Related Constructs, Design Factors, and Framework

Classically, affordance research investigates how the shape of objects communicates actions to potential users. Cognitive affordances, a subset of this research, characterize how the design of objects influences cognitive actions, such as information processing. Within visualization, cognitive affordances inform how graphs' design decisions communicate information to their readers. Although several related concepts exist in visualization, a formal translation of affordance theory to visualization is still lacking. In this paper, we review and translate affordance theory to visualization by formalizing how cognitive affordances operate within a visualization context. We also review common methods and terms, and compare related constructs to cognitive affordances in visualization. Based on a synthesis of research from psychology, human computer interaction, and visualization, we propose a framework of cognitive affordances in visualization that enumerates design decisions and reader characteristics that influence a visualization's hierarchy of communicated information. Finally, we demonstrate how this framework can guide the evaluation and redesign of visualizations.

preprint2022arXiv

Misinformed by Visualization: What Do We Learn From Misinformative Visualizations?

Data visualization is powerful in persuading an audience. However, when it is done poorly or maliciously, a visualization may become misleading or even deceiving. Visualizations give further strength to the dissemination of misinformation on the Internet. The visualization research community has long been aware of visualizations that misinform the audience, mostly associated with the terms "lie" and "deceptive." Still, these discussions have focused only on a handful of cases. To better understand the landscape of misleading visualizations, we open-coded over one thousand real-world visualizations that have been reported as misleading. From these examples, we discovered 74 types of issues and formed a taxonomy of misleading elements in visualizations. We found four directions that the research community can follow to widen the discussion on misleading visualizations: (1) informal fallacies in visualizations, (2) exploiting conventions and data literacy, (3) deceptive tricks in uncommon charts, and (4) understanding the designers' dilemma. This work lays the groundwork for these research directions, especially in understanding, detecting, and preventing them.

preprint2022arXiv

Visual Exploration of Machine Learning Model Behavior with Hierarchical Surrogate Rule Sets

One of the potential solutions for model interpretation is to train a surrogate model: a more transparent model that approximates the behavior of the model to be explained. Typically, classification rules or decision trees are used due to the intelligibility of their logic-based expressions. However, decision trees can grow too deep and rule sets can become too large to approximate a complex model. Unlike paths on a decision tree that must share ancestor nodes (conditions), rules are more flexible. However, the unstructured visual representation of rules makes it hard to make inferences across rules. To address these issues, we present a workflow that includes novel algorithmic and interactive solutions. First, we present Hierarchical Surrogate Rules (HSR), an algorithm that generates hierarchical rules based on user-defined parameters. We also contribute SuRE, a visual analytics (VA) system that integrates HSR and interactive surrogate rule visualizations. Particularly, we present a novel feature-aligned tree to overcome the shortcomings of existing rule visualizations. We evaluate the algorithm in terms of parameter sensitivity, time performance, and comparison with surrogate decision trees and find that it scales reasonably well and outperforms decision trees in many respects. We also evaluate the visualization and the VA system by a usability study with 24 volunteers and an observational study with 7 domain experts. Our investigation shows that the participants can use feature-aligned trees to perform non-trivial tasks with very high accuracy. We also discuss many interesting observations that can be useful for future research on designing effective rule-based VA systems.

preprint2021arXiv

Mapping the Landscape of COVID-19 Crisis Visualizations

In response to COVID-19, a vast number of visualizations have been created to communicate information to the public. Information exposure in a public health crisis can impact people's attitudes towards and responses to the crisis and risks, and ultimately the trajectory of a pandemic. As such, there is a need for work that documents, organizes, and investigates what COVID-19 visualizations have been presented to the public. We address this gap through an analysis of 668 COVID-19 visualizations. We present our findings through a conceptual framework derived from our analysis, that examines who, (uses) what data, (to communicate) what messages, in what form, under what circumstances in the context of COVID-19 crisis visualizations. We provide a set of factors to be considered within each component of the framework. We conclude with directions for future crisis visualization research.

preprint2021arXiv

Visualizing Rule Sets: Exploration and Validation of a Design Space

Rule sets are often used in Machine Learning (ML) as a way to communicate the model logic in settings where transparency and intelligibility are necessary. Rule sets are typically presented as a text-based list of logical statements (rules). Surprisingly, to date there has been limited work on exploring visual alternatives for presenting rules. In this paper, we explore the idea of designing alternative representations of rules, focusing on a number of visual factors we believe have a positive impact on rule readability and understanding. The paper presents an initial design space for visualizing rule sets and a user study exploring their impact. The results show that some design factors have a strong impact on how efficiently readers can process the rules while having minimal impact on accuracy. This work can help practitioners employ more effective solutions when using rules as a communication strategy to understand ML models.

preprint2021arXiv

Why Shouldn't All Charts Be Scatter Plots? Beyond Precision-Driven Visualizations

A central concept in information visualization research and practice is the notion of visual variable effectiveness, or the perceptual precision at which values are decoded given visual channels of encoding. Formative work from Cleveland & McGill has shown that position along a common axis is the most effective visual variable for comparing individual values. One natural conclusion is that any chart that is not a dot plot or scatterplot is deficient and should be avoided. In this paper we refute a caricature of this "scatterplots only" argument as a way to call for new perspectives on how information visualization is researched, taught, and evaluated.

preprint2020arXiv

Human Factors in Model Interpretability: Industry Practices, Challenges, and Needs

As the use of machine learning (ML) models in product development and data-driven decision-making processes became pervasive in many domains, people's focus on building a well-performing model has increasingly shifted to understanding how their model works. While scholarly interest in model interpretability has grown rapidly in research communities like HCI, ML, and beyond, little is known about how practitioners perceive and aim to provide interpretability in the context of their existing workflows. This lack of understanding of interpretability as practiced may prevent interpretability research from addressing important needs, or lead to unrealistic solutions. To bridge this gap, we conducted 22 semi-structured interviews with industry practitioners to understand how they conceive of and design for interpretability while they plan, build, and use their models. Based on a qualitative analysis of our results, we differentiate interpretability roles, processes, goals and strategies as they exist within organizations making heavy use of ML models. The characterization of interpretability work that emerges from our analysis suggests that model interpretability frequently involves cooperation and mental model comparison between people in different roles, often aimed at building trust not only between people and models but also between people within the organization. We present implications for design that discuss gaps between the interpretability challenges that practitioners face in their practice and approaches proposed in the literature, highlighting possible research directions that can better address real-world needs.

preprint2020arXiv

Melody: Generating and Visualizing Machine Learning Model Summary to Understand Data and Classifiers Together

With the increasing sophistication of machine learning models, there are growing trends of developing model explanation techniques that focus on only one instance (local explanation) to ensure faithfulness to the original model. While these techniques provide accurate model interpretability on various data primitive (e.g., tabular, image, or text), a holistic Explainable Artificial Intelligence (XAI) experience also requires a global explanation of the model and dataset to enable sensemaking in different granularity. Thus, there is a vast potential in synergizing the model explanation and visual analytics approaches. In this paper, we present MELODY, an interactive algorithm to construct an optimal global overview of the model and data behavior by summarizing the local explanations using information theory. The result (i.e., an explanation summary) does not require additional learning models, restrictions of data primitives, or the knowledge of machine learning from the users. We also design MELODY UI, an interactive visual analytics system to demonstrate how the explanation summary connects the dots in various XAI tasks from a global overview to local inspections. We present three usage scenarios regarding tabular, image, and text classifications to illustrate how to generalize model interpretability of different data. Our experiments show that our approaches: (1) provides a better explanation summary compared to a straightforward information-theoretic summarization and (2) achieves a significant speedup in the end-to-end data modeling pipeline.

preprint2020arXiv

PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines

In recent years, a wide variety of automated machine learning (AutoML) methods have been proposed to search and generate end-to-end learning pipelines. While these techniques facilitate the creation of models for real-world applications, given their black-box nature, the complexity of the underlying algorithms, and the large number of pipelines they derive, it is difficult for their developers to debug these systems. It is also challenging for machine learning experts to select an AutoML system that is well suited for a given problem or class of problems. In this paper, we present the PipelineProfiler, an interactive visualization tool that allows the exploration and comparison of the solution space of machine learning (ML) pipelines produced by AutoML systems. PipelineProfiler is integrated with Jupyter Notebook and can be used together with common data science tools to enable a rich set of analyses of the ML pipelines and provide insights about the algorithms that generated them. We demonstrate the utility of our tool through several use cases where PipelineProfiler is used to better understand and improve a real-world AutoML system. Furthermore, we validate our approach by presenting a detailed analysis of a think-aloud experiment with six data scientists who develop and evaluate AutoML tools.

preprint2020arXiv

Towards Evaluating Exploratory Model Building Process with AutoML Systems

The use of Automated Machine Learning (AutoML) systems are highly open-ended and exploratory. While rigorously evaluating how end-users interact with AutoML is crucial, establishing a robust evaluation methodology for such exploratory systems is challenging. First, AutoML is complex, including multiple sub-components that support a variety of sub-tasks for synthesizing ML pipelines, such as data preparation, problem specification, and model generation, making it difficult to yield insights that tell us which components were successful or not. Second, because the usage pattern of AutoML is highly exploratory, it is not possible to rely solely on widely used task efficiency and effectiveness metrics as success metrics. To tackle the challenges in evaluation, we propose an evaluation methodology that (1) guides AutoML builders to divide their AutoML system into multiple sub-system components, and (2) helps them reason about each component through visualization of end-users' behavioral patterns and attitudinal data. We conducted a study to understand when, how, why, and applying our methodology can help builders to better understand their systems and end-users. We recruited 3 teams of professional AutoML builders. The teams prepared their own systems and let 41 end-users use the systems. Using our methodology, we visualized end-users' behavioral and attitudinal data and distributed the results to the teams. We analyzed the results in two directions: what types of novel insights the AutoML builders learned from end-users, and (2) how the evaluation methodology helped the builders to understand workflows and the effectiveness of their systems. Our findings suggest new insights explaining future design opportunities in the AutoML domain as well as how using our methodology helped the builders to determine insights and let them draw concrete directions for improving their systems.

preprint2020arXiv

Towards Ground Truth Explainability on Tabular Data

In data science, there is a long history of using synthetic data for method development, feature selection and feature engineering. Our current interest in synthetic data comes from recent work in explainability. Today's datasets are typically larger and more complex - requiring less interpretable models. In the setting of \textit{post hoc} explainability, there is no ground truth for explanations. Inspired by recent work in explaining image classifiers that does provide ground truth, we propose a similar solution for tabular data. Using copulas, a concise specification of the desired statistical properties of a dataset, users can build intuition around explainability using controlled data sets and experimentation. The current capabilities are demonstrated on three use cases: one dimensional logistic regression, impact of correlation from informative features, impact of correlation from redundant variables.

preprint2020arXiv

ViCE: Visual Counterfactual Explanations for Machine Learning Models

The continued improvements in the predictive accuracy of machine learning models have allowed for their widespread practical application. Yet, many decisions made with seemingly accurate models still require verification by domain experts. In addition, end-users of a model also want to understand the reasons behind specific decisions. Thus, the need for interpretability is increasingly paramount. In this paper we present an interactive visual analytics tool, ViCE, that generates counterfactual explanations to contextualize and evaluate model decisions. Each sample is assessed to identify the minimal set of changes needed to flip the model's output. These explanations aim to provide end-users with personalized actionable insights with which to understand, and possibly contest or improve, automated decisions. The results are effectively displayed in a visual interface where counterfactual explanations are highlighted and interactive methods are provided for users to explore the data and model. The functionality of the tool is demonstrated by its application to a home equity line of credit dataset.

preprint2016arXiv

RioBusData: Outlier Detection in Bus Routes of Rio de Janeiro

Buses are the primary means of public transportation in the city of Rio de Janeiro, carrying around 100 million passengers every month. Recently, real-time GPS coordinates of all operating public buses has been made publicly available - roughly 1 million GPS entries each captured each day. In an initial study, we observed that a substantial number of buses follow trajectories that do not follow the expected behavior. In this paper, we present RioBusData, a tool that helps users identify and explore, through different visualizations, the behavior of outlier trajectories. We describe how the system automatically detects these outliers using a Convolutional Neural Network (CNN) and we also discuss a series of case studies which show how RioBusData helps users better understand not only the flow and service of outlier buses but also the bus system as a whole.

preprint2016arXiv

Using Visual Analytics to Interpret Predictive Machine Learning Models

It is commonly believed that increasing the interpretability of a machine learning model may decrease its predictive power. However, inspecting input-output relationships of those models using visual analytics, while treating them as black-box, can help to understand the reasoning behind outcomes without sacrificing predictive quality. We identify a space of possible solutions and provide two examples of where such techniques have been successfully used in practice.

Enrico Bertini

What is connected

Connect this record

See the researcher in context

Building this map preview

14 published item(s)

Cognitive Affordances in Visualization: Related Constructs, Design Factors, and Framework

Misinformed by Visualization: What Do We Learn From Misinformative Visualizations?

Visual Exploration of Machine Learning Model Behavior with Hierarchical Surrogate Rule Sets

Mapping the Landscape of COVID-19 Crisis Visualizations

Visualizing Rule Sets: Exploration and Validation of a Design Space

Why Shouldn't All Charts Be Scatter Plots? Beyond Precision-Driven Visualizations

Human Factors in Model Interpretability: Industry Practices, Challenges, and Needs

Melody: Generating and Visualizing Machine Learning Model Summary to Understand Data and Classifiers Together

PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines

Towards Evaluating Exploratory Model Building Process with AutoML Systems

Towards Ground Truth Explainability on Tabular Data

ViCE: Visual Counterfactual Explanations for Machine Learning Models

RioBusData: Outlier Detection in Bus Routes of Rio de Janeiro

Using Visual Analytics to Interpret Predictive Machine Learning Models