Source author record

Bhavya Ghai

Bhavya Ghai appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Human-Computer Interaction Artificial Intelligence Machine Learning cs.CY Methodology

Catalog footprint

What is connected

5works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Cascaded Debiasing: Studying the Cumulative Effect of Multiple Fairness-Enhancing Interventions

Understanding the cumulative effect of multiple fairness enhancing interventions at different stages of the machine learning (ML) pipeline is a critical and underexplored facet of the fairness literature. Such knowledge can be valuable to data scientists/ML practitioners in designing fair ML pipelines. This paper takes the first step in exploring this area by undertaking an extensive empirical study comprising 60 combinations of interventions, 9 fairness metrics, 2 utility metrics (Accuracy and F1 Score) across 4 benchmark datasets. We quantitatively analyze the experimental data to measure the impact of multiple interventions on fairness, utility and population groups. We found that applying multiple interventions results in better fairness and lower utility than individual interventions on aggregate. However, adding more interventions do no always result in better fairness or worse utility. The likelihood of achieving high performance (F1 Score) along with high fairness increases with larger number of interventions. On the downside, we found that fairness-enhancing interventions can negatively impact different population groups, especially the privileged group. This study highlights the need for new fairness metrics that account for the impact on different population groups apart from just the disparity between groups. Lastly, we offer a list of combinations of interventions that perform best for different fairness and utility metrics to aid the design of fair ML pipelines.

preprint2022arXiv

D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias

With the rise of AI, algorithms have become better at learning underlying patterns from the training data including ingrained social biases based on gender, race, etc. Deployment of such algorithms to domains such as hiring, healthcare, law enforcement, etc. has raised serious concerns about fairness, accountability, trust and interpretability in machine learning algorithms. To alleviate this problem, we propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases from tabular datasets. It uses a graphical causal model to represent causal relationships among different features in the dataset and as a medium to inject domain knowledge. A user can detect the presence of bias against a group, say females, or a subgroup, say black females, by identifying unfair causal relationships in the causal network and using an array of fairness metrics. Thereafter, the user can mitigate bias by acting on the unfair causal edges. For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset based on the current causal model. Users can visually assess the impact of their interactions on different fairness metrics, utility metrics, data distortion, and the underlying data distribution. Once satisfied, they can download the debiased dataset and use it for any downstream application for fairer predictions. We evaluate D-BIAS by conducting experiments on 3 datasets and also a formal user study. We found that D-BIAS helps reduce bias significantly compared to the baseline debiasing approach across different fairness metrics while incurring little data distortion and a small loss in utility. Moreover, our human-in-the-loop based approach significantly outperforms an automated approach on trust, interpretability and accountability.

preprint2022arXiv

DramatVis Personae: Visual Text Analytics for Identifying Social Biases in Creative Writing

Implicit biases and stereotypes are often pervasive in different forms of creative writing such as novels, screenplays, and children's books. To understand the kind of biases writers are concerned about and how they mitigate those in their writing, we conducted formative interviews with nine writers. The interviews suggested that despite a writer's best interest, tracking and managing implicit biases such as a lack of agency, supporting or submissive roles, or harmful language for characters representing marginalized groups is challenging as the story becomes longer and complicated. Based on the interviews, we developed DramatVis Personae (DVP), a visual analytics tool that allows writers to assign social identities to characters, and evaluate how characters and different intersectional social identities are represented in the story. To evaluate DVP, we first conducted think-aloud sessions with three writers and found that DVP is easy-to-use, naturally integrates into the writing process, and could potentially help writers in several critical bias identification tasks. We then conducted a follow-up user study with 11 writers and found that participants could answer questions related to bias detection more efficiently using DVP in comparison to a simple text editor.

preprint2020arXiv

Active Learning++: Incorporating Annotator's Rationale using Local Model Explanation

We propose a new active learning (AL) framework, Active Learning++, which can utilize an annotator's labels as well as its rationale. Annotators can provide their rationale for choosing a label by ranking input features based on their importance for a given query. To incorporate this additional input, we modified the disagreement measure for a bagging-based Query by Committee (QBC) sampling strategy. Instead of weighing all committee models equally to select the next instance, we assign higher weight to the committee model with higher agreement with the annotator's ranking. Specifically, we generated a feature importance-based local explanation for each committee model. The similarity score between feature rankings provided by the annotator and the local model explanation is used to assign a weight to each corresponding committee model. This approach is applicable to any kind of ML model using model-agnostic techniques to generate local explanation such as LIME. With a simulation study, we show that our framework significantly outperforms a QBC based vanilla AL framework.

preprint2020arXiv

Measuring Social Biases of Crowd Workers using Counterfactual Queries

Social biases based on gender, race, etc. have been shown to pollute machine learning (ML) pipeline predominantly via biased training datasets. Crowdsourcing, a popular cost-effective measure to gather labeled training datasets, is not immune to the inherent social biases of crowd workers. To ensure such social biases aren't passed onto the curated datasets, it's important to know how biased each crowd worker is. In this work, we propose a new method based on counterfactual fairness to quantify the degree of inherent social bias in each crowd worker. This extra information can be leveraged together with individual worker responses to curate a less biased dataset.

Bhavya Ghai

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

Cascaded Debiasing: Studying the Cumulative Effect of Multiple Fairness-Enhancing Interventions

D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias

DramatVis Personae: Visual Text Analytics for Identifying Social Biases in Creative Writing

Active Learning++: Incorporating Annotator's Rationale using Local Model Explanation

Measuring Social Biases of Crowd Workers using Counterfactual Queries