Source author record

Daniel Luna

Daniel Luna appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Catalog footprint

What is connected

1 work
1 topic
4 close collaborators

Published work

1 published item

Preprint, 2022, arXiv

Impact of class imbalance on chest x-ray classifiers: towards better evaluation practices for discrimination and calibration performance

This work analyzes the standard evaluation practices that the research community adopts when assessing chest x-ray classifiers, focusing on the impact of class imbalance on such appraisals. Our analysis considers a comprehensive definition of model performance, covering not only discriminative performance but also model calibration, a topic that has received increasing attention in recent years within the machine learning community. First, we conducted a literature study of common scientific practices and confirmed that: (1) even when dealing with highly imbalanced datasets, the community tends to use metrics that are dominated by the majority class; and (2) it is still uncommon to include calibration studies for chest x-ray classifiers, despite their importance in the context of healthcare. Second, we performed a systematic experiment on two major chest x-ray datasets to explore the behavior of several performance metrics under different class ratios, and show that widely adopted metrics can conceal performance on the minority class. Finally, we recommend including complementary metrics to better reflect the system's performance in such scenarios. Our study indicates that current evaluation practices adopted by the research community for chest x-ray computer-aided diagnosis systems may not reflect their performance in real clinical scenarios, and suggests alternatives to improve this situation.
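
The abstract's point that majority-dominated metrics can conceal minority-class performance can be illustrated with a small sketch. This is not the paper's experiment: the synthetic labels, the trivial "always negative" classifier, and every function name below are assumptions chosen only for illustration. It compares plain accuracy with recall and balanced accuracy as the positive class becomes rarer.

```python
# Illustrative sketch only: synthetic labels and a trivial classifier,
# not the datasets, models, or metrics used in the paper.

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def summarize(y_true, y_pred):
    """Compute accuracy, recall (sensitivity), and balanced accuracy."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        "balanced_accuracy": (recall + specificity) / 2,
    }

def make_labels(n, positive_fraction):
    """Build n synthetic labels with the requested share of positives."""
    n_pos = round(n * positive_fraction)
    return [1] * n_pos + [0] * (n - n_pos)

def always_negative(y_true):
    """Hypothetical classifier that never predicts the positive class."""
    return [0] * len(y_true)

# Accuracy rises as positives get rarer, although no positive is ever found.
for fraction in (0.5, 0.1, 0.01):
    labels = make_labels(1000, fraction)
    scores = summarize(labels, always_negative(labels))
    print(fraction, scores)
```

In this toy setting accuracy equals the share of negatives, so it improves as the imbalance grows, while recall stays at zero and balanced accuracy stays at 0.5. Class-aware metrics of this kind are one example of a complementary measure; the sketch does not cover calibration.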