Reproducible Subjective Evaluation

Human perceptual studies are the gold standard for the evaluation of many research tasks in machine learning, linguistics, and psychology. However, these studies require significant time and cost to perform. As a result, many researchers use objective measures that can correlate poorly with human evaluation. When subjective evaluations are performed, they are often not reported with sufficient detail to ensure reproducibility. We propose Reproducible Subjective Evaluation (ReSEval), an open-source framework for quickly deploying crowdsourced subjective evaluations directly from Python. ReSEval lets researchers launch A/B, ABX, Mean Opinion Score (MOS) and MUltiple Stimuli with Hidden Reference and Anchor (MUSHRA) tests on audio, image, text, or video data from a command-line interface or using one line of Python, making it as easy to run as objective evaluation. With ReSEval, researchers can reproduce each other's subjective evaluations by sharing a configuration file and the audio, image, text, or video files.

Authors: Max Morrison, Brian Tang, Gefei Tan, Bryan Pardo
Topics: Machine Learning, Human-Computer Interaction

preprint / 2022
