Source author record

Joao Rulff

Joao Rulff appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Human-Computer Interaction Machine Learning eess.AS Sound Artificial Intelligence cs.CY

Catalog footprint

What is connected

4works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

A Study on Robustness to Perturbations for Representations of Environmental Sound

Audio applications involving environmental sound analysis increasingly use general-purpose audio representations, also known as embeddings, for transfer learning. Recently, Holistic Evaluation of Audio Representations (HEAR) evaluated twenty-nine embedding models on nineteen diverse tasks. However, the evaluation's effectiveness depends on the variation already captured within a given dataset. Therefore, for a given data domain, it is unclear how the representations would be affected by the variations caused by myriad microphones' range and acoustic conditions -- commonly known as channel effects. We aim to extend HEAR to evaluate invariance to channel effects in this work. To accomplish this, we imitate channel effects by injecting perturbations to the audio signal and measure the shift in the new (perturbed) embeddings with three distance measures, making the evaluation domain-dependent but not task-dependent. Combined with the downstream performance, it helps us make a more informed prediction of how robust the embeddings are to the channel effects. We evaluate two embeddings -- YAMNet, and OpenL3 on monophonic (UrbanSound8K) and polyphonic (SONYC-UST) urban datasets. We show that one distance measure does not suffice in such task-independent evaluation. Although Fréchet Audio Distance (FAD) correlates with the trend of the performance drop in the downstream task most accurately, we show that we need to study FAD in conjunction with the other distances to get a clear understanding of the overall effect of the perturbation. In terms of the embedding performance, we find OpenL3 to be more robust than YAMNet, which aligns with the HEAR evaluation.

preprint2022arXiv

Calibrate: Interactive Analysis of Probabilistic Model Output

Analyzing classification model performance is a crucial task for machine learning practitioners. While practitioners often use count-based metrics derived from confusion matrices, like accuracy, many applications, such as weather prediction, sports betting, or patient risk prediction, rely on a classifier's predicted probabilities rather than predicted labels. In these instances, practitioners are concerned with producing a calibrated model, that is, one which outputs probabilities that reflect those of the true distribution. Model calibration is often analyzed visually, through static reliability diagrams, however, the traditional calibration visualization may suffer from a variety of drawbacks due to the strong aggregations it necessitates. Furthermore, count-based approaches are unable to sufficiently analyze model calibration. We present Calibrate, an interactive reliability diagram that addresses the aforementioned issues. Calibrate constructs a reliability diagram that is resistant to drawbacks in traditional approaches, and allows for interactive subgroup analysis and instance-level inspection. We demonstrate the utility of Calibrate through use cases on both real-world and synthetic data. We further validate Calibrate by presenting the results of a think-aloud experiment with data scientists who routinely analyze model calibration.

preprint2022arXiv

ggViz: Accelerating Large-Scale Esports Game Analysis

While esports organizations are increasingly adopting practices of conventional sports teams, such as dedicated analysts and data-driven decision-making, video-based game review is still the primary mode of game analysis. In conventional sports, advances in data collection have introduced systems that allow for sketch-based querying of game situations. However, due to data limitations, as well as differences in the sport itself, esports has seen a dearth of such systems. In this paper, we leverage player tracking data for Counter-Strike: Global Offensive (CSGO) to develop ggViz, a visual analytics system that allows users to query a large esports data set through game state sketches to find similar game states. Users are guided to game states of interest using win probability charts and round icons, and can summarize collections of states through heatmaps. We motivate our design through interviews with esports experts to especially address the issue of game review. We demonstrate ggViz's utility through detailed case studies and expert interviews with coaches, managers, and analysts from professional esports teams.

preprint2022arXiv

Urban Rhapsody: Large-scale exploration of urban soundscapes

Noise is one of the primary quality-of-life issues in urban environments. In addition to annoyance, noise negatively impacts public health and educational performance. While low-cost sensors can be deployed to monitor ambient noise levels at high temporal resolutions, the amount of data they produce and the complexity of these data pose significant analytical challenges. One way to address these challenges is through machine listening techniques, which are used to extract features in attempts to classify the source of noise and understand temporal patterns of a city's noise situation. However, the overwhelming number of noise sources in the urban environment and the scarcity of labeled data makes it nearly impossible to create classification models with large enough vocabularies that capture the true dynamism of urban soundscapes In this paper, we first identify a set of requirements in the yet unexplored domain of urban soundscape exploration. To satisfy the requirements and tackle the identified challenges, we propose Urban Rhapsody, a framework that combines state-of-the-art audio representation, machine learning, and visual analytics to allow users to interactively create classification models, understand noise patterns of a city, and quickly retrieve and label audio excerpts in order to create a large high-precision annotated database of urban sound recordings. We demonstrate the tool's utility through case studies performed by domain experts using data generated over the five-year deployment of a one-of-a-kind sensor network in New York City.

Joao Rulff

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

A Study on Robustness to Perturbations for Representations of Environmental Sound

Calibrate: Interactive Analysis of Probabilistic Model Output

ggViz: Accelerating Large-Scale Esports Game Analysis

Urban Rhapsody: Large-scale exploration of urban soundscapes