Source author record

Christoph Schmidt

Christoph Schmidt appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computation and Language, eess.AS, Human-Computer Interaction, Sound

Catalog footprint

What is connected

3 works

4 topics

4 close collaborators


Preprint, 2022, arXiv

Human and Automatic Speech Recognition Performance on German Oral History Interviews

Automatic speech recognition systems have accomplished remarkable improvements in transcription accuracy in recent years. In some domains, models now achieve near-human performance. However, transcription performance on oral history has not yet reached human accuracy. In the present work, we investigate how large this gap between human and machine transcription still is. For this purpose, we analyze and compare transcriptions by three humans on a new oral history data set. We estimate a human word error rate (WER) of 8.7% for recent German oral history interviews with clean acoustic conditions. For comparison with recent machine transcription accuracy, we present experiments on the adaptation of an acoustic model achieving near-human performance on broadcast speech. We investigate the influence of different adaptation data on robustness and generalization for clean and noisy oral history interviews. We improve our acoustic models by 5 to 8% relative for this task and achieve a WER of 23.9% on noisy and 15.6% on clean oral history interviews.
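The abstract above reports results as word error rates and as relative improvements. As a rough illustration of how those two quantities are conventionally computed, here is a minimal Python sketch. It assumes the standard definition of WER (word-level edit distance divided by the number of reference words) and simple whitespace tokenization; the function names `word_error_rate` and `relative_reduction` are hypothetical and are not taken from the paper, whose exact scoring setup is not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared verbatim
    (no case folding or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Relative WER reduction of an improved system over a baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - improved_wer) / baseline_wer


if __name__ == "__main__":
    # Toy sentences, not data from the paper: one deleted word out of four.
    print(word_error_rate("das ist ein test", "das ist test"))  # 0.25
```

A relative figure such as "5 to 8% relative" expresses the drop as a fraction of the baseline WER rather than as a difference in percentage points, which is what `relative_reduction` returns.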

Preprint, 2020, arXiv

Multi-Staged Cross-Lingual Acoustic Model Adaption for Robust Speech Recognition in Real-World Applications -- A Case Study on German Oral History Interviews

While recent automatic speech recognition systems achieve remarkable performance when large amounts of adequate, high-quality annotated speech data are used for training, the same systems often achieve only unsatisfactory results for tasks in domains that deviate greatly from the conditions represented by the training data. For many real-world applications, there is a lack of sufficient data that can be used directly for training robust speech recognition systems. To address this issue, we propose and investigate an approach that performs a robust acoustic model adaptation to a target domain in a cross-lingual, multi-staged manner. Our approach enables the exploitation of large-scale training data from other domains in both the same and other languages. We evaluate our approach using the challenging task of German oral history interviews, where we achieve a relative reduction of the word error rate by more than 30% compared to a model trained from scratch only on the target domain, and 6-7% relative compared to a model trained robustly on 1000 hours of same-language out-of-domain training data.

Preprint, 2020, arXiv

Varying Annotations in the Steps of the Visual Analysis

Annotations in Visual Analytics (VA) have become a common means to support the analysis by integrating additional information into the VA system. That additional information often depends on the current process step in the visual analysis. For example, the data preprocessing step involves data structuring operations, while the data exploration step focuses on user interaction and input. Describing suitable annotations to meet the goals of the different steps is challenging. To tackle this issue, we identify individual annotations for each step and outline their gathering and design properties for the visual analysis of heterogeneous clinical data. We integrate our annotation design into a visual analysis tool to show its applicability to data from the ophthalmic domain. In interviews and application sessions with experts, we assess its usefulness for the analysis of patients with different medications.
