Joachim Köhler

A Study on the Ambiguity in Human Annotation of German Oral History Interviews for Perceived Emotion Recognition and Sentiment Analysis

Jan 18, 2022
Michael Gref, Nike Matthiesen, Sreenivasa Hikkal Venugopala, Shalaka Satheesh, Aswinkumar Vijayananth, Duc Bach Ha, Sven Behnke, Joachim Köhler

In research on audiovisual interview archives, it is often of interest not only what is said but also how it is said. Sentiment analysis and emotion recognition can help capture and categorize these different facets and make them searchable. In particular, for oral history archives, such indexing technologies can be of great interest. These technologies can help understand the role of emotions in historical remembering. However, humans often perceive sentiments and emotions ambiguously and subjectively. Moreover, oral history interviews have multi-layered levels of complex, sometimes contradictory, sometimes very subtle facets of emotions. Therefore, the question arises of what chance machines and humans have of capturing these facets and assigning them to predefined categories. This paper investigates the ambiguity in human perception of emotions and sentiment in German oral history interviews and the impact on machine learning systems. Our experiments reveal substantial differences in human perception for different emotions. Furthermore, we report on ongoing machine learning experiments with different modalities. We show that the human perceptual ambiguity and other challenges, such as class imbalance and lack of training data, currently limit the opportunities of these technologies for oral history archives. Nonetheless, our work uncovers promising observations and possibilities for further research.

* Submitted to LREC 2022 

Human and Automatic Speech Recognition Performance on German Oral History Interviews

Jan 18, 2022
Michael Gref, Nike Matthiesen, Christoph Schmidt, Sven Behnke, Joachim Köhler

Automatic speech recognition systems have accomplished remarkable improvements in transcription accuracy in recent years. In some domains, models now achieve near-human performance. However, transcription performance on oral history has not yet reached human accuracy. In the present work, we investigate how large this gap between human and machine transcription still is. For this purpose, we analyze and compare transcriptions of three humans on a new oral history data set. We estimate a human word error rate (WER) of 8.7% for recent German oral history interviews with clean acoustic conditions. For comparison with recent machine transcription accuracy, we present experiments on the adaptation of an acoustic model achieving near-human performance on broadcast speech. We investigate the influence of different adaptation data on robustness and generalization for clean and noisy oral history interviews. We improve our acoustic models by 5 to 8% relative for this task and achieve a WER of 23.9% on noisy and 15.6% on clean oral history interviews.

* Submitted to LREC 2022 
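
The word error rate quoted in this abstract is conventionally the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly;
    real evaluations usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of four reference words.
    print(word_error_rate("das ist ein test", "das ist test"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.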

Tab.IAIS: Flexible Table Recognition and Semantic Interpretation System

May 25, 2021
Marcin Namysl, Alexander M. Esser, Sven Behnke, Joachim Köhler

Table extraction is an important but still unsolved problem. In this paper, we introduce a flexible end-to-end table extraction system. We develop two rule-based algorithms that perform the complete table recognition process and support the most frequent table formats found in the scientific literature. Moreover, to incorporate the extraction of semantic information into the table recognition process, we develop a graph-based table interpretation method. We conduct extensive experiments on the challenging table recognition benchmarks ICDAR 2013 and ICDAR 2019. Our table recognition approach achieves results competitive with state-of-the-art approaches. Moreover, our complete information extraction system exhibited a high F1 score of 0.7380, demonstrating the utility of our approach.

* 14 pages, 7 figures, 3 tables 

Empirical Error Modeling Improves Robustness of Noisy Neural Sequence Labeling

May 25, 2021
Marcin Namysl, Sven Behnke, Joachim Köhler

Despite recent advances, standard sequence labeling systems often fail when processing noisy user-generated text or consuming the output of an Optical Character Recognition (OCR) process. In this paper, we improve the noise-aware training method by proposing an empirical error generation approach that employs a sequence-to-sequence model trained to perform translation from error-free to erroneous text. Using an OCR engine, we generated a large parallel text corpus for training and produced several real-world noisy sequence labeling benchmarks for evaluation. Moreover, to overcome the data sparsity problem that is exacerbated in the case of imperfect textual input, we learned noisy language model-based embeddings. Our approach outperformed the baseline noise generation and error correction techniques on the erroneous sequence labeling data sets. To facilitate future research on robustness, we make our code, embeddings, and data conversion scripts publicly available.

* Accepted to appear in Findings of ACL 2021 (camera-ready version) 

NAT: Noise-Aware Training for Robust Neural Sequence Labeling

May 14, 2020
Marcin Namysl, Sven Behnke, Joachim Köhler

Sequence labeling systems should perform reliably not only under ideal conditions but also with corrupted inputs, as these systems often process user-generated text or follow an error-prone upstream component. To this end, we formulate the noisy sequence labeling problem, where the input may undergo an unknown noising process, and propose two Noise-Aware Training (NAT) objectives that improve robustness of sequence labeling performed on perturbed input: our data augmentation method trains a neural model using a mixture of clean and noisy samples, whereas our stability training algorithm encourages the model to create a noise-invariant latent representation. We employ a vanilla noise model at training time. For evaluation, we use both the original data and its variants perturbed with real OCR errors and misspellings. Extensive experiments on English and German named entity recognition benchmarks confirmed that NAT consistently improved robustness of popular sequence labeling models, preserving accuracy on the original input. We make our code and data publicly available for the research community.

* Accepted to appear at ACL 2020 
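
The data augmentation idea in this abstract, training on a mixture of clean and noisy samples, can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the vanilla noise model, so the random character substitutions, insertions, and deletions below, as well as the names `perturb_token` and `augment` and the `noise_rate` parameter, are hypothetical choices made for illustration.

```python
import random
import string


def perturb_token(token, rng, noise_rate=0.1, alphabet=string.ascii_lowercase):
    """Apply random character edits to one token (assumed noise model)."""
    out = []
    for ch in token:
        if rng.random() < noise_rate:
            op = rng.choice(("substitute", "delete", "insert"))
            if op == "substitute":
                out.append(rng.choice(alphabet))
            elif op == "insert":
                out.append(ch)
                out.append(rng.choice(alphabet))
            # "delete": the character is simply dropped
        else:
            out.append(ch)
    # Fall back to the clean token if every character was deleted, so the
    # token count (and therefore the label alignment) never changes.
    return "".join(out) or token


def augment(sentences, rng, noise_rate=0.1):
    """Return each clean sentence followed by one noisy copy.

    `sentences` is a list of (tokens, labels) pairs; labels are reused
    unchanged because the noise only alters characters inside tokens.
    """
    mixed = []
    for tokens, labels in sentences:
        mixed.append((tokens, labels))
        noisy = [perturb_token(t, rng, noise_rate) for t in tokens]
        mixed.append((noisy, labels))
    return mixed


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for repeatable noise
    data = [(["Angela", "Merkel", "visited", "Bonn"],
             ["B-PER", "I-PER", "O", "B-LOC"])]
    for tokens, labels in augment(data, rng, noise_rate=0.3):
        print(list(zip(tokens, labels)))
```

Keeping one token per label is the design constraint that matters here: noise is applied inside tokens only, so the gold label sequence stays valid for the perturbed copy.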

Towards an Interoperable Ecosystem of AI and LT Platforms: A Roadmap for the Implementation of Different Levels of Interoperability

Apr 17, 2020
Georg Rehm, Dimitrios Galanis, Penny Labropoulou, Stelios Piperidis, Martin Welß, Ricardo Usbeck, Joachim Köhler, Miltos Deligiannis, Katerina Gkirtzou, Johannes Fischer, Christian Chiarcos, Nils Feldhus, Julián Moreno-Schneider, Florian Kintzel, Elena Montiel, Víctor Rodríguez Doncel, John P. McCrae, David Laqua, Irina Patricia Theile, Christian Dittmar, Kalina Bontcheva, Ian Roberts, Andrejs Vasiljevs, Andis Lagzdiņš

With regard to the wider area of AI/LT platform interoperability, we concentrate on two core aspects: (1) cross-platform search and discovery of resources and services; (2) composition of cross-platform service workflows. We devise five different levels (of increasing complexity) of platform interoperability that we suggest implementing in a wider federation of AI/LT platforms. We illustrate the approach using the five emerging AI/LT platforms AI4EU, ELG, Lynx, QURATOR and SPEAKER.

* Proceedings of the 1st International Workshop on Language Technology Platforms (IWLTP 2020). To appear 

The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe

Mar 30, 2020
Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajič, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz, José Manuel Gómez Pérez, Luc Meertens, Paul Lukowicz, Josef van Genabith, Andrea Lösch, Philipp Slusallek, Morten Irgens, Patrick Gatellier, Joachim Köhler, Laure Le Bars, Dimitra Anastasiou, Albina Auksoriūtė, Núria Bel, António Branco, Gerhard Budin, Walter Daelemans, Koenraad De Smedt, Radovan Garabík, Maria Gavriilidou, Dagmar Gromann, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Jan Odijk, Maciej Ogrodniczuk, Eiríkur Rögnvaldsson, Mike Rosner, Bolette Sandford Pedersen, Inguna Skadiņa, Marko Tadić, Dan Tufiş, Tamás Váradi, Kadri Vider, Andy Way, François Yvon

Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality. However, language barriers impacting business, cross-lingual and cross-cultural communication are still omnipresent. Language Technologies (LTs) are a powerful means to break down these barriers. While the last decade has seen various initiatives that created a multitude of approaches and technologies tailored to Europe's specific needs, there is still an immense level of fragmentation. At the same time, AI has become an increasingly important concept in the European Information and Communication Technology area. For a few years now, AI, including many opportunities, synergies but also misconceptions, has been overshadowing every other topic. We present an overview of the European LT landscape, describing funding programmes, activities, actions and challenges in the different countries with regard to LT, including the current state of play in industry and the LT market. We present a brief overview of the main LT-related activities on the EU level in the last ten years and develop strategic guidance with regard to four key dimensions.

* Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). To appear 

Two-Staged Acoustic Modeling Adaption for Robust Speech Recognition by the Example of German Oral History Interviews

Aug 19, 2019
Michael Gref, Christoph Schmidt, Sven Behnke, Joachim Köhler

In automatic speech recognition, often little training data is available for specific challenging tasks, but training of state-of-the-art automatic speech recognition systems requires large amounts of annotated speech. To address this issue, we propose a two-staged approach to acoustic modeling that combines noise and reverberation data augmentation with transfer learning to robustly address challenges such as difficult acoustic recording conditions, spontaneous speech, and speech of elderly people. We evaluate our approach using the example of German oral history interviews, where a relative average reduction of the word error rate by 19.3% is achieved.

* IEEE International Conference on Multimedia and Expo (ICME), Shanghai, China, July 2019  
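
A relative reduction of the word error rate, as reported in this abstract, is measured against the baseline rate rather than in absolute percentage points. The sketch below shows that arithmetic only; the function name `relative_reduction` and the baseline and adapted values in the usage example are hypothetical and are not results from the paper.

```python
def relative_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Fraction by which the adapted system lowers the baseline error rate."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical numbers: 30% WER before adaptation, 24% after.
    # That is 6 points absolute, but a 20% relative reduction.
    print(f"{relative_reduction(0.30, 0.24):.1%}")
```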