Heinrich Dinkel

AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction

Jun 25, 2023
Via arXiv

Understanding temporally weakly supervised training: A case study for keyword spotting

May 30, 2023
Via arXiv

Streaming Audio Transformers for Online Audio Tagging

May 29, 2023
Via arXiv

Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers

Mar 03, 2023
Via arXiv

An empirical study of weakly supervised audio tagging embeddings for general audio representations

Sep 30, 2022
Via arXiv

UniKW-AT: Unified Keyword Spotting and Audio Tagging

Sep 23, 2022
Via arXiv

Pseudo strong labels for large scale weakly supervised audio tagging

Apr 28, 2022
Via arXiv

Voice activity detection in the wild: A data-driven approach using teacher-student training

May 10, 2021
Via arXiv

Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events

Feb 23, 2021
Via arXiv

Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning

Feb 23, 2021
Via arXiv