Heinrich Dinkel

Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information

Jun 28, 2023

AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction

Jun 25, 2023

Understanding temporally weakly supervised training: A case study for keyword spotting

May 30, 2023

Streaming Audio Transformers for Online Audio Tagging

May 29, 2023

Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers

Mar 03, 2023

An empirical study of weakly supervised audio tagging embeddings for general audio representations

Sep 30, 2022

UniKW-AT: Unified Keyword Spotting and Audio Tagging

Sep 23, 2022

Pseudo strong labels for large scale weakly supervised audio tagging

Apr 28, 2022

Voice activity detection in the wild: A data-driven approach using teacher-student training

May 10, 2021

Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events

Feb 23, 2021