"speech": models, code, and papers

Speaker attribution in German parliamentary debates with QLoRA-adapted large language models

Sep 18, 2023
Tobias Bornheim, Niklas Grieger, Patrick Gustav Blaneck, Stephan Bialonski

Via arXiv

Improving the Gap in Visual Speech Recognition Between Normal and Silent Speech Based on Metric Learning

May 23, 2023
Sara Kashiwagi, Keitaro Tanaka, Qi Feng, Shigeo Morishima

Via arXiv

DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech

Jun 25, 2023
Sen Liu, Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu

Via arXiv

Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media

Jul 18, 2023
Liam Hebert, Gaurav Sahu, Nanda Kishore Sreenivas, Lukasz Golab, Robin Cohen

Via arXiv

Efficient Face Detection with Audio-Based Region Proposals

Sep 14, 2023
William Aris, François Grondin

Via arXiv

The Art of Embedding Fusion: Optimizing Hate Speech Detection

Jun 26, 2023
Mohammad Aflah Khan, Neemesh Yadav, Mohit Jain, Sanyam Goyal

Via arXiv

Towards Stealthy Backdoor Attacks against Speech Recognition via Elements of Sound

Jul 17, 2023
Hanbo Cai, Pengcheng Zhang, Hai Dong, Yan Xiao, Stefanos Koffas, Yiming Li

Via arXiv

Incorporating L2 Phonemes Using Articulatory Features for Robust Speech Recognition

Jun 05, 2023
Jisung Wang, Haram Lee, Myungwoo Oh

Via arXiv

Addressing Cold Start Problem for End-to-end Automatic Speech Scoring

Jun 25, 2023
Jungbae Park, Seungtaek Choi

Via arXiv

Enhancing Quantised End-to-End ASR Models via Personalisation

Sep 17, 2023
Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng

Via arXiv