"speech": models, code, and papers

SparseVSR: Lightweight and Noise Robust Visual Speech Recognition

Jul 10, 2023
Adriana Fernandez-Lopez, Honglie Chen, Pingchuan Ma, Alexandros Haliassos, Stavros Petridis, Maja Pantic

Via arXiv

Improved Cross-Lingual Transfer Learning For Automatic Speech Translation

Jun 01, 2023
Sameer Khurana, Nauman Dawalatabad, Antoine Laurent, Luis Vicente, Pablo Gimeno, Victoria Mingote, James Glass

Via arXiv

HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods

Sep 15, 2023
Hyun-seo Shin, Jungwoo Heo, Ju-ho Kim, Chan-yeong Lim, Wonbin Kim, Ha-Jin Yu

Via arXiv

HTEC: Human Transcription Error Correction

Sep 18, 2023
Hanbo Sun, Jian Gao, Xiaomin Wu, Anjie Fang, Cheng Cao, Zheng Du

Via arXiv

SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts

Jun 03, 2023
Haibin Wu, Kai-Wei Chang, Yuan-Kuei Wu, Hung-yi Lee

Via arXiv

MiniSUPERB: Lightweight Benchmark for Self-supervised Speech Models

May 30, 2023
Yu-Hsiang Wang, Huang-Yu Chen, Kai-Wei Chang, Winston Hsu, Hung-yi Lee

Via arXiv

Factorised Speaker-environment Adaptive Training of Conformer Speech Recognition Systems

Jun 26, 2023
Jiajun Deng, Guinan Li, Xurong Xie, Zengrui Jin, Mingyu Cui, Tianzi Wang, Shujie Hu, Mengzhe Geng, Xunying Liu

Via arXiv

RescueSpeech: A German Corpus for Speech Recognition in Search and Rescue Domain

Jun 06, 2023
Sangeet Sagar, Mirco Ravanelli, Bernd Kiefer, Ivana Kruijff Korbayova, Josef van Genabith

Via arXiv

Personalization for BERT-based Discriminative Speech Recognition Rescoring

Jul 13, 2023
Jari Kolehmainen, Yile Gu, Aditya Gourav, Prashanth Gurunath Shivakumar, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko

Via arXiv

Phase perturbation improves channel robustness for speech spoofing countermeasures

Jun 06, 2023
Yongyi Zang, You Zhang, Zhiyao Duan

Via arXiv