"speech": models, code, and papers

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling

Oct 19, 2022
Jun Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong

Language Independent Speech Emotion and Non-invasive Early Detection of Neurocognitive Disorder

Jun 03, 2021
Susmita Bhaduri, Anirban Bhaduri, Rajib Sarkar

The NTNU System for Formosa Speech Recognition Challenge 2020

Apr 09, 2021
Fu-An Chao, Tien-Hong Lo, Shi-Yan Weng, Shih-Hsuan Chiu, Yao-Ting Sung, Berlin Chen

Silent versus modal multi-speaker speech recognition from ultrasound and video

Feb 27, 2021
Manuel Sam Ribeiro, Aciel Eshky, Korin Richmond, Steve Renals

PINEAPPLE: Personifying INanimate Entities by Acquiring Parallel Personification data for Learning Enhanced generation

Sep 16, 2022
Sedrick Scott Keh, Kevin Lu, Varun Gangal, Steven Y. Feng, Harsh Jhamtani, Malihe Alikhani, Eduard Hovy

An Improved Model for Voicing Silent Speech

Jun 21, 2021
David Gaddy, Dan Klein

Defense against Adversarial Attacks on Hybrid Speech Recognition using Joint Adversarial Fine-tuning with Denoiser

Apr 08, 2022
Sonal Joshi, Saurabh Kataria, Yiwen Shao, Piotr Zelasko, Jesus Villalba, Sanjeev Khudanpur, Najim Dehak

Detection of Consonant Errors in Disordered Speech Based on Consonant-vowel Segment Embedding

Jun 16, 2021
Si-Ioi Ng, Cymie Wing-Yee Ng, Jingyu Li, Tan Lee

Improving CTC-based speech recognition via knowledge transferring from pre-trained language models

Feb 22, 2022
Keqi Deng, Songjun Cao, Yike Zhang, Long Ma, Gaofeng Cheng, Ji Xu, Pengyuan Zhang

Towards MOOCs for Lip Reading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale

Aug 21, 2022
Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar
