"speech": models, code, and papers

ABC-KD: Attention-Based-Compression Knowledge Distillation for Deep Learning-Based Noise Suppression

May 26, 2023
Yixin Wan, Yuan Zhou, Xiulian Peng, Kai-Wei Chang, Yan Lu

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

Mar 29, 2023
Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, Haizhou Li

Quran Recitation Recognition using End-to-End Deep Learning

May 10, 2023
Ahmad Al Harere, Khloud Al Jallad

The Ability of Self-Supervised Speech Models for Audio Representations

Sep 28, 2022
Tung-Yu Wu, Chen-An Li, Tzu-Han Lin, Tsu-Yuan Hsu, Hung-Yi Lee

Risk of re-identification for shared clinical speech recordings

Oct 18, 2022
Daniela A. Wiepert, Bradley A. Malin, Joseph R. Duffy, Rene L. Utianski, John L. Stricker, David T. Jones, Hugo Botha

Improving Speech Enhancement through Fine-Grained Speech Characteristics

Jul 01, 2022
Muqiao Yang, Joseph Konan, David Bick, Anurag Kumar, Shinji Watanabe, Bhiksha Raj

Fast and efficient speech enhancement with variational autoencoders

Nov 02, 2022
Mostafa Sadeghi, Romain Serizel

SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data

Sep 30, 2022
Ziqiang Zhang, Sanyuan Chen, Long Zhou, Yu Wu, Shuo Ren, Shujie Liu, Zhuoyuan Yao, Xun Gong, Lirong Dai, Jinyu Li, Furu Wei

Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding

May 23, 2023
Tian-Hao Zhang, Hai-Bo Qin, Zhi-Hao Lai, Song-Lu Chen, Qi Liu, Feng Chen, Xinyuan Qian, Xu-Cheng Yin

Improving Hypernasality Estimation with Automatic Speech Recognition in Cleft Palate Speech

Aug 10, 2022
Kaitao Song, Teng Wan, Bixia Wang, Huiqiang Jiang, Luna Qiu, Jiahang Xu, Liping Jiang, Qun Lou, Yuqing Yang, Dongsheng Li, Xudong Wang, Lili Qiu
