"speech recognition": models, code, and papers

Closing the Gap Between Time-Domain Multi-Channel Speech Enhancement on Real and Simulation Conditions

Oct 27, 2021
Wangyou Zhang, Jing Shi, Chenda Li, Shinji Watanabe, Yanmin Qian

Learning linearly separable features for speech recognition using convolutional neural networks

Apr 16, 2015
Dimitri Palaz, Mathew Magimai Doss, Ronan Collobert

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

Apr 20, 2022
Shaojin Ding, Weiran Wang, Ding Zhao, Tara N. Sainath, Yanzhang He, Robert David, Rami Botros, Xin Wang, Rina Panigrahy, Qiao Liang, Dongseong Hwang, Ian McGraw, Rohit Prabhavalkar, Trevor Strohman

CMGAN: Conformer-based Metric GAN for Speech Enhancement

Mar 28, 2022
Ruizhe Cao, Sherif Abdulatif, Bin Yang

Analyzing Phonetic and Graphemic Representations in End-to-End Automatic Speech Recognition

Jul 09, 2019
Yonatan Belinkov, Ahmed Ali, James Glass

Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models

Feb 26, 2022
Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, N. Tomashenko

Distilling Knowledge Using Parallel Data for Far-field Speech Recognition

Feb 20, 2018
Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Bin Liu

SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning

Oct 16, 2022
Tzu-hsun Feng, Annie Dong, Ching-Feng Yeh, Shu-wen Yang, Tzu-Quan Lin, Jiatong Shi, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdelrahman Mohamed, Shang-Wen Li, Hung-yi Lee

CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR

Mar 31, 2022
Keyu An, Huahuan Zheng, Zhijian Ou, Hongyu Xiang, Ke Ding, Guanglu Wan

Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition

Jan 06, 2020
Zhong Meng, Jinyu Li, Yashesh Gaur, Yifan Gong
