"speech recognition": models, code, and papers

A Universally-Deployable ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation

Sep 14, 2022
Tom O'Malley, Arun Narayanan, Quan Wang

Figures 1–4

On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition

Nov 01, 2018
Zhiping Zeng, Yerbolat Khassanov, Van Tung Pham, Haihua Xu, Eng Siong Chng, Haizhou Li

Figures 1–4

Speaker Recognition in the Wild

May 05, 2022
Neeraj Chhimwal, Anirudh Gupta, Rishabh Gaur, Harveen Singh Chadha, Priyanshi Shah, Ankur Dhuriya, Vivek Raghavan

Figures 1–3

Transformer in action: a comparative study of transformer-based acoustic models for large scale speech recognition applications

Oct 29, 2020
Yongqiang Wang, Yangyang Shi, Frank Zhang, Chunyang Wu, Julian Chan, Ching-Feng Yeh, Alex Xiao

Figures 1–4

Improving Voice Separation by Incorporating End-to-end Speech Recognition

Nov 29, 2019
Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Parthasaarathy Sudarsanam, Sriram Ganapathy, Yuki Mitsufuji

Figures 1–4

Densely Connected Convolutional Networks for Speech Recognition

Aug 10, 2018
Chia Yu Li, Ngoc Thang Vu

Figures 1–4

A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction

Mar 31, 2022
Zexu Pan, Meng Ge, Haizhou Li

Figures 1–4

CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition

May 27, 2019
Linhao Dong, Bo Xu

Figures 1–4

End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning

Aug 13, 2019
Pavel Denisov, Ngoc Thang Vu

Figures 1–4

Speech Emotion Recognition with Global-Aware Fusion on Multi-scale Feature Representation

Apr 12, 2022
Wenjing Zhu, Xiang Li

Figures 1–4