"speech": models, code, and papers

Deep Multi-Frame MVDR Filtering for Binaural Noise Reduction

May 18, 2022
Marvin Tammen, Simon Doclo

FrAUG: A Frame Rate Based Data Augmentation Method for Depression Detection from Speech Signals

Feb 11, 2022
Vijay Ravi, Jinhan Wang, Jonathan Flint, Abeer Alwan

Adaptive Natural Language Generation for Task-oriented Dialogue via Reinforcement Learning

Sep 16, 2022
Atsumoto Ohashi, Ryuichiro Higashinaka

Effectiveness of Mining Audio and Text Pairs from Public Data for Improving ASR Systems for Low-Resource Languages

Aug 26, 2022
Kaushal Santosh Bhogale, Abhigyan Raman, Tahir Javed, Sumanth Doddapaneni, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

UserLibri: A Dataset for ASR Personalization Using Only Text

Jul 02, 2022
Theresa Breiner, Swaroop Ramaswamy, Ehsan Variani, Shefali Garg, Rajiv Mathews, Khe Chai Sim, Kilol Gupta, Mingqing Chen, Lara McConnaughey

WaDeNet: Wavelet Decomposition based CNN for Speech Processing

Nov 11, 2020
Prithvi Suresh, Abhijith Ragav

Speaker and Direction Inferred Dual-channel Speech Separation

Feb 08, 2021
Chenxing Li, Jiaming Xu, Nima Mesgarani, Bo Xu

Fast Development of ASR in African Languages using Self Supervised Speech Representation Learning

Mar 16, 2021
Jama Hussein Mohamud, Lloyd Acquaye Thompson, Aissatou Ndoye, Laurent Besacier

Speaker Separation Using Speaker Inventories and Estimated Speech

Oct 20, 2020
Peidong Wang, Zhuo Chen, DeLiang Wang, Jinyu Li, Yifan Gong

Visual Speech Enhancement Without A Real Visual Stream

Dec 20, 2020
Sindhu B Hegde, K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar
