"speech": models, code, and papers

Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames

Nov 02, 2022
Chengdong Liang, Xiao-Lei Zhang, BinBin Zhang, Di Wu, Shengqiang Li, Xingchen Song, Zhendong Peng, Fuping Pan

Detection of AI Synthesized Hindi Speech

Mar 07, 2022
Karan Bhatia, Ansh Agrawal, Priyanka Singh, Arun Kumar Singh

FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning

Jul 01, 2022
Yeonghyeon Lee, Kangwook Jang, Jahyun Goo, Youngmoon Jung, Hoirin Kim

Extending DNN-based Multiplicative Masking to Deep Subband Filtering for Improved Dereverberation

Mar 01, 2023
Jean-Marie Lemercier, Julian Tobergte, Timo Gerkmann

Controlling High-Dimensional Data With Sparse Input

Mar 14, 2023
Dan Andrei Iliescu, Devang Savita Ram Mohan, Tian Huey Teh, Zack Hodari

GLD-Net: Improving Monaural Speech Enhancement by Learning Global and Local Dependency Features with GLD Block

Jun 30, 2022
Xinmeng Xu, Yang Wang, Jie Jia, Binbin Chen, Jianjun Hao

Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition

Mar 30, 2022
Lester Phillip Violeta, Wen-Chin Huang, Tomoki Toda

Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers

Mar 03, 2023
Heinrich Dinkel, Yongqing Wang, Zhiyong Yan, Junbo Zhang, Yujun Wang

Prosodic features improve sentence segmentation and parsing

Feb 23, 2023
Elizabeth Nielsen, Sharon Goldwater, Mark Steedman

Automatic Heteronym Resolution Pipeline Using RAD-TTS Aligners

Feb 28, 2023
Jocelyn Huang, Evelina Bakhturina, Oktai Tatanov
