"speech": models, code, and papers

Device Directedness with Contextual Cues for Spoken Dialog Systems

Nov 23, 2022
Dhanush Bekal, Sundararajan Srinivasan, Sravan Bodapati, Srikanth Ronanki, Katrin Kirchhoff

Boosting Self-Supervised Embeddings for Speech Enhancement

Apr 07, 2022
Kuo-Hsuan Hung, Szu-wei Fu, Huan-Hsin Tseng, Hsin-Tien Chiang, Yu Tsao, Chii-Wann Lin

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding

Jul 06, 2022
Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe

A light-weight full-band speech enhancement model

Jul 03, 2022
Qinwen Hu, Zhongshu Hou, Xiaohuai Le, Jing Lu

TRILLsson: Distilled Universal Paralinguistic Speech Representations

Mar 20, 2022
Joel Shor, Subhashini Venugopalan

Contrastive Siamese Network for Semi-supervised Speech Recognition

May 27, 2022
Soheil Khorram, Jaeyoung Kim, Anshuman Tripathi, Han Lu, Qian Zhang, Hasim Sak

Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings

Oct 23, 2022
Jian Zhu, Zuoyu Tian, Yadong Liu, Cong Zhang, Chia-wen Lo

The Effectiveness of Time Stretching for Enhancing Dysarthric Speech for Improved Dysarthric Speech Recognition

Jan 13, 2022
Luke Prananta, Bence Mark Halpern, Siyuan Feng, Odette Scharenborg

Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech

Jun 15, 2022
Jan Lehečka, Jan Švec, Aleš Pražák, Josef V. Psutka

Match to Win: Analysing Sequences Lengths for Efficient Self-supervised Learning in Speech and Audio

Oct 03, 2022
Yan Gao, Javier Fernandez-Marques, Titouan Parcollet, Pedro P. B. de Gusmao, Nicholas D. Lane
