"speech": models, code, and papers

Learning Transferable Spatiotemporal Representations from Natural Script Knowledge

Sep 30, 2022
Ziyun Zeng, Yuying Ge, Xihui Liu, Bin Chen, Ping Luo, Shu-Tao Xia, Yixiao Ge

Via arXiv

Exploring CTC Based End-to-End Techniques for Myanmar Speech Recognition

May 14, 2021
Khin Me Me Chit, Laet Laet Lin

Via arXiv

Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement

Jun 05, 2021
Sefik Emre Eskimez, Xiaofei Wang, Min Tang, Hemin Yang, Zirun Zhu, Zhuo Chen, Huaming Wang, Takuya Yoshioka

Via arXiv

ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale

Jul 19, 2022
Gopinath Chennupati, Milind Rao, Gurpreet Chadha, Aaron Eakin, Anirudh Raju, Gautam Tiwari, Anit Kumar Sahu, Ariya Rastrow, Jasha Droppo, Andy Oberlin, Buddha Nandanoor, Prahalad Venkataramanan, Zheng Wu, Pankaj Sitpure

Via arXiv

Acoustic-Linguistic Features for Modeling Neurological Task Score in Alzheimer's

Sep 13, 2022
Saurav K. Aryal, Howard Prioleau, Legand Burge

Via arXiv

Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation

May 16, 2020
Tao Tu, Yuan-Jui Chen, Alexander H. Liu, Hung-yi Lee

Via arXiv

The USTC-NELSLIP Systems for Simultaneous Speech Translation Task at IWSLT 2021

Jul 09, 2021
Dan Liu, Mengge Du, Xiaoxi Li, Yuchen Hu, Lirong Dai

Via arXiv

Improving Punctuation Restoration for Speech Transcripts via External Data

Oct 01, 2021
Xue-Yong Fu, Cheng Chen, Md Tahmid Rahman Laskar, Shashi Bhushan TN, Simon Corston-Oliver

Via arXiv

Bridging the prosody GAP: Genetic Algorithm with People to efficiently sample emotional prosody

May 10, 2022
Pol van Rijn, Harin Lee, Nori Jacoby

Via arXiv