"speech": models, code, and papers

Developing automatic verbatim transcripts for international multilingual meetings: an end-to-end solution

Sep 27, 2023
Akshat Dewan, Michal Ziemski, Henri Meylan, Lorenzo Concina, Bruno Pouliquen

Via arXiv

ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading

Jul 03, 2023
Yujia Xiao, Shaofei Zhang, Xi Wang, Xu Tan, Lei He, Sheng Zhao, Frank K. Soong, Tan Lee

Via arXiv

Remote Inference of Cognitive Scores in ALS Patients Using a Picture Description

Sep 13, 2023
Carla Agurto, Guillermo Cecchi, Bo Wen, Ernest Fraenkel, James Berry, Indu Navar, Raquel Norel

Via arXiv

Emo-DNA: Emotion Decoupling and Alignment Learning for Cross-Corpus Speech Emotion Recognition

Aug 04, 2023
Jiaxin Ye, Yujie Wei, Xin-Cheng Wen, Chenglong Ma, Zhizhong Huang, Kunhong Liu, Hongming Shan

Via arXiv

Improving speech translation by fusing speech and text

May 23, 2023
Wenbiao Yin, Zhicheng Liu, Chengqi Zhao, Tao Wang, Jian Tong, Rong Ye

Via arXiv

Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning

Sep 08, 2023
Saurabhchand Bhati, Jesús Villalba, Laureano Moro-Velazquez, Thomas Thebaud, Najim Dehak

Via arXiv

Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers

Jul 06, 2023
Yuan Gong, Sameer Khurana, Leonid Karlinsky, James Glass

Via arXiv

Prompt-to-OS (P2OS): Revolutionizing Operating Systems and Human-Computer Interaction with Integrated AI Generative Models

Oct 07, 2023
Gabriele Tolomei, Cesare Campagnano, Fabrizio Silvestri, Giovanni Trappolini

Via arXiv

Duplex Diffusion Models Improve Speech-to-Speech Translation

May 22, 2023
Xianchao Wu

Via arXiv

Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context

Sep 15, 2023
Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Yifan Yang, Liyong Guo, Long Lin, Daniel Povey

Via arXiv