"speech": models, code, and papers

Array Configuration-Agnostic Personalized Speech Enhancement using Long-Short-Term Spatial Coherence

Nov 16, 2022
Yicheng Hsu, Yonghan Lee, Mingsian R. Bai

Improving Speech Recognition on Noisy Speech via Speech Enhancement with Multi-Discriminators CycleGAN

Dec 12, 2021
Chia-Yu Li, Ngoc Thang Vu

Direct simultaneous speech to speech translation

Oct 15, 2021
Xutai Ma, Hongyu Gong, Danni Liu, Ann Lee, Yun Tang, Peng-Jen Chen, Wei-Ning Hsu, Kenneth Heafield, Philipp Koehn, Juan Pino

FaceChat: An Emotion-Aware Face-to-face Dialogue Framework

Mar 08, 2023
Deema Alnuhait, Qingyang Wu, Zhou Yu

Guided-TTS: Text-to-Speech with Untranscribed Speech

Nov 23, 2021
Heeseung Kim, Sungwon Kim, Sungroh Yoon

IRFL: Image Recognition of Figurative Language

Mar 27, 2023
Ron Yosef, Yonatan Bitton, Dafna Shahaf

AI-Based Automated Speech Therapy Tools for persons with Speech Sound Disorders: A Systematic Literature Review

Apr 21, 2022
Chinmoy Deka, Abhishek Shrivastava, Saurabh Nautiyal, Praveen Chauhan

Textless Speech-to-Speech Translation on Real Data

Dec 15, 2021
Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Juan Pino, Jiatao Gu, Wei-Ning Hsu

PATCorrect: Non-autoregressive Phoneme-augmented Transformer for ASR Error Correction

Feb 10, 2023
Ziji Zhang, Zhehui Wang, Rajesh Kamma, Sharanya Eswaran, Narayanan Sadagopan

Simultaneous Speech Extraction for Multiple Target Speakers under the Meeting Scenarios (V1)

Jun 17, 2022
Bang Zeng, Weiqing Wang, Yuanyuan Bao, Ming Li
