"speech": models, code, and papers

Leveraging Synthetic Targets for Machine Translation

May 07, 2023
Sarthak Mittal, Oleksii Hrinchuk, Oleksii Kuchaiev

Via arXiv

Improving Speech Prosody of Audiobook Text-to-Speech Synthesis with Acoustic and Textual Contexts

Nov 04, 2022
Detai Xin, Sharath Adavanne, Federico Ang, Ashish Kulkarni, Shinnosuke Takamichi, Hiroshi Saruwatari

Via arXiv

McNet: Fuse Multiple Cues for Multichannel Speech Enhancement

Nov 16, 2022
Yujie Yang, Changsheng Quan, Xiaofei Li

Via arXiv

ArzEn-ST: A Three-way Speech Translation Corpus for Code-Switched Egyptian Arabic-English

Nov 22, 2022
Injy Hamed, Nizar Habash, Slim Abdennadher, Ngoc Thang Vu

Via arXiv

A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers

Apr 16, 2023
Juan Zuluaga-Gomez, Amrutha Prasad, Iuliia Nigmatulina, Petr Motlicek, Matthias Kleinert

Via arXiv

Continual Learning for On-Device Speech Recognition using Disentangled Conformers

Dec 02, 2022
Anuj Diwan, Ching-Feng Yeh, Wei-Ning Hsu, Paden Tomasello, Eunsol Choi, David Harwath, Abdelrahman Mohamed

Via arXiv

Joint unsupervised and supervised learning for context-aware language identification

Mar 29, 2023
Jinseok Park, Hyung Yong Kim, Jihwan Park, Byeong-Yeol Kim, Shukjae Choi, Yunkyu Lim

Via arXiv

How to Solve Few-Shot Abusive Content Detection Using the Data We Actually Have

May 23, 2023
Viktor Hangya, Alexander Fraser

Via arXiv

Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation

Dec 16, 2022
Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu

Via arXiv

CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training

May 18, 2023
Zhenhui Ye, Rongjie Huang, Yi Ren, Ziyue Jiang, Jinglin Liu, Jinzheng He, Xiang Yin, Zhou Zhao

Via arXiv