"speech": models, code, and papers

Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention

Feb 14, 2020
Yuma Koizumi, Kohei Yatabe, Marc Delcroix, Yoshiki Masuyama, Daiki Takeuchi

Via arXiv

Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition

Oct 06, 2021
Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong

Via arXiv

Knowledge Authoring with Factual English

Aug 05, 2022
Yuheng Wang, Giorgian Borca-Tasciuc, Nikhil Goel, Paul Fodor, Michael Kifer

Via arXiv

The Interspeech Zero Resource Speech Challenge 2021: Spoken language modelling

Apr 29, 2021
Ewan Dunbar, Mathieu Bernard, Nicolas Hamilakis, Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Eugene Kharitonov, Emmanuel Dupoux

Via arXiv

Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis

Nov 06, 2020
Ron J. Weiss, RJ Skerry-Ryan, Eric Battenberg, Soroosh Mariooryad, Diederik P. Kingma

Via arXiv

Towards Natural Bilingual and Code-Switched Speech Synthesis Based on Mix of Monolingual Recordings and Cross-Lingual Voice Conversion

Oct 16, 2020
Shengkui Zhao, Trung Hieu Nguyen, Hao Wang, Bin Ma

Via arXiv

Quadrupedal Robotic Guide Dog with Vocal Human-Robot Interaction

Nov 25, 2021
Kavan Mehrizi

Via arXiv

Hierarchical Attention Network for Evaluating Therapist Empathy in Counseling Session

Mar 31, 2022
Dehua Tao, Tan Lee, Harold Chui, Sarah Luk

Via arXiv

Closing the Gap between Single-User and Multi-User VoiceFilter-Lite

Feb 24, 2022
Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ian McGraw

Via arXiv

How Does Pre-trained Wav2Vec2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications

Mar 31, 2022
Juan Zuluaga-Gomez, Amrutha Prasad, Iuliia Nigmatulina, Saeed Sarfjoo, Petr Motlicek, Matthias Kleinert, Hartmut Helmke, Oliver Ohneiser, Qingran Zhan

Via arXiv