"speech": models, code, and papers

Location, Location: Enhancing the Evaluation of Text-to-Speech Synthesis Using the Rapid Prosody Transcription Paradigm

Jul 06, 2021
Elijah Gutierrez, Pilar Oplustil-Gallegos, Catherine Lai

Improving the Training Recipe for a Robust Conformer-based Hybrid Model

Jun 26, 2022
Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Ralf Schlüter, Hermann Ney

TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech

Jul 12, 2020
Andy T. Liu, Shang-Wen Li, Hung-yi Lee

More for Less: Non-Intrusive Speech Quality Assessment with Limited Annotations

Aug 19, 2021
Alessandro Ragano, Emmanouil Benetos, Andrew Hines

Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses

Feb 03, 2021
Shengkui Zhao, Trung Hieu Nguyen, Bin Ma

Multi-rate attention architecture for fast streamable Text-to-speech spectrum modeling

Apr 01, 2021
Qing He, Zhiping Xiu, Thilo Koehler, Jilong Wu

Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario

Jan 07, 2021
Chiang-Jen Peng, Yun-Ju Chan, Cheng Yu, Syu-Siang Wang, Yu Tsao, Tai-Shih Chi

Is "moby dick" a Whale or a Bird? Named Entities and Terminology in Speech Translation

Sep 15, 2021
Marco Gaido, Susana Rodríguez, Matteo Negri, Luisa Bentivogli, Marco Turchi

Bias-Aware Loss for Training Image and Speech Quality Prediction Models from Multiple Datasets

Apr 20, 2021
Gabriel Mittag, Saman Zadtootaghaj, Thilo Michael, Babak Naderi, Sebastian Möller

Speech Recognition using EEG signals recorded using dry electrodes

Aug 13, 2020
Gautam Krishna, Co Tran, Mason Carnahan, Morgan M Hagood, Ahmed H Tewfik
