"speech": models, code, and papers

Noisy-target Training: A Training Strategy for DNN-based Speech Enhancement without Clean Speech

Jan 21, 2021
Takuya Fujimura, Yuma Koizumi, Kohei Yatabe, Ryoichi Miyazaki

Improving Transformer-based Conversational ASR by Inter-Sentential Attention Mechanism

Jul 02, 2022
Kun Wei, Pengcheng Guo, Ning Jiang

UWSpeech: Speech to Speech Translation for Unwritten Languages

Jun 14, 2020
Chen Zhang, Xu Tan, Yi Ren, Tao Qin, Kejun Zhang, Tie-Yan Liu

AECMOS: A speech quality assessment metric for echo impairment

Oct 08, 2021
Marju Purin, Sten Sootla, Mateja Sponza, Ando Saabas, Ross Cutler

BirdSoundsDenoising: Deep Visual Audio Denoising for Bird Sounds

Oct 18, 2022
Youshan Zhang, Jialu Li

Is Attention always needed? A Case Study on Language Identification from Speech

Oct 05, 2021
Atanu Mandal, Santanu Pal, Indranil Dutta, Mahidas Bhattacharya, Sudip Kumar Naskar

Simultaneous Speech-to-Speech Translation System with Neural Incremental ASR, MT, and TTS

Nov 11, 2020
Katsuhito Sudoh, Takatomo Kano, Sashi Novitasari, Tomoya Yanagita, Sakriani Sakti, Satoshi Nakamura

Enhancing audio quality for expressive Neural Text-to-Speech

Aug 13, 2021
Abdelhamid Ezzerg, Adam Gabrys, Bartosz Putrycz, Daniel Korzekwa, Daniel Saez-Trigueros, David McHardy, Kamil Pokora, Jakub Lachowicz, Jaime Lorenzo-Trueba, Viacheslav Klimkov

Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information

Oct 19, 2021
Baolin Zheng, Peipei Jiang, Qian Wang, Qi Li, Chao Shen, Cong Wang, Yunjie Ge, Qingyang Teng, Shenyi Zhang

AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data

Apr 20, 2021
Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu
