"speech": models, code, and papers

Cross Lingual Cross Corpus Speech Emotion Recognition

Mar 18, 2020
Shivali Goel, Homayoon Beigi

Via arXiv

FastSpeech 2: Fast and High-Quality End-to-End Text-to-Speech

Jun 08, 2020
Yi Ren, Chenxu Hu, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Via arXiv

From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection

Feb 24, 2021
Quang Huu Pham, Viet Anh Nguyen, Linh Bao Doan, Ngoc N. Tran, Ta Minh Thanh

Via arXiv

DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors

Oct 28, 2020
Chandan K A Reddy, Vishak Gopal, Ross Cutler

Via arXiv

USEV: Universal Speaker Extraction with Visual Cue

Sep 30, 2021
Zexu Pan, Meng Ge, Haizhou Li

Via arXiv

Incorporating Symbolic Sequential Modeling for Speech Enhancement

Apr 30, 2019
Chien-Feng Liao, Yu Tsao, Xugang Lu, Hisashi Kawai

Via arXiv

Enabling Real-time On-chip Audio Super Resolution for Bone Conduction Microphones

Dec 24, 2021
Yuang Li, Yuntao Wang, Xin Liu, Yuanchun Shi, Shao-fu Shih

Via arXiv

Acoustic echo suppression using a learning-based multi-frame minimum variance distortionless response filter

May 07, 2022
Yuefeng Tsai, Yicheng Hsu, Mingsian Bai

Via arXiv

Latency-Controlled Neural Architecture Search for Streaming Speech Recognition

May 08, 2021
Liqiang He, Shulin Feng, Dan Su, Dong Yu

Via arXiv

Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition

Dec 03, 2020
Genta Indra Winata, Guangsen Wang, Caiming Xiong, Steven Hoi

Via arXiv