"speech": models, code, and papers

Interpreting intermediate convolutional layers of CNNs trained on raw speech

Apr 21, 2021
Gašper Beguš, Alan Zhou

Via arXiv

UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control

Jun 21, 2021
Minsu Kang, Sungjae Kim, Injung Kim

Via arXiv

Robustifying automatic speech recognition by extracting slowly varying features

Dec 14, 2021
Matias Pizarro, Dorothea Kolossa, Asja Fischer

Via arXiv

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling

Oct 19, 2022
Jun Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong

Via arXiv

Mathematical Vocoder Algorithm: Modified Spectral Inversion for Efficient Neural Speech Synthesis

Jun 06, 2021
Hyun Gon Ryu, Jeong-Hoon Kim, Simon See

Via arXiv

AngryBERT: Joint Learning Target and Emotion for Hate Speech Detection

Mar 14, 2021
Md Rabiul Awal, Rui Cao, Roy Ka-Wei Lee, Sandra Mitrovic

Via arXiv

Dialectal Speech Recognition and Translation of Swiss German Speech to Standard German Text: Microsoft's Submission to SwissText 2021

Jun 15, 2021
Yuriy Arabskyy, Aashish Agarwal, Subhadeep Dey, Oscar Koller

Via arXiv

Towards Energy-Efficient, Low-Latency and Accurate Spiking LSTMs

Oct 23, 2022
Gourav Datta, Haoqin Deng, Robert Aviles, Peter A. Beerel

Via arXiv

Towards Low-distortion Multi-channel Speech Enhancement: The ESPNet-SE Submission to The L3DAS22 Challenge

Feb 24, 2022
Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, Zhaoheng Ni, Zhong-Qiu Wang, Shinji Watanabe

Via arXiv

Subject Envelope based Multitype Reconstruction Algorithm of Speech Samples of Parkinson's Disease

Aug 23, 2021
Yongming Li, Chengyu Liu, Pin Wang, Hehua Zhang, Anhai Wei

Via arXiv