"speech": models, code, and papers

Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models

Feb 16, 2022
Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha, Ram D. Sriram

DeL-haTE: A Deep Learning Tunable Ensemble for Hate Speech Detection

Nov 03, 2020
Joshua Melton, Arunkumar Bagavathi, Siddharth Krishnan

Encrypted Speech Recognition using Deep Polynomial Networks

May 11, 2019
Shi-Xiong Zhang, Yifan Gong, Dong Yu

Improving End-to-End Speech-to-Intent Classification with Reptile

Aug 05, 2020
Yusheng Tian, Philip John Gorinski

Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search

Apr 13, 2021
Yukun Liu, Ta Li, Pengyuan Zhang, Yonghong Yan

TSNAT: Two-Step Non-Autoregressive Transformer Models for Speech Recognition

Apr 04, 2021
Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen, Xuefei Liu

ESNI: Domestic Robots Design for Elderly and Disabled People

Mar 30, 2022
Junchi Chu, Xueyun Tang

PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription

Sep 17, 2021
Chen Zhang, Jiaxing Yu, LuChin Chang, Xu Tan, Jiawei Chen, Tao Qin, Kejun Zhang

MSR-NV: Neural vocoder using multiple sampling rates

Sep 28, 2021
Kentaro Mitsui, Kei Sawada
