"speech": models, code, and papers

Speaker Change Detection for Transformer Transducer ASR

Feb 16, 2023
Jian Wu, Zhuo Chen, Min Hu, Xiong Xiao, Jinyu Li

Via arXiv

Cross-domain Neural Pitch and Periodicity Estimation

Jan 28, 2023
Max Morrison, Caedon Hsieh, Nathan Pruyne, Bryan Pardo

Via arXiv

Language-specific Characteristic Assistance for Code-switching Speech Recognition

Jul 05, 2022
Tongtong Song, Qiang Xu, Meng Ge, Longbiao Wang, Hao Shi, Yongjie Lv, Yuqin Lin, Jianwu Dang

Via arXiv

Deep Learning Approach for Classifying the Aggressive Comments on Social Media: Machine Translated Data Vs Real Life Data

Mar 13, 2023
Mst Shapna Akter, Hossain Shahriar, Nova Ahmed, Alfredo Cuzzocrea

Via arXiv

Non-parallel Accent Conversion using Pseudo Siamese Disentanglement Network

Dec 12, 2022
Dongya Jia, Qiao Tian, Jiaxin Li, Yuanzhe Chen, Kainan Peng, Mingbo Ma, Yuping Wang, Yuxuan Wang

Via arXiv

Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes

Jun 23, 2022
Danilo de Oliveira, Tal Peer, Timo Gerkmann

Via arXiv

Modelling word learning and recognition using visually grounded speech

Mar 14, 2022
Danny Merkx, Sebastiaan Scholten, Stefan L. Frank, Mirjam Ernestus, Odette Scharenborg

Via arXiv

AmbiSep: Ambisonic-to-Ambisonic Reverberant Speech Separation Using Transformer Networks

Jun 13, 2022
Adrian Herzog, Srikanth Raj Chetupalli, Emanuël A. P. Habets

Via arXiv

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification

Sep 13, 2022
Chao Zhang, Bo Li, Tara Sainath, Trevor Strohman, Sepand Mavandadi, Shuo-yiin Chang, Parisa Haghani

Via arXiv

Multilingual Content Moderation: A Case Study on Reddit

Feb 19, 2023
Meng Ye, Karan Sikka, Katherine Atwell, Sabit Hassan, Ajay Divakaran, Malihe Alikhani

Via arXiv