"speech": models, code, and papers

Auditory-Based Data Augmentation for End-to-End Automatic Speech Recognition

Apr 08, 2022
Zehai Tu, Jack Deadman, Ning Ma, Jon Barker

Via arXiv

Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing

May 14, 2022
Heli Qi, Sashi Novitasari, Sakriani Sakti, Satoshi Nakamura

Via arXiv

Fast Blind Audio Copy-Move Detection and Localization Using Local Feature Tensors in Noise

Feb 15, 2023
Dong Yang, Mingle Liu, Muyong Cao

Via arXiv

Improving Cross-lingual Speech Synthesis with Triplet Training Scheme

Feb 22, 2022
Jianhao Ye, Hongbin Zhou, Zhiba Su, Wendi He, Kaimeng Ren, Lin Li, Heng Lu

Via arXiv

Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis

Apr 06, 2022
Shun Lei, Yixuan Zhou, Liyang Chen, Jiankun Hu, Zhiyong Wu, Shiyin Kang, Helen Meng

Via arXiv

Applying Automated Machine Translation to Educational Video Courses

Jan 09, 2023
Linden Wang

Via arXiv

VLSP2022 EVJVQA Challenge: Multilingual Visual Question Answering

Feb 24, 2023
Ngan Luu-Thuy Nguyen, Nghia Hieu Nguyen, Duong T. D Vo, Khanh Quoc Tran, Kiet Van Nguyen

Via arXiv

GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech Synthesis

May 15, 2022
Rongjie Huang, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao

Via arXiv

A Survey of research in Deep Learning for Robotics for Undergraduate research interns

Jan 23, 2023
Narayanan PP, Palacode Narayana Iyer Anantharaman

Via arXiv

Curriculum optimization for low-resource speech recognition

Feb 17, 2022
Anastasia Kuznetsova, Anurag Kumar, Jennifer Drexler Fox, Francis Tyers

Via arXiv