"speech recognition": models, code, and papers

PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features

Dec 05, 2023
Tianshun Han, Shengnan Gui, Yiqing Huang, Baihui Li, Lijian Liu, Benjia Zhou, Ning Jiang, Quan Lu, Ruicong Zhi, Yanyan Liang, Du Zhang, Jun Wan

Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech Recognition

Sep 21, 2023
Chen Xu, Xiaoqian Liu, Erfeng He, Yuhao Zhang, Qianqian Dong, Tong Xiao, Jingbo Zhu, Dapeng Man, Wu Yang

FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning

Dec 07, 2023
Hossein Fereidooni, Alessandro Pegoraro, Phillip Rieger, Alexandra Dmitrienko, Ahmad-Reza Sadeghi

1-step Speech Processing and Understanding Using CTC Loss

Nov 08, 2023
Karan Singla, Shahab Jalalvand, Yeon-Jun Kim, Antonio Moreno Daniel, Srinivas Bangalore, Andrej Ljolje, Ben Stern

LanSER: Language-Model Supported Speech Emotion Recognition

Sep 07, 2023
Taesik Gong, Josh Belanich, Krishna Somandepalli, Arsha Nagrani, Brian Eoff, Brendan Jou

1SPU: 1-step Speech Processing Unit

Nov 10, 2023
Karan Singla, Shahab Jalalvand, Yeon-Jun Kim, Antonio Moreno Daniel, Srinivas Bangalore, Andrej Ljolje, Ben Stern

Training dynamic models using early exits for automatic speech recognition on resource-constrained devices

Sep 18, 2023
George August Wright, Umberto Cappellazzo, Salah Zaiem, Desh Raj, Lucas Ondel Yang, Daniele Falavigna, Alessio Brutti

End-to-End Speech-to-Text Translation: A Survey

Dec 02, 2023
Nivedita Sethiya, Chandresh Kumar Maurya

Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR

Nov 30, 2023
Jintao Jiang, Yingbo Gao, Zoltan Tuske

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System

Oct 18, 2023
Tae Jin Park, He Huang, Ante Jukic, Kunal Dhawan, Krishna C. Puvvada, Nithin Koluguri, Nikolay Karpov, Aleksandr Laptev, Jagadeesh Balam, Boris Ginsburg
