"speech recognition": models, code, and papers

Learning Delays in Spiking Neural Networks using Dilated Convolutions with Learnable Spacings

Jun 30, 2023
Ilyass Hammouamri, Ismail Khalfaoui-Hassani, Timothée Masquelier

Masked Audio Text Encoders are Effective Multi-Modal Rescorers

May 24, 2023
Jinglun Cai, Monica Sunkara, Xilai Li, Anshu Bhatia, Xiao Pan, Sravan Bodapati

Adapting Multi-Lingual ASR Models for Handling Multiple Talkers

May 30, 2023
Chenda Li, Yao Qian, Zhuo Chen, Naoyuki Kanda, Dongmei Wang, Takuya Yoshioka, Yanmin Qian, Michael Zeng

Bridging the Granularity Gap for Acoustic Modeling

May 27, 2023
Chen Xu, Yuhao Zhang, Chengbo Jiao, Xiaoqian Liu, Chi Hu, Xin Zeng, Tong Xiao, Anxiang Ma, Huizhen Wang, JingBo Zhu

Multi-task learning of speech and speaker recognition

Feb 24, 2023
Nik Vaessen, David A. van Leeuwen

Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems

Nov 01, 2022
Shaan Bijwadia, Shuo-yiin Chang, Bo Li, Tara Sainath, Chao Zhang, Yanzhang He

Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis

May 09, 2022
Zhenzi Weng, Zhijin Qin, Xiaoming Tao, Chengkang Pan, Guangyi Liu, Geoffrey Ye Li

Self-supervised representations in speech-based depression detection

May 20, 2023
Wen Wu, Chao Zhang, Philip C. Woodland

End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation

Oct 19, 2022
Yoshiki Masuyama, Xuankai Chang, Samuele Cornell, Shinji Watanabe, Nobutaka Ono

Filter and evolve: progressive pseudo label refining for semi-supervised automatic speech recognition

Oct 28, 2022
Zezhong Jin, Dading Zhong, Xiao Song, Zhaoyi Liu, Naipeng Ye, Qingcheng Zeng
