"speech": models, code, and papers

Weight, Block or Unit? Exploring Sparsity Tradeoffs for Speech Enhancement on Tiny Neural Accelerators

Nov 09, 2021
Marko Stamenovic, Nils L. Westhausen, Li-Chia Yang, Carl Jensen, Alex Pawlicki

Towards Deep Learning-aided Wireless Channel Estimation and Channel State Information Feedback for 6G

Sep 05, 2022
Wonjun Kim, Yongjun Ahn, Jinhong Kim, Byonghyo Shim

Spatio-Temporal Representation Learning Enhanced Source Cell-phone Recognition from Speech Recordings

Aug 25, 2022
Chunyan Zeng, Shixiong Feng, Zhifeng Wang, Xiangkui Wan, Yunfan Chen, Nan Zhao

Input Length Matters: An Empirical Study Of RNN-T And MWER Training For Long-form Telephony Speech Recognition

Oct 08, 2021
Zhiyun Lu, Yanwei Pan, Thibault Doutre, Liangliang Cao, Rohit Prabhavalkar, Chao Zhang, Trevor Strohman

ASR Error Detection via Audio-Transcript entailment

Jul 22, 2022
Nimshi Venkat Meripo, Sandeep Konam

A study on the efficacy of model pre-training in developing neural text-to-speech system

Oct 08, 2021
Guangyan Zhang, Yichong Leng, Daxin Tan, Ying Qin, Kaitao Song, Xu Tan, Sheng Zhao, Tan Lee

A Neural Text-to-Speech Model Utilizing Broadcast Data Mixed with Background Music

Mar 04, 2021
Hanbin Bae, Jae-Sung Bae, Young-Sun Joo, Young-Ik Kim, Hoon-Young Cho

Exploiting ultrasound tongue imaging for the automatic detection of speech articulation errors

Feb 27, 2021
Manuel Sam Ribeiro, Joanne Cleland, Aciel Eshky, Korin Richmond, Steve Renals

SRIB-LEAP submission to Far-field Multi-Channel Speech Enhancement Challenge for Video Conferencing

Jun 24, 2021
R G Prithvi Raj, Rohit Kumar, M K Jayesh, Anurenjan Purushothaman, Sriram Ganapathy, M A Basha Shaik

Investigation of Densely Connected Convolutional Networks with Domain Adversarial Learning for Noise Robust Speech Recognition

Dec 19, 2021
Chia Yu Li, Ngoc Thang Vu
