"speech recognition": models, code, and papers

Efficient domain adaptation of language models in ASR systems using Prompt-tuning

Oct 13, 2021
Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff

Speech Summarization using Restricted Self-Attention

Oct 12, 2021
Roshan Sharma, Shruti Palaskar, Alan W Black, Florian Metze

Spatial Diffuseness Features for DNN-Based Speech Recognition in Noisy and Reverberant Environments

Feb 16, 2015
Andreas Schwarz, Christian Huemmer, Roland Maas, Walter Kellermann

Towards High-fidelity Singing Voice Conversion with Acoustic Reference and Contrastive Predictive Coding

Oct 10, 2021
Chao Wang, Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Yibiao Yu, Zejun Ma

Deep Speech: Scaling up end-to-end speech recognition

Dec 19, 2014
Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, Andrew Y. Ng

Speech Emotion Recognition Using Deep Sparse Auto-Encoder Extreme Learning Machine with a New Weighting Scheme and Spectro-Temporal Features Along with Classical Feature Selection and A New Quantum-Inspired Dimension Reduction Method

Nov 13, 2021
Fatemeh Daneshfar, Seyed Jahanshah Kabudian

iCub Being Social: Exploiting Social Cues for Interactive Object Detection Learning

Jul 27, 2022
Maria Lombardi, Elisa Maiettini, Vadim Tikhanoff, Lorenzo Natale

Revealing and Protecting Labels in Distributed Training

Oct 31, 2021
Trung Dang, Om Thakkar, Swaroop Ramaswamy, Rajiv Mathews, Peter Chin, Françoise Beaufays

Deep Learning-Aided 6G Wireless Networks: A Comprehensive Survey of Revolutionary PHY Architectures

Jan 11, 2022
Burak Ozpoyraz, A. Tugberk Dogukan, Yarkin Gevez, Ufuk Altun, Ertugrul Basar

Attention-based Region of Interest (ROI) Detection for Speech Emotion Recognition

Mar 03, 2022
Jay Desai, Houwei Cao, Ravi Shah
