"speech recognition": models, code, and papers

Compressing 1D Time-Channel Separable Convolutions using Sparse Random Ternary Matrices

Apr 01, 2021
Gonçalo Mordido, Matthijs Van Keirsbilck, Alexander Keller

Knowledge Distillation for Neural Transducers from Large Self-Supervised Pre-trained Models

Oct 07, 2021
Xiaoyu Yang, Qiujia Li, Philip C. Woodland

End-to-End Spoken Language Understanding using RNN-Transducer ASR

Jul 08, 2021
Anirudh Raju, Gautam Tiwari, Milind Rao, Pranav Dheram, Bryan Anderson, Zhe Zhang, Bach Bui, Ariya Rastrow

SpeechFormer: A Hierarchical Efficient Framework Incorporating the Characteristics of Speech

Mar 10, 2022
Weidong Chen, Xiaofen Xing, Xiangmin Xu, Jianxin Pang, Lan Du

Bridging the Gap between Spatial and Spectral Domains: A Theoretical Framework for Graph Neural Networks

Jul 21, 2021
Zhiqian Chen, Fanglan Chen, Lei Zhang, Taoran Ji, Kaiqun Fu, Liang Zhao, Feng Chen, Lingfei Wu, Charu Aggarwal, Chang-Tien Lu

Hybrid Fusion Based Interpretable Multimodal Emotion Recognition with Insufficient Labelled Data

Aug 24, 2022
Puneet Kumar, Sarthak Malik, Balasubramanian Raman

ASR Rescoring and Confidence Estimation with ELECTRA

Oct 05, 2021
Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

Investigation of Speaker-adaptation methods in Transformer based ASR

Aug 07, 2020
Vishwas M. Shetty, Metilda Sagaya Mary N J, S. Umesh

LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech

Apr 23, 2021
Solene Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Esteve, Benjamin Lecouteux, Francois Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier

Sequence Model with Self-Adaptive Sliding Window for Efficient Spoken Document Segmentation

Jul 20, 2021
Qinglin Zhang, Qian Chen, Yali Li, Jiaqing Liu, Wen Wang
