"speech recognition": models, code, and papers

Sumformer: A Linear-Complexity Alternative to Self-Attention for Speech Recognition

Jul 12, 2023
Titouan Parcollet, Rogier van Dalen, Shucong Zhang, Sourav Bhattacharya

Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder

Aug 14, 2023
Yusheng Dai, Hang Chen, Jun Du, Xiaofei Ding, Ning Ding, Feijun Jiang, Chin-Hui Lee

Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR

Nov 30, 2023
Jintao Jiang, Yingbo Gao, Zoltan Tuske

End-to-End Speech-to-Text Translation: A Survey

Dec 02, 2023
Nivedita Sethiya, Chandresh Kumar Maurya

Utilizing Speech Emotion Recognition and Recommender Systems for Negative Emotion Handling in Therapy Chatbots

Nov 18, 2023
Farideh Majidi, Marzieh Bahrami

HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models

Sep 27, 2023
Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Pin-Yu Chen, Eng Siong Chng

A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors

Nov 27, 2023
Shuyue Stella Li, Beining Xu, Xiangyu Zhang, Hexin Liu, Wenhan Chao, Leibny Paola Garcia

Phonetic-aware speaker embedding for far-field speaker verification

Nov 27, 2023
Zezhong Jin, Youzhi Tu, Man-Wai Mak

End-to-end Joint Rich and Normalized ASR with a limited amount of rich training data

Nov 29, 2023
Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent

AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model

Aug 15, 2023
Jeong Hun Yeo, Minsu Kim, Jeongsoo Choi, Dae Hoe Kim, Yong Man Ro
