"speech recognition": models, code, and papers

DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model

Jun 02, 2023
Haoyu Wang, Siyuan Wang, Wei-Qiang Zhang, Jinfeng Bai

Via arXiv

Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization

Jun 07, 2023
Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Takatomo Kano, Atsunori Ogawa, Marc Delcroix

Via arXiv

Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer

Jun 07, 2023
Lu Huang, Boyu Li, Jun Zhang, Lu Lu, Zejun Ma

Via arXiv

Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning

Jun 23, 2023
Zhongzhi Yu, Yang Zhang, Kaizhi Qian, Yonggan Fu, Yingyan Lin

Via arXiv

MAVD: The First Open Large-Scale Mandarin Audio-Visual Dataset with Depth Information

Jun 04, 2023
Jianrong Wang, Yuchen Huo, Li Liu, Tianyi Xu, Qi Li, Sen Li

Via arXiv

Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition

Oct 25, 2022
Xulong Zhang, Jianzong Wang, Ning Cheng, Mengyuan Zhao, Zhiyong Zhang, Jing Xiao

Via arXiv

Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework

Jul 04, 2023
Eliya Segev, Maya Alroy, Ronen Katsir, Noam Wies, Ayana Shenhav, Yael Ben-Oren, David Zar, Oren Tadmor, Jacob Bitterman, Amnon Shashua, Tal Rosenwein

Via arXiv

Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations

Apr 27, 2022
Dan Oneata, Horia Cucu

Via arXiv

Improving Hypernasality Estimation with Automatic Speech Recognition in Cleft Palate Speech

Aug 10, 2022
Kaitao Song, Teng Wan, Bixia Wang, Huiqiang Jiang, Luna Qiu, Jiahang Xu, Liping Jiang, Qun Lou, Yuqing Yang, Dongsheng Li, Xudong Wang, Lili Qiu

Via arXiv

Strategies for improving low resource speech to text translation relying on pre-trained ASR models

May 31, 2023
Santosh Kesiraju, Marek Sarvas, Tomas Pavlicek, Cecile Macaire, Alejandro Ciuba

Via arXiv