"speech recognition": models, code, and papers

Punctuation Restoration Improves Structure Understanding without Supervision

Feb 21, 2024
Junghyun Min, Minho Lee, Woochul Lee, Yeonsoo Lee

Automated speech audiometry: Can it work using open-source pre-trained Kaldi-NL automatic speech recognition?

Dec 19, 2023
Gloria Araiza-Illan, Luke Meyer, Khiet P. Truong, Deniz Baskent

TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion

Jan 25, 2024
Samuel Pegg, Kai Li, Xiaolin Hu

Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks

Jan 18, 2024
Yichao Du, Zhirui Zhang, Linan Yue, Xu Huang, Yuqing Zhang, Tong Xu, Linli Xu, Enhong Chen

An Integration of Pre-Trained Speech and Language Models for End-to-End Speech Recognition

Dec 06, 2023
Yukiya Hono, Koh Mitsuda, Tianyu Zhao, Kentaro Mitsui, Toshiaki Wakatsuki, Kei Sawada

AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition

Sep 29, 2023
Andrew Rouditchenko, Ronan Collobert, Tatiana Likhomanenko

On the compression of shallow non-causal ASR models using knowledge distillation and tied-and-reduced decoder for low-latency on-device speech recognition

Dec 15, 2023
Nagaraj Adiga, Jinhwan Park, Chintigari Shiva Kumar, Shatrughan Singh, Kyungmin Lee, Chanwoo Kim, Dhananjaya Gowda

Lightweight Protection for Privacy in Offloaded Speech Understanding

Jan 22, 2024
Dongqi Cai, Shangguang Wang, Zeling Zhang, Felix Xiaozhu Lin, Mengwei Xu

How does end-to-end speech recognition training impact speech enhancement artifacts?

Nov 20, 2023
Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri

Instruction-Following Speech Recognition

Sep 18, 2023
Cheng-I Jeff Lai, Zhiyun Lu, Liangliang Cao, Ruoming Pang
