"speech": models, code, and papers

Use of Speech Impairment Severity for Dysarthric Speech Recognition

May 18, 2023
Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Jiajun Deng, Mingyu Cui, Guinan Li, Jianwei Yu, Xurong Xie, Xunying Liu

SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?

Jun 14, 2023
Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka, Yusuke Ijima, Taichi Asami, Marc Delcroix, Yukinori Honma

3D-Speaker: A Large-Scale Multi-Device, Multi-Distance, and Multi-Dialect Corpus for Speech Representation Disentanglement

Jun 28, 2023
Siqi Zheng, Luyao Cheng, Yafeng Chen, Hui Wang, Qian Chen

HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation

Jun 20, 2023
Cihan Xiao, Henry Li Xinyuan, Jinyi Yang, Dongji Gao, Matthew Wiesner, Kevin Duh, Sanjeev Khudanpur

Cross-modal Alignment with Optimal Transport for CTC-based ASR

Sep 24, 2023
Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai

KIT's Multilingual Speech Translation System for IWSLT 2023

Jun 08, 2023
Danni Liu, Thai Binh Nguyen, Sai Koneru, Enes Yavuz Ugan, Ngoc-Quan Pham, Tuan-Nam Nguyen, Tu Anh Dinh, Carlos Mullov, Alexander Waibel, Jan Niehues

Multi-Channel MOSRA: Mean Opinion Score and Room Acoustics Estimation Using Simulated Data and a Teacher Model

Sep 21, 2023
Jozef Coldenhoff, Andrew Harper, Paul Kendrick, Tijana Stojkovic, Milos Cernak

Multilingual context-based pronunciation learning for Text-to-Speech

Jul 31, 2023
Giulia Comini, Manuel Sam Ribeiro, Fan Yang, Heereen Shim, Jaime Lorenzo-Trueba

Wiki-En-ASR-Adapt: Large-scale synthetic dataset for English ASR Customization

Sep 29, 2023
Alexandra Antonova

Sparse Finetuning for Inference Acceleration of Large Language Models

Oct 10, 2023
Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, Dan Alistarh
