"speech recognition": models, code, and papers

MASRI-HEADSET: A Maltese Corpus for Speech Recognition

Aug 13, 2020
Carlos Mena, Albert Gatt, Andrea DeMarco, Claudia Borg, Lonneke van der Plas, Amanda Muscat, Ian Padovani

Research on Modeling Units of Transformer Transducer for Mandarin Speech Recognition

Apr 26, 2020
Li Fu, Xiaoxiao Li, Libo Zi

Token-level Speaker Change Detection Using Speaker Difference and Speech Content via Continuous Integrate-and-fire

Nov 17, 2022
Zhiyun Fan, Zhenlin Liang, Linhao Dong, Yi Liu, Shiyu Zhou, Meng Cai, Jun Zhang, Zejun Ma, Bo Xu

Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition

Apr 19, 2021
Wei Zhou, Mohammad Zeineldeen, Zuoyun Zheng, Ralf Schlüter, Hermann Ney

AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages

Apr 04, 2023
Chris Chinenye Emezue, Sanchit Gandhi, Lewis Tunstall, Abubakar Abid, Josh Meyer, Quentin Lhoest, Pete Allen, Patrick Von Platen, Douwe Kiela, Yacine Jernite, Julien Chaumond, Merve Noyan, Omar Sanseviero

A Further Study of Unsupervised Pre-training for Transformer Based Speech Recognition

Jun 23, 2020
Dongwei Jiang, Wubo Li, Ruixiong Zhang, Miao Cao, Ne Luo, Yang Han, Wei Zou, Xiangang Li

Combining Frame-Synchronous and Label-Synchronous Systems for Speech Recognition

Jul 01, 2021
Qiujia Li, Chao Zhang, Philip C. Woodland

Defending against Adversarial Audio via Diffusion Model

Mar 02, 2023
Shutong Wu, Jiongxiao Wang, Wei Ping, Weili Nie, Chaowei Xiao

DiaCorrect: End-to-end error correction for speaker diarization

Oct 31, 2022
Jiangyu Han, Yuhang Cao, Heng Lu, Yanhua Long

Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding

Dec 16, 2019
Yuchen Liu, Jiajun Zhang, Hao Xiong, Long Zhou, Zhongjun He, Hua Wu, Haifeng Wang, Chengqing Zong
