"speech recognition": models, code, and papers

The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge

Feb 04, 2022
Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng

Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding

Feb 10, 2022
Peter Sullivan, Toshiko Shibano, Muhammad Abdul-Mageed

Speech Recognition Front End Without Information Loss

Mar 30, 2015
Matthew Ager, Zoran Cvetkovic, Peter Sollich

Zero-shot keyword spotting for visual speech recognition in-the-wild

Jul 26, 2018
Themos Stafylakis, Georgios Tzimiropoulos

Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech

Nov 24, 2020
Yiling Huang, Yutian Chen, Jason Pelecanos, Quan Wang

QSpeech: Low-Qubit Quantum Speech Application Toolkit

May 26, 2022
Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Chendong Zhao, Wei Tao, Jing Xiao

WERd: Using Social Text Spelling Variants for Evaluating Dialectal Speech Recognition

Sep 21, 2017
Ahmed Ali, Preslav Nakov, Peter Bell, Steve Renals

Semantic-preserved Communication System for Highly Efficient Speech Transmission

May 25, 2022
Tianxiao Han, Qianqian Yang, Zhiguo Shi, Shibo He, Zhaoyang Zhang

DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering

Mar 26, 2022
Guan-Ting Lin, Yung-Sung Chuang, Ho-Lam Chung, Shu-wen Yang, Hsuan-Jui Chen, Shuyan Dong, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Lin-shan Lee

WaBERT: A Low-resource End-to-end Model for Spoken Language Understanding and Speech-to-BERT Alignment

Apr 22, 2022
Lin Yao, Jianfei Song, Ruizhuo Xu, Yingfang Yang, Zijian Chen, Yafeng Deng
