"speech": models, code, and papers

Enhancing dysarthria speech feature representation with empirical mode decomposition and Walsh-Hadamard transform

Dec 30, 2023
Ting Zhu, Shufei Duan, Camille Dingam, Huizhi Liang, Wei Zhang

A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

Jan 05, 2024
Dongdi Zhao, Jianbo Ma, Lu Lu, Jinke Li, Xuan Ji, Lei Zhu, Fuming Fang, Ming Liu, Feijun Jiang

Intelli-Z: Toward Intelligible Zero-Shot TTS

Jan 25, 2024
Sunghee Jung, Won Jang, Jaesam Yoon, Bongwan Kim

Music Auto-Tagging with Robust Music Representation Learned via Domain Adversarial Training

Jan 27, 2024
Haesun Joung, Kyogu Lee

SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering

Jan 24, 2024
Chyi-Jiunn Lin, Guan-Ting Lin, Yung-Sung Chuang, Wei-Lun Wu, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Lin-shan Lee

MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction

Jan 25, 2024
Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara

Adaptive scheduling for adaptive sampling in POS taggers construction

Feb 04, 2024
Manuel Vilares Ferro, Victor M. Darriba Bilbao, Jesús Vilares Ferro

The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in CNVSRC 2023

Jan 07, 2024
He Wang, Pengcheng Guo, Wei Chen, Pan Zhou, Lei Xie

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations

Dec 22, 2023
Cheng Gong, Xin Wang, Erica Cooper, Dan Wells, Longbiao Wang, Jianwu Dang, Korin Richmond, Junichi Yamagishi

Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation

Jan 01, 2024
Huimeng Wang, Zengrui Jin, Mengzhe Geng, Shujie Hu, Guinan Li, Tianzi Wang, Haoning Xu, Xunying Liu
