"speech": models, code, and papers

SE Territory: Monaural Speech Enhancement Meets the Fixed Virtual Perceptual Space Mapping

Nov 03, 2023
Xinmeng Xu, Jibin Wu, Xiaoyong Wei, Yan Liu, Richard So, Yuhong Yang, Weiping Tu, Kay Chen Tan

Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization

Dec 07, 2023
Huan Zhao, Li Zhang, Yue Li, Yannan Wang, Hongji Wang, Wei Rao, Qing Wang, Lei Xie

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Jan 03, 2024
Evonne Ng, Javier Romero, Timur Bagautdinov, Shaojie Bai, Trevor Darrell, Angjoo Kanazawa, Alexander Richard

Grounding-Prompter: Prompting LLM with Multimodal Information for Temporal Sentence Grounding in Long Videos

Dec 28, 2023
Houlun Chen, Xin Wang, Hong Chen, Zihan Song, Jia Jia, Wenwu Zhu

A comparative analysis between Conformer-Transducer, Whisper, and wav2vec2 for improving the child speech recognition

Nov 07, 2023
Andrei Barcovschi, Rishabh Jain, Peter Corcoran

Whisper in Focus: Enhancing Stuttered Speech Classification with Encoder Layer Optimization

Nov 09, 2023
Huma Ameer, Seemab Latif, Rabia Latif, Sana Mukhtar

GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse

Jan 07, 2024
Hongzhan Lin, Ziyang Luo, Bo Wang, Ruichao Yang, Jing Ma

Selective HuBERT: Self-Supervised Pre-Training for Target Speaker in Clean and Mixture Speech

Nov 08, 2023
Jingru Lin, Meng Ge, Wupeng Wang, Haizhou Li, Mengling Feng

Nonlinear functional regression by functional deep neural network with kernel embedding

Jan 05, 2024
Zhongjie Shi, Jun Fan, Linhao Song, Ding-Xuan Zhou, Johan A. K. Suykens

Fine-tuning convergence model in Bengali speech recognition

Nov 07, 2023
Zhu Ruiying, Shen Meng
