"speech": models, code, and papers

Keyword Augmented Retrieval: Novel framework for Information Retrieval integrated with speech interface

Oct 06, 2023
Anupam Purwar, Rahul Sundar

Via arXiv

Decoding Emotions: A comprehensive Multilingual Study of Speech Models for Speech Emotion Recognition

Aug 17, 2023
Anant Singh, Akshat Gupta

Via arXiv

Enhancing Multilingual Speech Recognition through Language Prompt Tuning and Frame-Level Language Adapter

Sep 19, 2023
Song Li, Yongbin You, Xuezhi Wang, Ke Ding, Guanglu Wan

Via arXiv

AV2Wav: Diffusion-Based Re-synthesis from Continuous Self-supervised Features for Audio-Visual Speech Enhancement

Sep 14, 2023
Ju-Chieh Chou, Chung-Ming Chien, Karen Livescu

Via arXiv

Effect of Attention and Self-Supervised Speech Embeddings on Non-Semantic Speech Tasks

Aug 28, 2023
Payal Mohapatra, Akash Pandey, Yueyuan Sui, Qi Zhu

Via arXiv

A Unified Framework for Multimodal, Multi-Part Human Motion Synthesis

Nov 28, 2023
Zixiang Zhou, Yu Wan, Baoyuan Wang

Via arXiv

Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions

Sep 16, 2023
Heming Wang, Meng Yu, Hao Zhang, Chunlei Zhang, Zhongweiyang Xu, Muqiao Yang, Yixuan Zhang, Dong Yu

Via arXiv

TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis

Nov 29, 2023
Ali Najafi, Onur Varol

Via arXiv

tinyCLAP: Distilling Constrastive Language-Audio Pretrained Models

Nov 24, 2023
Francesco Paissan, Elisabetta Farella

Via arXiv

Rep2wav: Noise Robust text-to-speech Using self-supervised representations

Sep 04, 2023
Qiushi Zhu, Yu Gu, Rilin Chen, Chao Weng, Yuchen Hu, Lirong Dai, Jie Zhang

Via arXiv