"speech": models, code, and papers

Voice Conversion Augmentation for Speaker Recognition on Defective Datasets

Apr 01, 2024
Ruijie Tao, Zhan Shi, Yidi Jiang, Tianchi Liu, Haizhou Li

Attempt Towards Stress Transfer in Speech-to-Speech Machine Translation

Mar 07, 2024
Sai Akarsh, Vamshi Raghusimha, Anindita Mondal, Anil Vuppala

Robust Active Speaker Detection in Noisy Environments

Mar 30, 2024
Siva Sai Nagender Vasireddy, Chenxu Zhang, Xiaohu Guo, Yapeng Tian

Hierarchical Recurrent Adapters for Efficient Multi-Task Adaptation of Large Speech Models

Mar 25, 2024
Tsendsuren Munkhdalai, Youzheng Chen, Khe Chai Sim, Fadi Biadsy, Tara Sainath, Pedro Moreno Mengibar

OPSD: an Offensive Persian Social media Dataset and its baseline evaluations

Apr 08, 2024
Mehran Safayani, Amir Sartipi, Amir Hossein Ahmadi, Parniyan Jalali, Amir Hossein Mansouri, Mohammad Bisheh-Niasar, Zahra Pourbahman

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Apr 09, 2024
Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Kunio Kashino

Accuracy enhancement method for speech emotion recognition from spectrogram using temporal frequency correlation and positional information learning through knowledge transfer

Mar 26, 2024
Jeong-Yoon Kim, Seung-Ho Lee

Speech-driven Personalized Gesture Synthetics: Harnessing Automatic Fuzzy Feature Inference

Mar 16, 2024
Fan Zhang, Zhaohan Wang, Xin Lyu, Siyuan Zhao, Mengjian Li, Weidong Geng, Naye Ji, Hui Du, Fuxing Gao, Hao Wu, Shunman Li

Energy-Based Models with Applications to Speech and Language Processing

Mar 16, 2024
Zhijian Ou

A Morphology-Based Investigation of Positional Encodings

Apr 06, 2024
Poulami Ghosh, Shikhar Vashishth, Raj Dabre, Pushpak Bhattacharyya
