"speech": models, code, and papers

Transfer Learning from Whisper for Microscopic Intelligibility Prediction

Apr 02, 2024
Paul Best, Santiago Cuervo, Ricard Marxer

DANCER: Entity Description Augmented Named Entity Corrector for Automatic Speech Recognition

Mar 28, 2024
Yi-Cheng Wang, Hsin-Wei Wang, Bi-Cheng Yan, Chi-Han Lin, Berlin Chen

The VoicePrivacy 2024 Challenge Evaluation Plan

Apr 03, 2024
Natalia Tomashenko, Xiaoxiao Miao, Pierre Champion, Sarina Meyer, Xin Wang, Emmanuel Vincent, Michele Panariello, Nicholas Evans, Junichi Yamagishi, Massimiliano Todisco

A Multimodal Approach to Device-Directed Speech Detection with Large Language Models

Mar 26, 2024
Dominik Wagner, Alexander Churchill, Siddharth Sigtia, Panayiotis Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi

Attempt Towards Stress Transfer in Speech-to-Speech Machine Translation

Mar 07, 2024
Sai Akarsh, Vamshi Raghusimha, Anindita Mondal, Anil Vuppala

Voice Conversion Augmentation for Speaker Recognition on Defective Datasets

Apr 01, 2024
Ruijie Tao, Zhan Shi, Yidi Jiang, Tianchi Liu, Haizhou Li

An inclusive review on deep learning techniques and their scope in handwriting recognition

Apr 10, 2024
Sukhdeep Singh, Sudhir Rohilla, Anuj Sharma

Hierarchical Recurrent Adapters for Efficient Multi-Task Adaptation of Large Speech Models

Mar 25, 2024
Tsendsuren Munkhdalai, Youzheng Chen, Khe Chai Sim, Fadi Biadsy, Tara Sainath, Pedro Moreno Mengibar

Robust Active Speaker Detection in Noisy Environments

Mar 30, 2024
Siva Sai Nagender Vasireddy, Chenxu Zhang, Xiaohu Guo, Yapeng Tian

OPSD: an Offensive Persian Social media Dataset and its baseline evaluations

Apr 08, 2024
Mehran Safayani, Amir Sartipi, Amir Hossein Ahmadi, Parniyan Jalali, Amir Hossein Mansouri, Mohammad Bisheh-Niasar, Zahra Pourbahman
