"speech": models, code, and papers

Speech Emotion Diarization: Which Emotion Appears When?

Jun 22, 2023
Yingzhi Wang, Mirco Ravanelli, Alaa Nfissi, Alya Yacoubi

Via arXiv

Capturing Spectral and Long-term Contextual Information for Speech Emotion Recognition Using Deep Learning Techniques

Aug 04, 2023
Samiul Islam, Md. Maksudul Haque, Abu Jobayer Md. Sadat

Via arXiv

The CHiME-7 UDASE task: Unsupervised domain adaptation for conversational speech enhancement

Jul 07, 2023
Simon Leglaive, Léonie Borne, Efthymios Tzinis, Mostafa Sadeghi, Matthieu Fraticelli, Scott Wisdom, Manuel Pariente, Daniel Pressnitzer, John R. Hershey

Via arXiv

Hybrid Attention-based Encoder-decoder Model for Efficient Language Model Adaptation

Sep 14, 2023
Shaoshi Ling, Guoli Ye, Rui Zhao, Yifan Gong

Via arXiv

NarrativePlay: Interactive Narrative Understanding

Oct 02, 2023
Runcong Zhao, Wenjia Zhang, Jiazheng Li, Lixing Zhu, Yanran Li, Yulan He, Lin Gui

Via arXiv

All-for-One and One-For-All: Deep learning-based feature fusion for Synthetic Speech Detection

Jul 28, 2023
Daniele Mari, Davide Salvi, Paolo Bestagini, Simone Milani

Via arXiv

Speech and noise dual-stream spectrogram refine network with speech distortion loss for robust speech recognition

May 30, 2023
Haoyu Lu, Nan Li, Tongtong Song, Longbiao Wang, Jianwu Dang, Xiaobao Wang, Shiliang Zhang

Via arXiv

Can Large Language Models Aid in Annotating Speech Emotional Data? Uncovering New Frontiers

Jul 12, 2023
Siddique Latif, Muhammad Usama, Mohammad Ibrahim Malik, Björn W. Schuller

Via arXiv

AutoAD II: The Sequel -- Who, When, and What in Movie Audio Description

Oct 10, 2023
Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman

Via arXiv

Sparse Finetuning for Inference Acceleration of Large Language Models

Oct 10, 2023
Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, Dan Alistarh

Via arXiv