Multimodal Emotion Recognition


Multimodal emotion recognition is the task of inferring a person's emotional state by combining signals from several modalities, such as speech, text, and facial expressions.
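As a rough illustration of what combining modalities can mean in practice, the sketch below shows late fusion, one common approach: each modality produces its own probability distribution over emotion labels, and the distributions are averaged. The label set, the modality names, and the scores are all hypothetical and are not taken from any paper listed here.

```python
# Minimal late-fusion sketch. Everything here (labels, modality names,
# scores) is a made-up assumption for illustration only.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def late_fusion(modality_probs, weights=None):
    """Combine per-modality probability vectors by weighted averaging."""
    names = list(modality_probs)
    if weights is None:
        # Default: every modality contributes equally.
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    return [
        sum(weights[name] * modality_probs[name][i] for name in names) / total
        for i in range(len(EMOTIONS))
    ]


def predict(modality_probs, weights=None):
    """Return the highest-scoring label and the fused distribution."""
    fused = late_fusion(modality_probs, weights)
    best = max(range(len(EMOTIONS)), key=lambda i: fused[i])
    return EMOTIONS[best], fused


# Hypothetical outputs of three unimodal classifiers for one utterance.
example = {
    "speech": [0.10, 0.60, 0.20, 0.10],
    "text": [0.05, 0.30, 0.55, 0.10],
    "face": [0.05, 0.70, 0.15, 0.10],
}
label, fused = predict(example)
print(label, [round(p, 3) for p in fused])
```

In this example the text classifier alone would favor "neutral", but the speech and face scores pull the fused prediction toward "happy". Passing a `weights` dictionary lets one modality count more than the others.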

Beyond Words: Enhancing Desire, Emotion, and Sentiment Recognition with Non-Verbal Cues

Sep 19, 2025, via arXiv

EmoQ: Speech Emotion Recognition via Speech-Aware Q-Former and Large Language Model

Sep 19, 2025, via arXiv

Are Multimodal Foundation Models All That Is Needed for Emofake Detection?

Sep 19, 2025, via arXiv

Speech Emotion Recognition via Entropy-Aware Score Selection

Aug 28, 2025, via arXiv

Social-MAE: A Transformer-Based Multimodal Autoencoder for Face and Voice

Aug 24, 2025, via arXiv

Beyond Emotion Recognition: A Multi-Turn Multimodal Emotion Understanding and Reasoning Benchmark

Aug 23, 2025, via arXiv

LPGNet: A Lightweight Network with Parallel Attention and Gated Fusion for Multimodal Emotion Recognition

Aug 12, 2025, via arXiv

A Trustworthy Method for Multimodal Emotion Recognition

Aug 11, 2025, via arXiv

Arabic Multimodal Machine Learning: Datasets, Applications, Approaches, and Challenges

Aug 17, 2025, via arXiv

Silicon Minds versus Human Hearts: The Wisdom of Crowds Beats the Wisdom of AI in Emotion Recognition

Aug 12, 2025, via arXiv