Zhenzhen Hu

Bidirectional Learning of Facial Action Units and Expressions via Structured Semantic Mapping across Heterogeneous Datasets

Apr 12, 2026, via arXiv

Disentangling Foreground and Background for Vision-Language Navigation via Online Augmentation

Oct 01, 2025, via arXiv

Generalizable Engagement Estimation in Conversation via Domain Prompting and Parallel Attention

Aug 20, 2025, via arXiv

Traits Run Deep: Enhancing Personality Assessment via Psychology-Guided LLM Representations and Multimodal Apparent Behaviors

Jul 30, 2025, via arXiv

Listening to the Unspoken: Exploring 365 Aspects of Multimodal Interview Performance Assessment

Jul 30, 2025, via arXiv

Rebalancing Contrastive Alignment with Learnable Semantic Gaps in Text-Video Retrieval

May 18, 2025, via arXiv

Concept Drift Guided LayerNorm Tuning for Efficient Multimodal Metaphor Identification

May 16, 2025, via arXiv

VAEmo: Efficient Representation Learning for Visual-Audio Emotion with Knowledge Injection

May 05, 2025, via arXiv

PhysioSync: Temporal and Cross-Modal Contrastive Learning Inspired by Physiological Synchronization for EEG-Based Emotion Recognition

Apr 24, 2025, via arXiv

Video Flow as Time Series: Discovering Temporal Consistency and Variability for VideoQA

Apr 08, 2025, via arXiv