Zhenzhen Hu

Generalizable Engagement Estimation in Conversation via Domain Prompting and Parallel Attention

Aug 20, 2025
Via arXiv

Listening to the Unspoken: Exploring 365 Aspects of Multimodal Interview Performance Assessment

Jul 30, 2025
Via arXiv

Traits Run Deep: Enhancing Personality Assessment via Psychology-Guided LLM Representations and Multimodal Apparent Behaviors

Jul 30, 2025
Via arXiv

Rebalancing Contrastive Alignment with Learnable Semantic Gaps in Text-Video Retrieval

May 18, 2025
Via arXiv

Concept Drift Guided LayerNorm Tuning for Efficient Multimodal Metaphor Identification

May 16, 2025
Via arXiv

VAEmo: Efficient Representation Learning for Visual-Audio Emotion with Knowledge Injection

May 05, 2025
Via arXiv

PhysioSync: Temporal and Cross-Modal Contrastive Learning Inspired by Physiological Synchronization for EEG-Based Emotion Recognition

Apr 24, 2025
Via arXiv

Video Flow as Time Series: Discovering Temporal Consistency and Variability for VideoQA

Apr 08, 2025
Via arXiv

Agent Journey Beyond RGB: Unveiling Hybrid Semantic-Spatial Environmental Representations for Vision-and-Language Navigation

Dec 10, 2024
Via arXiv

Decomposing Relationship from 1-to-N into N 1-to-1 for Text-Video Retrieval

Oct 09, 2024
Via arXiv