Zitong Yu

Retrieving to Recover: Towards Incomplete Audio-Visual Question Answering via Semantic-consistent Purification

Apr 14, 2026
Via arXiv

AffectAgent: Collaborative Multi-Agent Reasoning for Retrieval-Augmented Multimodal Emotion Recognition

Apr 14, 2026
Via arXiv

YUV20K: A Complexity-Driven Benchmark and Trajectory-Aware Alignment Model for Video Camouflaged Object Detection

Apr 11, 2026
Via arXiv

SVC 2026: the Second Multimodal Deception Detection Challenge and the First Domain Generalized Remote Physiological Measurement Challenge

Apr 07, 2026
Via arXiv

FreqPhys: Repurposing Implicit Physiological Frequency Prior for Robust Remote Photoplethysmography

Apr 01, 2026
Via arXiv

GazeCLIP: Gaze-Guided CLIP with Adaptive-Enhanced Fine-Grained Language Prompt for Deepfake Attribution and Detection

Mar 31, 2026
Via arXiv

ForensicZip: More Tokens are Better but Not Necessary in Forensic Vision-Language Models

Mar 12, 2026
Via arXiv

AULLM++: Structural Reasoning with Large Language Models for Micro-Expression Recognition

Mar 09, 2026
Via arXiv

$Δ$VLA: Prior-Guided Vision-Language-Action Models via World Knowledge Variation

Mar 09, 2026
Via arXiv

High-Resolution Underwater Camouflaged Object Detection: GBU-UCOD Dataset and Topology-Aware and Frequency-Decoupled Networks

Feb 03, 2026
Via arXiv