Hao Fei

Code-MIE: A Code-style Model for Multimodal Information Extraction with Scene Graph and Entity Attribute Knowledge Enhancement

Mar 21, 2026
Via arXiv

GraphiContact: Pose-aware Human-Scene Robust Contact Perception for Interactive Systems

Mar 19, 2026
Via arXiv

UniM: A Unified Any-to-Any Interleaved Multimodal Benchmark

Mar 05, 2026
Via arXiv

Orthogonal Spatial-temporal Distributional Transfer for 4D Generation

Mar 05, 2026
Via arXiv

Spatial Causal Prediction in Video

Mar 04, 2026
Via arXiv

Modeling Cross-vision Synergy for Unified Large Vision Model

Mar 03, 2026
Via arXiv

Synergizing Understanding and Generation with Interleaved Analyzing-Drafting Thinking

Feb 24, 2026
Via arXiv

JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation

Feb 22, 2026
Via arXiv

Unveiling the Cognitive Compass: Theory-of-Mind-Guided Multimodal Emotion Reasoning

Feb 01, 2026
Via arXiv

SAMTok: Representing Any Mask with Two Words

Jan 22, 2026
Via arXiv