Hao Fei

SceneParser: Hierarchical Scene Parsing for Visual Semantics Understanding

May 14, 2026
Via arXiv

RADAR: Redundancy-Aware Diffusion for Multi-Agent Communication Structure Generation

May 11, 2026
Via arXiv

Audio-Visual Intelligence in Large Foundation Models

May 05, 2026
Via arXiv

Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment

Apr 21, 2026
Via arXiv

SOAR: Self-Correction for Optimal Alignment and Refinement in Diffusion Models

Apr 14, 2026
Via arXiv

Code-MIE: A Code-style Model for Multimodal Information Extraction with Scene Graph and Entity Attribute Knowledge Enhancement

Mar 21, 2026
Via arXiv

GraphiContact: Pose-aware Human-Scene Robust Contact Perception for Interactive Systems

Mar 19, 2026
Via arXiv

Orthogonal Spatial-temporal Distributional Transfer for 4D Generation

Mar 05, 2026
Via arXiv

UniM: A Unified Any-to-Any Interleaved Multimodal Benchmark

Mar 05, 2026
Via arXiv

Spatial Causal Prediction in Video

Mar 04, 2026
Via arXiv