Hao Fei

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

Apr 17, 2025
Via arXiv

Probing then Editing Response Personality of Large Language Models

Apr 14, 2025
Via arXiv

VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence

Apr 03, 2025
Via arXiv

Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

Mar 31, 2025
Via arXiv

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

Mar 30, 2025
Via arXiv

Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology

Mar 19, 2025
Via arXiv

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

Mar 19, 2025
Via arXiv

Universal Scene Graph Generation

Mar 19, 2025
Via arXiv

Multi-Granular Multimodal Clue Fusion for Meme Understanding

Mar 16, 2025
Via arXiv

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Mar 16, 2025
Via arXiv