Hao Tang

Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models

Apr 24, 2025
Via arXiv

Multimodal Perception for Goal-oriented Navigation: A Survey

Apr 22, 2025
Via arXiv

EventVAD: Training-Free Event-Aware Video Anomaly Detection

Apr 17, 2025
Via arXiv

3D CoCa: Contrastive Learners are 3D Captioners

Apr 13, 2025
Via arXiv

Multi-scale Activation, Refinement, and Aggregation: Exploring Diverse Cues for Fine-Grained Bird Recognition

Apr 12, 2025
Via arXiv

Follow Your Motion: A Generic Temporal Consistency Portrait Editing Framework with Trajectory Guidance

Mar 28, 2025
Via arXiv

PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model

Mar 25, 2025
Via arXiv

HOIGPT: Learning Long Sequence Hand-Object Interaction with Language Models

Mar 24, 2025
Via arXiv

Beyond Semantics: Rediscovering Spatial Awareness in Vision-Language Models

Mar 21, 2025
Via arXiv

MambaIC: State Space Models for High-Performance Learned Image Compression

Mar 16, 2025
Via arXiv