Gao Huang

Steering Visual Generation in Unified Multimodal Models with Understanding Supervision

May 07, 2026
Via arXiv

Linear-Time Global Visual Modeling without Explicit Attention

May 06, 2026
Via arXiv

Linearizing Vision Transformer with Test-Time Training

May 04, 2026
Via arXiv

Refinement via Regeneration: Enlarging Modification Space Boosts Image Refinement in Unified Multimodal Models

Apr 28, 2026
Via arXiv

Bridging the RGB-IR Gap: Consensus and Discrepancy Modeling for Text-Guided Multispectral Detection

Apr 13, 2026
Via arXiv

MAG-3D: Multi-Agent Grounded Reasoning for 3D Understanding

Apr 10, 2026
Via arXiv

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Mar 19, 2026
Via arXiv

UltraStar: Semantic-Aware Star Graph Modeling for Echocardiography Navigation

Mar 02, 2026
Via arXiv

TwinRL-VLA: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation

Feb 09, 2026
Via arXiv

SiameseNorm: Breaking the Barrier to Reconciling Pre/Post-Norm

Feb 08, 2026
Via arXiv