Yongkang Zhang

MetaWorld-X: Hierarchical World Modeling via VLM-Orchestrated Experts for Humanoid Loco-Manipulation

Mar 09, 2026

Order Is Not Layout: Order-to-Space Bias in Image Generation

Mar 04, 2026

Learning with Challenges: Adaptive Difficulty-Aware Data Generation for Mobile GUI Agent Training

Jan 30, 2026

MVSS: A Unified Framework for Multi-View Structured Survey Generation

Jan 14, 2026

Temporal Transformer Networks with Self-Supervision for Action Recognition

Dec 17, 2021