Han Cai

ActionMap: Robot Policy Learning via Voxel Action Heatmap

Jun 05, 2026
Via arXiv

Cosmos 3: Omnimodal World Models for Physical AI

Jun 01, 2026
Via arXiv

JetViT: Efficient High-Resolution Vision Transformer with Post-Training Attention Search

May 26, 2026
Via arXiv

Hide to Guide: Learning via Semantic Masking

May 24, 2026
Via arXiv

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

May 13, 2026
Via arXiv

Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization

Feb 03, 2026
Via arXiv

Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow

Jan 20, 2026
Via arXiv

Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs

May 26, 2025
Via arXiv

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

May 24, 2025
Via arXiv

Scaling Vision Pre-Training to 4K Resolution

Mar 25, 2025
Via arXiv