Haotian Yang

Asynchronous Fast-Slow Vision-Language-Action Policies for Whole-Body Robotic Manipulation

Dec 23, 2025
Via arXiv

VIVA: VLM-Guided Instruction-Based Video Editing with Reward Optimization

Dec 18, 2025
Via arXiv

Mind to Hand: Purposeful Robotic Control via Embodied Reasoning

Dec 10, 2025
Via arXiv

A Hybrid Force-Position Strategy for Shape Control of Deformable Linear Objects With Graph Attention Networks

Aug 10, 2025
Via arXiv

Imbalance in Balance: Online Concept Balancing in Generation Models

Jul 17, 2025
Via arXiv

An Empirical Study of the Impact of Federated Learning on Machine Learning Model Accuracy

Mar 27, 2025
Via arXiv

DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

Mar 18, 2025
Via arXiv

Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content

Oct 10, 2024
Via arXiv

VideoTetris: Towards Compositional Text-to-Video Generation

Jun 06, 2024
Via arXiv

Text-Driven Diverse Facial Texture Generation via Progressive Latent-Space Refinement

Apr 15, 2024
Via arXiv