Weijie Kong

Refer to the report for detailed contributions

OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning

Mar 25, 2026

Manifold-Aware Exploration for Reinforcement Learning in Video Generation

Mar 23, 2026

IG-RFT: An Interaction-Guided RL Framework for VLA Models in Long-Horizon Robotic Manipulation

Feb 24, 2026

F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Sep 09, 2025

Hunyuan-Game: Industrial-grade Intelligent Game Creation Model

May 20, 2025

HunyuanVideo: A Systematic Framework For Large Video Generative Models

Dec 03, 2024

Global and Local Semantic Completion Learning for Vision-Language Pre-training

Jun 12, 2023

Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning

Nov 24, 2022

Egocentric Video-Language Pretraining @ Ego4D Challenge 2022

Jul 04, 2022

Egocentric Video-Language Pretraining @ EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022

Jul 04, 2022