Xuehai He

Latent Recurrent Transformer: Architecture Exploration, Training Strategies, and Scaling Behavior

May 26, 2026

Self-Evolving 3D Scene Generation from a Single Image

Dec 09, 2025

MorphoSim: An Interactive, Controllable, and Editable Language-guided 4D World Simulator

Oct 05, 2025

GRIT: Teaching MLLMs to Think with Images

May 21, 2025

Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space

May 21, 2025

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Apr 29, 2025

Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation

Dec 17, 2024

Mojito: Motion Trajectory and Intensity Control for Video Generation

Dec 12, 2024

EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing

Oct 03, 2024

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

Jun 12, 2024