Yilun Du

Derek

Disentangled Acoustic Fields For Multimodal Physical Scene Understanding

Jul 16, 2024

Potential Based Diffusion Motion Planning

Jul 08, 2024

Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion

Jul 02, 2024

Compositional Image Decomposition with Diffusion Models

Jun 27, 2024

"Set It Up!": Functional Object Arrangement with Compositional Generative Models

May 20, 2024

Towards Generalist Robot Learning from Internet Video: A Survey

Apr 30, 2024

RoboDreamer: Learning Compositional World Models for Robot Imagination

Apr 18, 2024

COMBO: Compositional World Models for Embodied Multi-Agent Cooperation

Apr 16, 2024

3D-VLA: A 3D Vision-Language-Action Generative World Model

Mar 14, 2024

Video as the New Language for Real-World Decision Making

Feb 27, 2024