Danny Driess

Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers
Mar 19, 2024

PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
Feb 12, 2024

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities
Jan 22, 2024

Foundation Models in Robotics: Applications, Challenges, and the Future
Dec 13, 2023

Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Oct 17, 2023

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Jul 28, 2023

Towards Generalist Biomedical AI
Jul 26, 2023

Large Language Models as General Pattern Machines
Jul 10, 2023

PaLM-E: An Embodied Multimodal Language Model
Mar 06, 2023

Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control
Mar 01, 2023