Pierre Sermanet

Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers

Mar 19, 2024
Via arXiv

RT-H: Action Hierarchies Using Language

Mar 04, 2024
Via arXiv

AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

Jan 23, 2024
Via arXiv

RoboVQA: Multimodal Long-Horizon Reasoning for Robotics

Nov 01, 2023
Via arXiv

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Oct 17, 2023
Via arXiv

Video Language Planning

Oct 16, 2023
Via arXiv

Robotic Table Tennis: A Case Study into a High Speed Learning System

Sep 06, 2023
Via arXiv

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

Jul 28, 2023
Via arXiv

PaLM-E: An Embodied Multimodal Language Model

Mar 06, 2023
Via arXiv

Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models

Nov 22, 2022
Via arXiv