
Pete Florence

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

Jan 22, 2024
Boyuan Chen, Zhuo Xu, Sean Kirmani, Brian Ichter, Danny Driess, Pete Florence, Dorsa Sadigh, Leonidas Guibas, Fei Xia

RoboVQA: Multimodal Long-Horizon Reasoning for Robotics

Nov 01, 2023
Pierre Sermanet, Tianli Ding, Jeffrey Zhao, Fei Xia, Debidatta Dwibedi, Keerthana Gopalakrishnan, Christine Chan, Gabriel Dulac-Arnold, Sharath Maddineni, Nikhil J Joshi, Pete Florence, Wei Han, Robert Baruch, Yao Lu, Suvir Mirchandani, Peng Xu, Pannag Sanketi, Karol Hausman, Izhak Shafran, Brian Ichter, Yuan Cao

Video Language Planning

Oct 16, 2023
Yilun Du, Mengjiao Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, Tianhe Yu, Pieter Abbeel, Joshua B. Tenenbaum, Leslie Kaelbling, Andy Zeng, Jonathan Tompson

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

Jul 28, 2023
Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, Pete Florence, Chuyuan Fu, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Kehang Han, Karol Hausman, Alexander Herzog, Jasmine Hsu, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Lisa Lee, Tsang-Wei Edward Lee, Sergey Levine, Yao Lu, Henryk Michalewski, Igor Mordatch, Karl Pertsch, Kanishka Rao, Krista Reymann, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Pierre Sermanet, Jaspiar Singh, Anikait Singh, Radu Soricut, Huong Tran, Vincent Vanhoucke, Quan Vuong, Ayzaan Wahid, Stefan Welker, Paul Wohlhart, Jialin Wu, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich

Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition

Jul 26, 2023
Huy Ha, Pete Florence, Shuran Song

Towards Generalist Biomedical AI

Jul 26, 2023
Tao Tu, Shekoofeh Azizi, Danny Driess, Mike Schaekermann, Mohamed Amin, Pi-Chuan Chang, Andrew Carroll, Chuck Lau, Ryutaro Tanno, Ira Ktena, Basil Mustafa, Aakanksha Chowdhery, Yun Liu, Simon Kornblith, David Fleet, Philip Mansfield, Sushant Prakash, Renee Wong, Sunny Virmani, Christopher Semturs, S Sara Mahdavi, Bradley Green, Ewa Dominowska, Blaise Aguera y Arcas, Joelle Barral, Dale Webster, Greg S. Corrado, Yossi Matias, Karan Singhal, Pete Florence, Alan Karthikesalingam, Vivek Natarajan

Large Language Models as General Pattern Machines

Jul 10, 2023
Suvir Mirchandani, Fei Xia, Pete Florence, Brian Ichter, Danny Driess, Montserrat Gonzalez Arenas, Kanishka Rao, Dorsa Sadigh, Andy Zeng

RoboPianist: A Benchmark for High-Dimensional Robot Control

Apr 09, 2023
Kevin Zakka, Laura Smith, Nimrod Gileadi, Taylor Howell, Xue Bin Peng, Sumeet Singh, Yuval Tassa, Pete Florence, Andy Zeng, Pieter Abbeel

PaLM-E: An Embodied Multimodal Language Model

Mar 06, 2023
Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence

Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control

Mar 01, 2023
Wenlong Huang, Fei Xia, Dhruv Shah, Danny Driess, Andy Zeng, Yao Lu, Pete Florence, Igor Mordatch, Sergey Levine, Karol Hausman, Brian Ichter
