Yusuf Aytar

FlexCap: Generating Rich, Localized, and Flexible Captions in Images

Mar 18, 2024
Debidatta Dwibedi, Vidhi Jain, Jonathan Tompson, Andrew Zisserman, Yusuf Aytar

Genie: Generative Interactive Environments

Feb 23, 2024
Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, Yusuf Aytar, Sarah Bechtle, Feryal Behbahani, Stephanie Chan, Nicolas Heess, Lucy Gonzalez, Simon Osindero, Sherjil Ozair, Scott Reed, Jingwei Zhang, Konrad Zolna, Jeff Clune, Nando de Freitas, Satinder Singh, Tim Rocktäschel

Learning from One Continuous Video Stream

Dec 01, 2023
João Carreira, Michael King, Viorica Pătrăucean, Dilara Gokay, Cătălin Ionescu, Yi Yang, Daniel Zoran, Joseph Heyward, Carl Doersch, Yusuf Aytar, Dima Damen, Andrew Zisserman

RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation

Aug 31, 2023
Mel Vecerik, Carl Doersch, Yi Yang, Todor Davchev, Yusuf Aytar, Guangyao Zhou, Raia Hadsell, Lourdes Agapito, Jon Scholz

RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation

Jun 20, 2023
Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauza, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Żołna, Scott Reed, Sergio Gómez Colmenarejo, Jon Scholz, Abbas Abdolmaleki, Oliver Groth, Jean-Baptiste Regli, Oleg Sushkov, Tom Rothörl, José Enrique Chen, Yusuf Aytar, Dave Barker, Joy Ortiz, Martin Riedmiller, Jost Tobias Springenberg, Raia Hadsell, Francesco Nori, Nicolas Heess

TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement

Jun 14, 2023
Carl Doersch, Yi Yang, Mel Vecerik, Dilara Gokay, Ankush Gupta, Yusuf Aytar, João Carreira, Andrew Zisserman

Perception Test: A Diagnostic Benchmark for Multimodal Video Models

May 23, 2023
Viorica Pătrăucean, Lucas Smaira, Ankush Gupta, Adrià Recasens Continente, Larisa Markeeva, Dylan Banarse, Skanda Koppula, Joseph Heyward, Mateusz Malinowski, Yi Yang, Carl Doersch, Tatiana Matejovicova, Yury Sulsky, Antoine Miech, Alex Frechette, Hanna Klimczak, Raphael Koster, Junlin Zhang, Stephanie Winkler, Yusuf Aytar, Simon Osindero, Dima Damen, Andrew Zisserman, João Carreira

Lossless Adaptation of Pretrained Vision Models For Robotic Manipulation

Apr 13, 2023
Mohit Sharma, Claudio Fantacci, Yuxiang Zhou, Skanda Koppula, Nicolas Heess, Jon Scholz, Yusuf Aytar

TAP-Vid: A Benchmark for Tracking Any Point in a Video

Nov 07, 2022
Carl Doersch, Ankush Gupta, Larisa Markeeva, Adrià Recasens, Lucas Smaira, Yusuf Aytar, João Carreira, Andrew Zisserman, Yi Yang

Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies

Dec 09, 2021
Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar, Josh Merel, Nicolas Heess, Raia Hadsell
