Cordelia Schmid

Modular Visual Question Answering via Code Generation

Jun 08, 2023
Sanjay Subramanian, Medhini Narasimhan, Kushal Khangaonkar, Kevin Yang, Arsha Nagrani, Cordelia Schmid, Andy Zeng, Trevor Darrell, Dan Klein

Learning Video-Conditioned Policies for Unseen Manipulation Tasks

May 10, 2023
Elliot Chane-Sane, Cordelia Schmid, Ivan Laptev

End-to-End Spatio-Temporal Action Localisation with Video Transformers

Apr 24, 2023
Alexey Gritsenko, Xuehan Xiong, Josip Djolonga, Mostafa Dehghani, Chen Sun, Mario Lučić, Cordelia Schmid, Anurag Arnab

gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction

Apr 24, 2023
Zerui Chen, Shizhe Chen, Cordelia Schmid, Ivan Laptev

Verbs in Action: Improving verb understanding in video-language models

Apr 13, 2023
Liliane Momeni, Mathilde Caron, Arsha Nagrani, Andrew Zisserman, Cordelia Schmid

Contact Models in Robotics: a Comparative Analysis

Apr 13, 2023
Quentin Le Lidec, Wilson Jallet, Louis Montaut, Ivan Laptev, Cordelia Schmid, Justin Carpentier

Improving Image Recognition by Retrieving from Web-Scale Image-Text Data

Apr 11, 2023
Ahmet Iscen, Alireza Fathi, Cordelia Schmid

Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval

Apr 06, 2023
Jae Myung Kim, A. Sophia Koepke, Cordelia Schmid, Zeynep Akata

Bridging the Gap between Model Explanations in Partially Annotated Multi-label Classification

Apr 04, 2023
Youngwook Kim, Jae Myung Kim, Jieun Jeong, Cordelia Schmid, Zeynep Akata, Jungwoo Lee

AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR

Mar 29, 2023
Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
