Cordelia Schmid

Improving Image Recognition by Retrieving from Web-Scale Image-Text Data

Apr 11, 2023
Ahmet Iscen, Alireza Fathi, Cordelia Schmid

Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval

Apr 06, 2023
Jae Myung Kim, A. Sophia Koepke, Cordelia Schmid, Zeynep Akata

Bridging the Gap between Model Explanations in Partially Annotated Multi-label Classification

Apr 04, 2023
Youngwook Kim, Jae Myung Kim, Jieun Jeong, Cordelia Schmid, Zeynep Akata, Jungwoo Lee

AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR

Mar 29, 2023
Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning

Mar 21, 2023
Antoine Yang, Arsha Nagrani, Paul Hongsuck Seo, Antoine Miech, Jordi Pont-Tuset, Ivan Laptev, Josef Sivic, Cordelia Schmid

Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation

Dec 20, 2022
Matthieu Futeral, Cordelia Schmid, Ivan Laptev, Benoît Sagot, Rachel Bawden

REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory

Dec 10, 2022
Ziniu Hu, Ahmet Iscen, Chen Sun, Zirui Wang, Kai-Wei Chang, Yizhou Sun, Cordelia Schmid, David A. Ross, Alireza Fathi

Audiovisual Masked Autoencoders

Dec 09, 2022
Mariana-Iuliana Georgescu, Eduardo Fonseca, Radu Tudor Ionescu, Mario Lucic, Cordelia Schmid, Anurag Arnab

Location-Aware Self-Supervised Transformers

Dec 05, 2022
Mathilde Caron, Neil Houlsby, Cordelia Schmid
