Mathilde Caron

A Generative Approach for Wikipedia-Scale Visual Entity Recognition

Mar 04, 2024
Mathilde Caron, Ahmet Iscen, Alireza Fathi, Cordelia Schmid

Guided Diffusion from Self-Supervised Diffusion Features

Dec 14, 2023
Vincent Tao Hu, Yunlu Chen, Mathilde Caron, Yuki M. Asano, Cees G. M. Snoek, Bjorn Ommer

Weakly-Supervised Surgical Phase Recognition

Oct 26, 2023
Roy Hirsch, Regev Cohen, Mathilde Caron, Tomer Golany, Daniel Freedman, Ehud Rivlin

Self-Supervised Learning for Endoscopic Video Analysis

Aug 23, 2023
Roy Hirsch, Mathilde Caron, Regev Cohen, Amir Livne, Ron Shapiro, Tomer Golany, Roman Goldenberg, Daniel Freedman, Ehud Rivlin

Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution

Jul 12, 2023
Mostafa Dehghani, Basil Mustafa, Josip Djolonga, Jonathan Heek, Matthias Minderer, Mathilde Caron, Andreas Steiner, Joan Puigcerver, Robert Geirhos, Ibrahim Alabdulmohsin, Avital Oliver, Piotr Padlewski, Alexey Gritsenko, Mario Lučić, Neil Houlsby

Retrieval-Enhanced Contrastive Vision-Text Models

Jun 12, 2023
Ahmet Iscen, Mathilde Caron, Alireza Fathi, Cordelia Schmid

Verbs in Action: Improving verb understanding in video-language models

Apr 13, 2023
Liliane Momeni, Mathilde Caron, Arsha Nagrani, Andrew Zisserman, Cordelia Schmid

Scaling Vision Transformers to 22 Billion Parameters

Feb 10, 2023
Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby

FlexiViT: One Model for All Patch Sizes

Dec 15, 2022
Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, Filip Pavetic

Location-Aware Self-Supervised Transformers

Dec 05, 2022
Mathilde Caron, Neil Houlsby, Cordelia Schmid
