Mathilde Caron

A Generative Approach for Wikipedia-Scale Visual Entity Recognition

Mar 04, 2024
Via arXiv

Guided Diffusion from Self-Supervised Diffusion Features

Dec 14, 2023
Via arXiv

Weakly-Supervised Surgical Phase Recognition

Oct 26, 2023
Via arXiv

Self-Supervised Learning for Endoscopic Video Analysis

Aug 23, 2023
Via arXiv

Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution

Jul 12, 2023
Via arXiv

Retrieval-Enhanced Contrastive Vision-Text Models

Jun 12, 2023
Via arXiv

Verbs in Action: Improving verb understanding in video-language models

Apr 13, 2023
Via arXiv

Scaling Vision Transformers to 22 Billion Parameters

Feb 10, 2023
Via arXiv

FlexiViT: One Model for All Patch Sizes

Dec 15, 2022
Via arXiv

Location-Aware Self-Supervised Transformers

Dec 05, 2022
Via arXiv