
Sergi Caelles

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024
Via arXiv

Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers

Jan 03, 2024
Via arXiv

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

VCT: A Video Compression Transformer

Jun 15, 2022
Via arXiv

The 2019 DAVIS Challenge on VOS: Unsupervised Multi-Object Segmentation

May 02, 2019
Via arXiv

Fast video object segmentation with Spatio-Temporal GANs

Mar 28, 2019
Via arXiv

Iterative Deep Learning for Road Topology Extraction

Aug 28, 2018
Via arXiv

Semantically-Guided Video Object Segmentation

Jul 17, 2018
Via arXiv

Video Object Segmentation Without Temporal Information

May 16, 2018
Via arXiv

Deep Extreme Cut: From Extreme Points to Object Segmentation

Mar 27, 2018
Via arXiv