Jean-Baptiste Alayrac

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

Making the Most of What You Have: Adapting Pre-trained Visual Language Models in the Low-data Regime

May 03, 2023

Three ways to improve feature alignment for open vocabulary detection

Mar 23, 2023

Multi-Task Learning of Object State Changes from Uncurated Videos

Nov 24, 2022

Flamingo: a Visual Language Model for Few-Shot Learning

Apr 29, 2022

Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos

Mar 22, 2022

General-purpose, long-context autoregressive modeling with Perceiver AR

Feb 15, 2022

Towards Learning Universal Audio Representations

Dec 01, 2021

Perceiver IO: A General Architecture for Structured Inputs & Outputs

Aug 02, 2021

Generative Art Using Neural Visual Grammars and Dual Encoders

May 04, 2021