
Jean-Baptiste Alayrac


Perceiver IO: A General Architecture for Structured Inputs & Outputs (Aug 02, 2021)

Generative Art Using Neural Visual Grammars and Dual Encoders (May 04, 2021)

Multimodal Self-Supervised Learning of General Audio Representations (Apr 28, 2021)

Machine Translation Decoding beyond Beam Search (Apr 12, 2021)

Broaden Your Views for Self-Supervised Video Learning (Mar 30, 2021)

Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers (Mar 30, 2021)

Efficient Visual Pretraining with Contrastive Detection (Mar 19, 2021)

Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers (Jan 31, 2021)

RareAct: A video dataset of unusual interactions (Aug 03, 2020)

Self-Supervised MultiModal Versatile Networks (Jun 29, 2020)