Picture for Josip Djolonga

Josip Djolonga

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Add code
Mar 08, 2024
Viaarxiv icon

Gemini: A Family of Highly Capable Multimodal Models

Add code
Dec 19, 2023
Viaarxiv icon

Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution

Add code
Jul 12, 2023
Figure 1 for Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
Figure 2 for Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
Figure 3 for Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
Figure 4 for Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
Viaarxiv icon

PaLI-X: On Scaling up a Multilingual Vision and Language Model

Add code
May 29, 2023
Figure 1 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Figure 2 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Figure 3 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Figure 4 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Viaarxiv icon

End-to-End Spatio-Temporal Action Localisation with Video Transformers

Add code
Apr 24, 2023
Figure 1 for End-to-End Spatio-Temporal Action Localisation with Video Transformers
Figure 2 for End-to-End Spatio-Temporal Action Localisation with Video Transformers
Figure 3 for End-to-End Spatio-Temporal Action Localisation with Video Transformers
Figure 4 for End-to-End Spatio-Temporal Action Localisation with Video Transformers
Viaarxiv icon

Scaling Vision Transformers to 22 Billion Parameters

Add code
Feb 10, 2023
Figure 1 for Scaling Vision Transformers to 22 Billion Parameters
Figure 2 for Scaling Vision Transformers to 22 Billion Parameters
Figure 3 for Scaling Vision Transformers to 22 Billion Parameters
Figure 4 for Scaling Vision Transformers to 22 Billion Parameters
Viaarxiv icon

Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective

Add code
Feb 06, 2023
Figure 1 for Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective
Figure 2 for Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective
Figure 3 for Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective
Figure 4 for Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective
Viaarxiv icon

Beyond Transfer Learning: Co-finetuning for Action Localisation

Add code
Jul 08, 2022
Figure 1 for Beyond Transfer Learning: Co-finetuning for Action Localisation
Figure 2 for Beyond Transfer Learning: Co-finetuning for Action Localisation
Figure 3 for Beyond Transfer Learning: Co-finetuning for Action Localisation
Figure 4 for Beyond Transfer Learning: Co-finetuning for Action Localisation
Viaarxiv icon

Revisiting the Calibration of Modern Neural Networks

Add code
Jun 15, 2021
Figure 1 for Revisiting the Calibration of Modern Neural Networks
Figure 2 for Revisiting the Calibration of Modern Neural Networks
Figure 3 for Revisiting the Calibration of Modern Neural Networks
Figure 4 for Revisiting the Calibration of Modern Neural Networks
Viaarxiv icon

Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning

Add code
Jun 07, 2021
Figure 1 for Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning
Figure 2 for Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning
Figure 3 for Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning
Figure 4 for Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning
Viaarxiv icon