Alexey Gritsenko

Time-, Memory- and Parameter-Efficient Visual Adaptation

Feb 05, 2024
Otniel-Bogdan Mercea, Alexey Gritsenko, Cordelia Schmid, Anurag Arnab

Video OWL-ViT: Temporally-consistent open-world localization in video

Aug 22, 2023
Georg Heigold, Matthias Minderer, Alexey Gritsenko, Alex Bewley, Daniel Keysers, Mario Lučić, Fisher Yu, Thomas Kipf

Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution

Jul 12, 2023
Mostafa Dehghani, Basil Mustafa, Josip Djolonga, Jonathan Heek, Matthias Minderer, Mathilde Caron, Andreas Steiner, Joan Puigcerver, Robert Geirhos, Ibrahim Alabdulmohsin, Avital Oliver, Piotr Padlewski, Alexey Gritsenko, Mario Lučić, Neil Houlsby

Scaling Open-Vocabulary Object Detection

Jun 16, 2023
Matthias Minderer, Alexey Gritsenko, Neil Houlsby

End-to-End Spatio-Temporal Action Localisation with Video Transformers

Apr 24, 2023
Alexey Gritsenko, Xuehan Xiong, Josip Djolonga, Mostafa Dehghani, Chen Sun, Mario Lučić, Cordelia Schmid, Anurag Arnab

Scaling Vision Transformers to 22 Billion Parameters

Feb 10, 2023
Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby

Imagen Video: High Definition Video Generation with Diffusion Models

Oct 05, 2022
Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, Tim Salimans

Beyond Transfer Learning: Co-finetuning for Action Localisation

Jul 08, 2022
Anurag Arnab, Xuehan Xiong, Alexey Gritsenko, Rob Romijnders, Josip Djolonga, Mostafa Dehghani, Chen Sun, Mario Lučić, Cordelia Schmid

Simple Open-Vocabulary Object Detection with Vision Transformers

May 12, 2022
Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, Neil Houlsby

Video Diffusion Models

Apr 07, 2022
Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, David J. Fleet
