Lucas Beyer

PaLI-3 Vision Language Models: Smaller, Faster, Stronger

Oct 17, 2023
Xi Chen, Xiao Wang, Lucas Beyer, Alexander Kolesnikov, Jialin Wu, Paul Voigtlaender, Basil Mustafa, Sebastian Goodman, Ibrahim Alabdulmohsin, Piotr Padlewski, Daniel Salz, Xi Xiong, Daniel Vlasic, Filip Pavetic, Keran Rong, Tianli Yu, Daniel Keysers, Xiaohua Zhai, Radu Soricut

Image Captioners Are Scalable Vision Learners Too

Jun 13, 2023
Michael Tschannen, Manoj Kumar, Andreas Steiner, Xiaohua Zhai, Neil Houlsby, Lucas Beyer

PaLI-X: On Scaling up a Multilingual Vision and Language Model

May 29, 2023
Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, AJ Piergiovanni, Matthias Minderer, Filip Pavetic, Austin Waters, Gang Li, Ibrahim Alabdulmohsin, Lucas Beyer, Julien Amelot, Kenton Lee, Andreas Peter Steiner, Yang Li, Daniel Keysers, Anurag Arnab, Yuanzhong Xu, Keran Rong, Alexander Kolesnikov, Mojtaba Seyedhosseini, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut

Three Towers: Flexible Contrastive Learning with Pretrained Image Models

May 29, 2023
Jannik Kossen, Mark Collier, Basil Mustafa, Xiao Wang, Xiaohua Zhai, Lucas Beyer, Andreas Steiner, Jesse Berent, Rodolphe Jenatton, Efi Kokiopoulou

Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

May 22, 2023
Ibrahim Alabdulmohsin, Xiaohua Zhai, Alexander Kolesnikov, Lucas Beyer

Sigmoid Loss for Language Image Pre-Training

Mar 30, 2023
Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, Lucas Beyer

A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

Mar 30, 2023
Lucas Beyer, Bo Wan, Gagan Madan, Filip Pavetic, Andreas Steiner, Alexander Kolesnikov, André Susano Pinto, Emanuele Bugliarello, Xiao Wang, Qihang Yu, Liang-Chieh Chen, Xiaohua Zhai
