Vighnesh Birodkar

VideoPoet: A Large Language Model for Zero-Shot Video Generation

Dec 21, 2023
Dan Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Rachel Hornung, Hartwig Adam, Hassan Akbari, Yair Alon, Vighnesh Birodkar, Yong Cheng, Ming-Chang Chiu, Josh Dillon, Irfan Essa, Agrim Gupta, Meera Hahn, Anja Hauth, David Hendon, Alonso Martinez, David Minnen, David Ross, Grant Schindler, Mikhail Sirotenko, Kihyuk Sohn, Krishna Somandepalli, Huisheng Wang, Jimmy Yan, Ming-Hsuan Yang, Xuan Yang, Bryan Seybold, Lu Jiang

Text and Click inputs for unambiguous open vocabulary instance segmentation

Nov 24, 2023
Nikolai Warner, Meera Hahn, Jonathan Huang, Irfan Essa, Vighnesh Birodkar

Scaling Vision Transformers to 22 Billion Parameters

Feb 10, 2023
Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby

Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features

Dec 20, 2022
Vivek Rathod, Bryan Seybold, Sudheendra Vijayanarasimhan, Austin Myers, Xiuye Gu, Vighnesh Birodkar, David A. Ross

Proper Reuse of Image Classification Features Improves Object Detection

Apr 01, 2022
Cristina Vasconcelos, Vighnesh Birodkar, Vincent Dumoulin

Less is More: Generating Grounded Navigation Instructions from Landmarks

Nov 29, 2021
Su Wang, Ceslee Montgomery, Jordi Orbay, Vighnesh Birodkar, Aleksandra Faust, Izzeddin Gur, Natasha Jaques, Austin Waters, Jason Baldridge, Peter Anderson

The iWildCam 2021 Competition Dataset

May 07, 2021
Sara Beery, Arushi Agarwal, Elijah Cole, Vighnesh Birodkar

The surprising impact of mask-head architecture on novel class segmentation

Apr 01, 2021
Vighnesh Birodkar, Zhichao Lu, Siyang Li, Vivek Rathod, Jonathan Huang

A Closed-Form Learned Pooling for Deep Classification Networks

Jun 10, 2019
Vighnesh Birodkar, Hossein Mobahi, Dilip Krishnan, Samy Bengio
