Mannat Singh

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning

Nov 17, 2023
Rohit Girdhar, Mannat Singh, Andrew Brown, Quentin Duval, Samaneh Azadi, Sai Saketh Rambhatla, Akbar Shah, Xi Yin, Devi Parikh, Ishan Misra

ImageBind: One Embedding Space To Bind Them All

May 09, 2023
Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra

The effectiveness of MAE pre-pretraining for billion-scale pretraining

Mar 23, 2023
Mannat Singh, Quentin Duval, Kalyan Vasudev Alwala, Haoqi Fan, Vaibhav Aggarwal, Aaron Adcock, Armand Joulin, Piotr Dollár, Christoph Feichtenhofer, Ross Girshick, Rohit Girdhar, Ishan Misra

OmniMAE: Single Model Masked Pretraining on Images and Videos

Jun 16, 2022
Rohit Girdhar, Alaaeldin El-Nouby, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra

Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision

Feb 16, 2022
Priya Goyal, Quentin Duval, Isaac Seessel, Mathilde Caron, Mannat Singh, Ishan Misra, Levent Sagun, Armand Joulin, Piotr Bojanowski

Omnivore: A Single Model for Many Visual Modalities

Jan 20, 2022
Rohit Girdhar, Mannat Singh, Nikhila Ravi, Laurens van der Maaten, Armand Joulin, Ishan Misra

Revisiting Weakly Supervised Pre-Training of Visual Perception Models

Jan 20, 2022
Mannat Singh, Laura Gustafson, Aaron Adcock, Vinicius de Freitas Reis, Bugra Gedik, Raj Prateek Kosaraju, Dhruv Mahajan, Ross Girshick, Piotr Dollár, Laurens van der Maaten

Early Convolutions Help Transformers See Better

Jul 12, 2021
Tete Xiao, Mannat Singh, Eric Mintun, Trevor Darrell, Piotr Dollár, Ross Girshick

MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding

Apr 26, 2021
Aishwarya Kamath, Mannat Singh, Yann LeCun, Ishan Misra, Gabriel Synnaeve, Nicolas Carion
