Anelia Angelova

MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks

Mar 30, 2023
Weicheng Kuo, AJ Piergiovanni, Dahun Kim, Xiyang Luo, Ben Caine, Wei Li, Abhijit Ogale, Luowei Zhou, Andrew Dai, Zhifeng Chen, Claire Cui, Anelia Angelova

Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning

Dec 06, 2022
AJ Piergiovanni, Weicheng Kuo, Anelia Angelova

F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models

Sep 30, 2022
Weicheng Kuo, Yin Cui, Xiuye Gu, AJ Piergiovanni, Anelia Angelova

PaLI: A Jointly-Scaled Multilingual Language-Image Model

Sep 16, 2022
Xi Chen, Xiao Wang, Soravit Changpinyo, AJ Piergiovanni, Piotr Padlewski, Daniel Salz, Sebastian Goodman, Adam Grycner, Basil Mustafa, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Nan Ding, Keran Rong, Hassan Akbari, Gaurav Mishra, Linting Xue, Ashish Thapliyal, James Bradbury, Weicheng Kuo, Mojtaba Seyedhosseini, Chao Jia, Burcu Karagol Ayan, Carlos Riquelme, Andreas Steiner, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut

Pre-training image-language transformers for open-vocabulary tasks

Sep 09, 2022
AJ Piergiovanni, Weicheng Kuo, Anelia Angelova

Video Question Answering with Iterative Video-Text Co-Tokenization

Aug 01, 2022
AJ Piergiovanni, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, Anelia Angelova

Mechanical Search on Shelves with Efficient Stacking and Destacking of Objects

Jul 05, 2022
Huang Huang, Letian Fu, Michael Danielczuk, Chung Min Kim, Zachary Tam, Jeffrey Ichnowski, Anelia Angelova, Brian Ichter, Ken Goldberg

Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering

May 02, 2022
AJ Piergiovanni, Wei Li, Weicheng Kuo, Mohammad Saffar, Fred Bertsch, Anelia Angelova

FindIt: Generalized Localization with Natural Language Queries

Mar 31, 2022
Weicheng Kuo, Fred Bertsch, Wei Li, AJ Piergiovanni, Mohammad Saffar, Anelia Angelova

Mechanical Search on Shelves using a Novel "Bluction" Tool

Jan 22, 2022
Huang Huang, Michael Danielczuk, Chung Min Kim, Letian Fu, Zachary Tam, Jeffrey Ichnowski, Anelia Angelova, Brian Ichter, Ken Goldberg
