Anelia Angelova

PaLI-X: On Scaling up a Multilingual Vision and Language Model

May 29, 2023
Via arXiv

Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers

May 11, 2023
Via arXiv

MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks

Mar 30, 2023
Via arXiv

Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning

Dec 06, 2022
Via arXiv

F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models

Sep 30, 2022
Via arXiv

PaLI: A Jointly-Scaled Multilingual Language-Image Model

Sep 16, 2022
Via arXiv

Pre-training image-language transformers for open-vocabulary tasks

Sep 09, 2022
Via arXiv

Video Question Answering with Iterative Video-Text Co-Tokenization

Aug 01, 2022
Via arXiv

Mechanical Search on Shelves with Efficient Stacking and Destacking of Objects

Jul 05, 2022
Via arXiv

Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering

May 02, 2022
Via arXiv