Anelia Angelova

3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation

Jan 04, 2024
Zihao Xiao, Longlong Jing, Shangxuan Wu, Alex Zihao Zhu, Jingwei Ji, Chiyu Max Jiang, Wei-Chih Hung, Thomas Funkhouser, Weicheng Kuo, Anelia Angelova, Yin Zhou, Shiwei Sheng

Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities

Nov 13, 2023
AJ Piergiovanni, Isaac Noble, Dahun Kim, Michael S. Ryoo, Victor Gomes, Anelia Angelova

Detection-Oriented Image-Text Pretraining for Open-Vocabulary Detection

Sep 29, 2023
Dahun Kim, Anelia Angelova, Weicheng Kuo

Contrastive Feature Masking Open-Vocabulary Vision Transformer

Sep 02, 2023
Dahun Kim, Anelia Angelova, Weicheng Kuo

Diversifying Joint Vision-Language Tokenization Learning

Jun 15, 2023
Vardaan Pahuja, AJ Piergiovanni, Anelia Angelova

Joint Adaptive Representations for Image-Language Learning

Jun 01, 2023
AJ Piergiovanni, Anelia Angelova

PaLI-X: On Scaling up a Multilingual Vision and Language Model

May 29, 2023
Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, AJ Piergiovanni, Matthias Minderer, Filip Pavetic, Austin Waters, Gang Li, Ibrahim Alabdulmohsin, Lucas Beyer, Julien Amelot, Kenton Lee, Andreas Peter Steiner, Yang Li, Daniel Keysers, Anurag Arnab, Yuanzhong Xu, Keran Rong, Alexander Kolesnikov, Mojtaba Seyedhosseini, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut

Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers

May 11, 2023
Dahun Kim, Anelia Angelova, Weicheng Kuo
