Erhan Bas

Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding

Jan 09, 2024
Yatong Bai, Utsav Garg, Apaar Shanker, Haoming Zhang, Samyak Parajuli, Erhan Bas, Isidora Filipovic, Amelia N. Chu, Eugenia D Fomitcheva, Elliot Branson, Aerin Kim, Somayeh Sojoudi, Kyunghyun Cho

On the Performance of Multimodal Language Models

Oct 04, 2023
Utsav Garg, Erhan Bas

Detecting and Preventing Hallucinations in Large Vision Language Models

Aug 18, 2023
Anisha Gunjal, Jihan Yin, Erhan Bas

Masked Vision and Language Modeling for Multi-modal Representation Learning

Aug 03, 2022
Gukyeong Kwon, Zhaowei Cai, Avinash Ravichandran, Erhan Bas, Rahul Bhotika, Stefano Soatto

X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks

Apr 12, 2022
Zhaowei Cai, Gukyeong Kwon, Avinash Ravichandran, Erhan Bas, Zhuowen Tu, Rahul Bhotika, Stefano Soatto
