Srikar Appalaraju

Enhancing Vision-Language Pre-training with Rich Supervisions

Mar 05, 2024
Yuan Gao, Kunyu Shi, Pengkai Zhu, Edouard Belval, Oren Nuriel, Srikar Appalaraju, Shabnam Ghadar, Vijay Mahadevan, Zhuowen Tu, Stefano Soatto

DEED: Dynamic Early Exit on Decoder for Accelerating Encoder-Decoder Transformer Models

Nov 15, 2023
Peng Tang, Pengkai Zhu, Tian Li, Srikar Appalaraju, Vijay Mahadevan, R. Manmatha

Multiple-Question Multiple-Answer Text-VQA

Nov 15, 2023
Peng Tang, Srikar Appalaraju, R. Manmatha, Yusheng Xie, Vijay Mahadevan

A Multi-Modal Multilingual Benchmark for Document Image Classification

Oct 25, 2023
Yoshinari Fujinuma, Siddharth Varia, Nishant Sankaran, Srikar Appalaraju, Bonan Min, Yogarshi Vyas

DocFormerv2: Local Features for Document Understanding

Jun 02, 2023
Srikar Appalaraju, Peng Tang, Qi Dong, Nishant Sankaran, Yichu Zhou, R. Manmatha

SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation

Feb 07, 2023
Yash Patel, Yusheng Xie, Yi Zhu, Srikar Appalaraju, R. Manmatha

YORO -- Lightweight End to End Visual Grounding

Nov 15, 2022
Chih-Hui Ho, Srikar Appalaraju, Bhavan Jasani, R. Manmatha, Nuno Vasconcelos

MixGen: A New Multi-Modal Data Augmentation

Jun 16, 2022
Xiaoshuai Hao, Yi Zhu, Srikar Appalaraju, Aston Zhang, Wanqian Zhang, Bo Li, Mu Li

Towards Differential Relational Privacy and its use in Question Answering

Mar 30, 2022
Simone Bombari, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, Stefano Soatto

LaTr: Layout-Aware Transformer for Scene-Text VQA

Dec 24, 2021
Ali Furkan Biten, Ron Litman, Yusheng Xie, Srikar Appalaraju, R. Manmatha
