R. Manmatha

Mixed-Query Transformer: A Unified Image Segmentation Architecture

Apr 06, 2024
Pei Wang, Zhaowei Cai, Hao Yang, Ashwin Swaminathan, R. Manmatha, Stefano Soatto

Via arXiv

On the Scalability of Diffusion-based Text-to-Image Generation

Apr 03, 2024
Hao Li, Yang Zou, Ying Wang, Orchid Majumder, Yusheng Xie, R. Manmatha, Ashwin Swaminathan, Zhuowen Tu, Stefano Ermon, Stefano Soatto

Via arXiv

DEED: Dynamic Early Exit on Decoder for Accelerating Encoder-Decoder Transformer Models

Nov 15, 2023
Peng Tang, Pengkai Zhu, Tian Li, Srikar Appalaraju, Vijay Mahadevan, R. Manmatha

Via arXiv

Multiple-Question Multiple-Answer Text-VQA

Nov 15, 2023
Peng Tang, Srikar Appalaraju, R. Manmatha, Yusheng Xie, Vijay Mahadevan

Via arXiv

DocTr: Document Transformer for Structured Information Extraction in Documents

Jul 16, 2023
Haofu Liao, Aruni RoyChowdhury, Weijian Li, Ankan Bansal, Yuting Zhang, Zhuowen Tu, Ravi Kumar Satzoda, R. Manmatha, Vijay Mahadevan

Via arXiv

DocFormerv2: Local Features for Document Understanding

Jun 02, 2023
Srikar Appalaraju, Peng Tang, Qi Dong, Nishant Sankaran, Yichu Zhou, R. Manmatha

Via arXiv

PolyFormer: Referring Image Segmentation as Sequential Polygon Generation

Feb 14, 2023
Jiang Liu, Hui Ding, Zhaowei Cai, Yuting Zhang, Ravi Kumar Satzoda, Vijay Mahadevan, R. Manmatha

Via arXiv

SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation

Feb 07, 2023
Yash Patel, Yusheng Xie, Yi Zhu, Srikar Appalaraju, R. Manmatha

Via arXiv

YORO -- Lightweight End to End Visual Grounding

Nov 15, 2022
Chih-Hui Ho, Srikar Appalaraju, Bhavan Jasani, R. Manmatha, Nuno Vasconcelos

Via arXiv

GLASS: Global to Local Attention for Scene-Text Spotting

Aug 05, 2022
Roi Ronen, Shahar Tsiper, Oron Anschel, Inbal Lavi, Amir Markovitz, R. Manmatha

Via arXiv