Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer


Feb 14, 2022
Yair Kittenplon, Inbal Lavi, Sharon Fogel, Yarin Bar, R. Manmatha, Pietro Perona



LaTr: Layout-Aware Transformer for Scene-Text VQA


Dec 24, 2021
Ali Furkan Biten, Ron Litman, Yusheng Xie, Srikar Appalaraju, R. Manmatha



DocFormer: End-to-End Transformer for Document Understanding


Jun 22, 2021
Srikar Appalaraju, Bhavan Jasani, Bhargava Urala Kota, Yusheng Xie, R. Manmatha



On Calibration of Scene-Text Recognition Models


Dec 23, 2020
Ron Slossberg, Oron Anschel, Amir Markovitz, Ron Litman, Aviad Aberdam, Shahar Tsiper, Shai Mazor, Jon Wu, R. Manmatha



Sequence-to-Sequence Contrastive Learning for Text Recognition


Dec 20, 2020
Aviad Aberdam, Ron Litman, Shahar Tsiper, Oron Anschel, Ron Slossberg, Shai Mazor, R. Manmatha, Pietro Perona



A Comprehensive Study of Deep Video Action Recognition


Dec 11, 2020
Yi Zhu, Xinyu Li, Chunhui Liu, Mohammadreza Zolfaghari, Yuanjun Xiong, Chongruo Wu, Zhi Zhang, Joseph Tighe, R. Manmatha, Mu Li

* Technical report. Code and model zoo can be found at https://cv.gluon.ai/model_zoo/action_recognition.html 


Document Visual Question Answering Challenge 2020


Aug 20, 2020
Minesh Mathew, Ruben Tito, Dimosthenis Karatzas, R. Manmatha, C. V. Jawahar

* To be published as a short paper in DAS 2020


DocVQA: A Dataset for VQA on Document Images


Jul 01, 2020
Minesh Mathew, Dimosthenis Karatzas, R. Manmatha, C. V. Jawahar



Improving Semantic Segmentation via Self-Training


May 06, 2020
Yi Zhu, Zhongyue Zhang, Chongruo Wu, Zhi Zhang, Tong He, Hang Zhang, R. Manmatha, Mu Li, Alexander Smola

