DiT: Self-supervised Pre-training for Document Image Transformer



Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei

* Work in Progress 


Improving Structured Text Recognition with Regular Expression Biasing



Baoguang Shi, Wenfeng Cheng, Yijuan Lu, Cha Zhang, Dinei Florencio



TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models



Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei

* Work in Progress 


LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding



Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei

* Work in progress 


LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding



Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou

* Work in progress 


TAP: Text-Aware Pre-training for Text-VQA and Text-Caption



Zhengyuan Yang, Yijuan Lu, Jianfeng Wang, Xi Yin, Dinei Florencio, Lijuan Wang, Cha Zhang, Lei Zhang, Jiebo Luo



Multimodal Active Speaker Detection and Virtual Cinematography for Video Conferencing



Ross Cutler, Ramin Mehran, Sam Johnson, Cha Zhang, Adam Kirk, Oliver Whyte, Adarsh Kowdle



Improving the Adversarial Robustness of Transfer Learning via Noisy Feature Distillation



Ting-Wu Chin, Cha Zhang, Diana Marculescu

* Preprint 

