Hao Tan

How Much Can CLIP Benefit Vision-and-Language Tasks?


Jul 13, 2021
Sheng Shen, Liunian Harold Li, Hao Tan, Mohit Bansal, Anna Rohrbach, Kai-Wei Chang, Zhewei Yao, Kurt Keutzer

* 14 pages 


VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer


Jul 06, 2021
Zineng Tang, Jaemin Cho, Hao Tan, Mohit Bansal

* 18 pages (5 figures, 10 tables) 


VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning


Jun 21, 2021
Hao Tan, Jie Lei, Thomas Wolf, Mohit Bansal

* Under review, 23 pages


Improving Cross-Modal Alignment in Vision Language Navigation via Syntactic Information


Apr 19, 2021
Jialu Li, Hao Tan, Mohit Bansal

* NAACL 2021 (10 pages) 


Unifying Vision-and-Language Tasks via Text Generation


Feb 04, 2021
Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal

* 16 pages, 4 figures, 13 tables 


ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments


Nov 15, 2020
Hyounghun Kim, Abhay Zala, Graham Burri, Hao Tan, Mohit Bansal

* EMNLP Findings 2020 (18 pages; extended to Hindi) 


Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision


Oct 14, 2020
Hao Tan, Mohit Bansal

* EMNLP 2020 (15 pages) 


MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding


Oct 12, 2020
Qinxin Wang, Hao Tan, Sheng Shen, Michael W. Mahoney, Zhewei Yao



RelativeNAS: Relative Neural Architecture Search via Slow-Fast Learning


Sep 15, 2020
Hao Tan, Ran Cheng, Shihua Huang, Cheng He, Changxiao Qiu, Fan Yang, Ping Luo



Diagnosing the Environment Bias in Vision-and-Language Navigation


May 06, 2020
Yubo Zhang, Hao Tan, Mohit Bansal

* IJCAI 2020 (9 pages; first two authors contributed equally) 


The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions


Apr 28, 2020
Xiang Zhou, Yixin Nie, Hao Tan, Mohit Bansal

* 13 pages 


Modality-Balanced Models for Visual Dialogue


Jan 17, 2020
Hyounghun Kim, Hao Tan, Mohit Bansal

* AAAI 2020 (11 pages) 


LXMERT: Learning Cross-Modality Encoder Representations from Transformers


Aug 22, 2019
Hao Tan, Mohit Bansal

* EMNLP 2019 (12 pages) 


Expressing Visual Relationships via Language


Jun 19, 2019
Hao Tan, Franck Dernoncourt, Zhe Lin, Trung Bui, Mohit Bansal

* ACL 2019 (11 pages) 


Enabling Robots to Understand Incomplete Natural Language Instructions Using Commonsense Reasoning


Apr 29, 2019
Haonan Chen, Hao Tan, Alan Kuntz, Mohit Bansal, Ron Alterovitz

* 14 pages, 9 figures 


Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout


Apr 08, 2019
Hao Tan, Licheng Yu, Mohit Bansal

* NAACL 2019 (12 pages) 


Object Ordering with Bidirectional Matchings for Visual Reasoning


Sep 06, 2018
Hao Tan, Mohit Bansal

* NAACL 2018 (8 pages; added pointer-ordering examples) 


Source-Target Inference Models for Spatial Instruction Understanding


Nov 21, 2017
Hao Tan, Mohit Bansal

* Accepted to AAAI 2018 (8 pages) 


A Joint Speaker-Listener-Reinforcer Model for Referring Expressions


Apr 17, 2017
Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg

* Some typos fixed; comprehension results on RefCOCOg updated; more human evaluation results added
