"Text": models, code, and papers

MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering

Mar 17, 2022
Yang Ding, Jing Yu, Bang Liu, Yue Hu, Mingxin Cui, Qi Wu

Via arXiv

Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning

Mar 03, 2022
Weixin Liang, Yuhui Zhang, Yongchan Kwon, Serena Yeung, James Zou

Via arXiv

AI Annotated Recommendations in an Efficient Visual Learning Environment with Emphasis on YouTube (AI-EVL)

Mar 10, 2022
Faeze Gholamrezaie, Melika Bahman-Abadi, M. B. Ghaznavi-Ghoushchi

Via arXiv

Vision-Language Intelligence: Tasks, Representation Learning, and Large Models

Mar 03, 2022
Feng Li, Hao Zhang, Yi-Fan Zhang, Shilong Liu, Jian Guo, Lionel M. Ni, PengChuan Zhang, Lei Zhang

Via arXiv

Printed Texts Tracking and Following for a Finger-Wearable Electro-Braille System Through Opto-electrotactile Feedback

Aug 06, 2021
Mehdi Rahimi, Yantao Shen, Zhiming Liu, Fang Jiang

Via arXiv

Bertrand-DR: Improving Text-to-SQL using a Discriminative Re-ranker

Feb 03, 2020
Amol Kelkar, Rohan Relan, Vaishali Bhardwaj, Saurabh Vaichal, Peter Relan

Via arXiv

Variational Template Machine for Data-to-Text Generation

Feb 13, 2020
Rong Ye, Wenxian Shi, Hao Zhou, Zhongyu Wei, Lei Li

Via arXiv

Adaptive Offline Quintuplet Loss for Image-Text Matching

Mar 07, 2020
Tianlang Chen, Jiajun Deng, Jiebo Luo

Via arXiv

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Feb 08, 2021
Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Jinzhu Li, Sheng Zhao, Enhong Chen, Tie-Yan Liu

Via arXiv

Multimodal grid features and cell pointers for Scene Text Visual Question Answering

Jun 01, 2020
Lluís Gómez, Ali Furkan Biten, Rubèn Tito, Andrés Mafla, Dimosthenis Karatzas

Via arXiv