"Text": models, code, and papers

Deformable Kernel Expansion Model for Efficient Arbitrary-shaped Scene Text Detection

Mar 28, 2023
Tao He, Sheng Huang, Wenhao Tang, Bo Liu

Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory

May 03, 2023
Xin Cheng, Di Luo, Xiuying Chen, Lemao Liu, Dongyan Zhao, Rui Yan

Variational Distribution Learning for Unsupervised Text-to-Image Generation

Mar 28, 2023
Minsoo Kang, Doyup Lee, Jiseob Kim, Saehoon Kim, Bohyung Han

Med-Flamingo: a Multimodal Medical Few-shot Learner

Jul 27, 2023
Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, Cyril Zakka, Yash Dalmia, Eduardo Pontes Reis, Pranav Rajpurkar, Jure Leskovec

Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text

Apr 14, 2023
Wanrong Zhu, Jack Hessel, Anas Awadalla, Samir Yitzhak Gadre, Jesse Dodge, Alex Fang, Youngjae Yu, Ludwig Schmidt, William Yang Wang, Yejin Choi

The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech

Jun 01, 2023
Phat Do, Matt Coler, Jelske Dijkstra, Esther Klabbers

Learning when to observe: A frugal reinforcement learning framework for a high-cost world

Jul 24, 2023
Colin Bellinger, Mark Crowley, Isaac Tamblyn

PRIOR: Prototype Representation Joint Learning from Medical Images and Reports

Jul 24, 2023
Pujin Cheng, Li Lin, Junyan Lyu, Yijin Huang, Wenhan Luo, Xiaoying Tang

Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework

Jul 24, 2023
Jingxuan Wei, Cheng Tan, Zhangyang Gao, Linzhuang Sun, Siyuan Li, Bihui Yu, Ruifeng Guo, Stan Z. Li

Performance of Large Language Models in a Computer Science Degree Program

Jul 24, 2023
Tim Krüger, Michael Gref
