"Text": models, code, and papers

MuLTI: Efficient Video-and-Language Understanding with MultiWay-Sampler and Multiple Choice Modeling

Mar 10, 2023
Jiaqi Xu, Bo Liu, Yunkuo Chen, Mengli Cheng, Xing Shi

CCLAP: Controllable Chinese Landscape Painting Generation via Latent Diffusion Model

Apr 09, 2023
Zhongqi Wang, Jie Zhang, Zhilong Ji, Jinfeng Bai, Shiguang Shan

GPT-4 Technical Report

Mar 15, 2023
OpenAI

RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training

Mar 01, 2023
Zheng Yuan, Qiao Jin, Chuanqi Tan, Zhengyun Zhao, Hongyi Yuan, Fei Huang, Songfang Huang

C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval

Oct 07, 2022
Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogerio Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James Glass

Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations

Dec 17, 2022
Jifan Chen, Yuhao Zhang, Lan Liu, Rui Dong, Xinchi Chen, Patrick Ng, William Yang Wang, Zhiheng Huang

Multimodal Pre-training Framework for Sequential Recommendation via Contrastive Learning

Mar 21, 2023
Lingzi Zhang, Xin Zhou, Zhiqi Shen

Bridge the Gap between Language models and Tabular Understanding

Feb 16, 2023
Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Chenyu You, Jianhui Chang, Daxin Jiang, Jia Li

SegGPT: Segmenting Everything In Context

Apr 06, 2023
Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang

Semantic-visual Guided Transformer for Few-shot Class-incremental Learning

Mar 27, 2023
Wenhao Qiu, Sichao Fu, Jingyi Zhang, Chengxiang Lei, Qinmu Peng
