Jingyou Xie

FLEX-CLIP: Feature-Level GEneration Network Enhanced CLIP for X-shot Cross-modal Retrieval

Nov 26, 2024
Via arXiv

Natural Language Understanding and Inference with MLLM in Visual Question Answering: A Survey

Nov 26, 2024
Via arXiv