Qi Qian

Searching for Best Practices in Retrieval-Augmented Generation

Jul 01, 2024
Via arXiv

Efficient Personalized Text-to-image Generation by Leveraging Textual Subspace

Jun 30, 2024
Via arXiv

Multi-Modal Proxy Learning Towards Personalized Visual Multiple Clustering

Apr 24, 2024
Via arXiv

mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model

Nov 30, 2023
Via arXiv

Stable Cluster Discrimination for Deep Clustering

Nov 24, 2023
Via arXiv

mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration

Nov 09, 2023
Via arXiv

Intra-Modal Proxy Learning for Zero-Shot Visual Categorization with CLIP

Oct 30, 2023
Via arXiv

UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model

Oct 08, 2023
Via arXiv

Graph Convolution Based Efficient Re-Ranking for Visual Retrieval

Jun 15, 2023
Via arXiv

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks

Jun 07, 2023
Via arXiv