Xing Sun

Multimodal Label Relevance Ranking via Reinforcement Learning

Jul 18, 2024

Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence

Jun 16, 2024

VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models

Jun 14, 2024

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

May 31, 2024

Cantor: Inspiring Multimodal Chain-of-Thought of MLLM

Apr 24, 2024

HRVDA: High-Resolution Visual Document Assistant

Apr 10, 2024

A General and Efficient Training for Transformer via Token Expansion

Mar 31, 2024

RESTORE: Towards Feature Shift for Vision-Language Prompt Learning

Mar 10, 2024

Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models

Feb 29, 2024

Sinkhorn Distance Minimization for Knowledge Distillation

Feb 27, 2024