Chaoya Jiang

HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models

Jun 04, 2025

VLM-R$^3$: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought

May 22, 2025

SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

Nov 17, 2024

MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model

Aug 26, 2024

MIBench: Evaluating Multimodal Large Language Models over Multiple Images

Jul 21, 2024

Enhancing In-Context Learning via Implicit Demonstration Augmentation

Jun 27, 2024

Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models

Feb 24, 2024

TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training

Dec 14, 2023

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

Dec 13, 2023

BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization

Jul 17, 2023