Yaqian Li

LearnPruner: Rethinking Attention-based Token Pruning in Vision Language Models

Apr 27, 2026
Via arXiv

SMoES: Soft Modality-Guided Expert Specialization in MoE-VLMs

Apr 27, 2026
Via arXiv

QMoP: Query Guided Mixture-of-Projector for Efficient Visual Token Compression

Mar 22, 2026
Via arXiv

Step-CoT: Stepwise Visual Chain-of-Thought for Medical Visual Question Answering

Mar 14, 2026
Via arXiv

ITO: Images and Texts as One via Synergizing Multiple Alignment and Training-Time Fusion

Mar 04, 2026
Via arXiv

iGVLM: Dynamic Instruction-Guided Vision Encoding for Question-Aware Multimodal Understanding

Mar 03, 2026
Via arXiv

Debiased Novel Category Discovering and Localization

Feb 29, 2024
Via arXiv

A Survey for Foundation Models in Autonomous Driving

Feb 02, 2024
Via arXiv

u-LLaVA: Unifying Multi-Modal Tasks via Large Language Model

Nov 09, 2023
Via arXiv

Inject Semantic Concepts into Image Tagging for Open-Set Recognition

Oct 23, 2023
Via arXiv