Visual Prompt Tuning


Prompt Disentanglement via Language Guidance and Representation Alignment for Domain Generalization

Jul 03, 2025
Via arXiv

RichControl: Structure- and Appearance-Rich Training-Free Spatial Control for Text-to-Image Generation

Jul 03, 2025
Via arXiv

SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism

Jul 02, 2025
Via arXiv

Multimodal Large Language Models for Medical Report Generation via Customized Prompt Tuning

Jun 18, 2025
Via arXiv

PRISM2: Unlocking Multi-Modal General Pathology AI with Clinical Dialogue

Jun 16, 2025
Via arXiv

NAP-Tuning: Neural Augmented Prompt Tuning for Adversarially Robust Vision-Language Models

Jun 15, 2025
Via arXiv

TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning

Jun 16, 2025
Via arXiv

Using Vision Language Models to Detect Students' Academic Emotion through Facial Expressions

Jun 12, 2025
Via arXiv

Quantitative Comparison of Fine-Tuning Techniques for Pretrained Latent Diffusion Models in the Generation of Unseen SAR Image Concepts

Jun 16, 2025
Via arXiv

Text to Image for Multi-Label Image Recognition with Joint Prompt-Adapter Learning

Jun 12, 2025
Via arXiv