Yujun Cai

Text Speaks Louder than Vision: ASCII Art Reveals Textual Biases in Vision-Language Models

Apr 02, 2025, via arXiv

How does Watermarking Affect Visual Language Models in Document Understanding?

Apr 01, 2025, via arXiv

Texture or Semantics? Vision-Language Models Get Lost in Font Recognition

Mar 31, 2025, via arXiv

Tricking Retrievers with Influential Tokens: An Efficient Black-Box Corpus Poisoning Attack

Mar 27, 2025, via arXiv

Process or Result? Manipulated Ending Tokens Can Mislead Reasoning LLMs to Ignore the Correct Reasoning Steps

Mar 25, 2025, via arXiv

MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks

Mar 24, 2025, via arXiv

SED-MVS: Segmentation-Driven and Edge-Aligned Deformation Multi-View Stereo with Depth Restoration and Occlusion Constraint

Mar 17, 2025, via arXiv

Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV Equalization

Mar 14, 2025, via arXiv

Learning Few-Step Diffusion Models by Trajectory Distribution Matching

Mar 09, 2025, via arXiv

Structured Outputs Enable General-Purpose LLMs to be Medical Experts

Mar 05, 2025, via arXiv