Ruoyu Chen

School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China

Did Models Sufficient Learn? Attribution-Guided Training via Subset-Selected Counterfactual Augmentation

Nov 15, 2025
Via arXiv

PhaseWin Search Framework Enable Efficient Object-Level Interpretation

Nov 14, 2025
Via arXiv

Where MLLMs Attend and What They Rely On: Explaining Autoregressive Token Generation

Sep 26, 2025
Via arXiv

Explaining multimodal LLMs via intra-modal token interactions

Sep 26, 2025
Via arXiv

SMA: Who Said That? Auditing Membership Leakage in Semi-Black-box RAG Controlling

Aug 12, 2025
Via arXiv

Understanding and Benchmarking the Trustworthiness in Multimodal LLMs for Video Understanding

Jun 14, 2025
Via arXiv

APTOS-2024 challenge report: Generation of synthetic 3D OCT images from fundus photographs

Jun 09, 2025
Via arXiv

Unpacking Positional Encoding in Transformers: A Spectral Analysis of Content-Position Coupling

May 19, 2025
Via arXiv

FaceInsight: A Multimodal Large Language Model for Face Perception

Apr 22, 2025
Via arXiv

Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection

Apr 09, 2025
Via arXiv