Juncheng Wu

Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows

Apr 22, 2026

Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw

Apr 06, 2026

EntropyPrune: Matrix Entropy Guided Visual Token Pruning for Multimodal Large Language Models

Feb 19, 2026

All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMs

Dec 08, 2025

MedFrameQA: A Multi-Image Medical VQA Benchmark for Clinical Reasoning

May 22, 2025

STAR-1: Safer Alignment of Reasoning LLMs with 1K Data

Apr 02, 2025

MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

Apr 01, 2025

m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning with Large Language Models

Apr 01, 2025

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Aug 06, 2024

DDR: Exploiting Deep Degradation Response as Flexible Image Descriptor

Jun 12, 2024