Shikun Zhang

SteerRM: Debiasing Reward Models via Sparse Autoencoders

Mar 13, 2026
Via arXiv

From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models

Feb 26, 2026
Via arXiv

Advancing Block Diffusion Language Models for Test-Time Scaling

Feb 11, 2026
Via arXiv

OPE: Overcoming Information Saturation in Parallel Thinking via Outline-Guided Path Exploration

Feb 09, 2026
Via arXiv

What Do Agents Learn from Trajectory-SFT: Semantics or Interfaces?

Feb 02, 2026
Via arXiv

ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback

Jan 15, 2026
Via arXiv

Modeling Uncertainty Trends for Timely Retrieval in Dynamic RAG

Nov 13, 2025
Via arXiv

Autoformalizer with Tool Feedback

Oct 08, 2025
Via arXiv

SAEMark: Multi-bit LLM Watermarking with Inference-Time Scaling

Aug 11, 2025
Via arXiv

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

May 23, 2025
Via arXiv