Jinman Zhao

University of Toronto

EndPrompt: Efficient Long-Context Extension via Terminal Anchoring

May 14, 2026

All Circuits Lead to Rome: Rethinking Functional Anisotropy in Circuit and Sheaf Discovery for LLMs

May 12, 2026

What Happens Inside Agent Memory? Circuit Analysis from Emergence to Diagnosis

May 05, 2026

Reinforcing Consistency in Video MLLMs with Structured Rewards

Apr 01, 2026

CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions

Feb 23, 2026

Not All Preferences Are Created Equal: Stability-Aware and Gradient-Efficient Alignment for Reasoning Models

Feb 01, 2026

Mitigating Hallucinations in Video Large Language Models via Spatiotemporal-Semantic Contrastive Decoding

Jan 30, 2026

$λ$-GRPO: Unifying the GRPO Frameworks with Learnable Token Preferences

Oct 08, 2025

Staying in the Sweet Spot: Responsive Reasoning Evolution via Capability-Adaptive Hint Scaffolding

Sep 08, 2025

Pretraining on the Test Set Is No Longer All You Need: A Debate-Driven Approach to QA Benchmarks

Jul 23, 2025