Zhijiang Guo

DGPO: Beyond Pairwise Preferences with Directional Consistent Groupwise Optimization

May 11, 2026
Via arXiv

CodeSpecBench: Benchmarking LLMs for Executable Behavioral Specification Generation

Apr 14, 2026
Via arXiv

Spotlight and Shadow: Attention-Guided Dual-Anchor Introspective Decoding for MLLM Hallucination Mitigation

Apr 11, 2026
Via arXiv

Skip-Connected Policy Optimization for Implicit Advantage

Apr 09, 2026
Via arXiv

DeReason: A Difficulty-Aware Curriculum Improves Decoupled SFT-then-RL Training for General Reasoning

Mar 11, 2026
Via arXiv

Efficient RLVR Training via Weighted Mutual Information Data Selection

Mar 02, 2026
Via arXiv

When Silence Is Golden: Can LLMs Learn to Abstain in Temporal QA and Beyond?

Feb 04, 2026
Via arXiv

Accordion-Thinking: Self-Regulated Step Summaries for Efficient and Readable LLM Reasoning

Feb 03, 2026
Via arXiv

Merging Beyond: Streaming LLM Updates via Activation-Guided Rotations

Feb 03, 2026
Via arXiv

ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall

Oct 09, 2025
Via arXiv