Zhijiang Guo

ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall

Oct 09, 2025

When Inverse Data Outperforms: Exploring the Pitfalls of Mixed Data in Multi-Stage Fine-Tuning

Sep 16, 2025

ClimateViz: A Benchmark for Statistical Reasoning and Fact Verification on Scientific Charts

Jun 11, 2025

TreeReview: A Dynamic Tree of Questions Framework for Deep and Efficient LLM-based Scientific Peer Review

Jun 09, 2025

TreeRPO: Tree Relative Policy Optimization

Jun 05, 2025

SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving

May 29, 2025

AVerImaTeC: A Dataset for Automatic Verification of Image-Text Claims with Evidence from the Web

May 23, 2025

Activation-Guided Consensus Merging for Large Language Models

May 20, 2025

TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios

May 19, 2025

EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code

May 19, 2025