Liangyou Li

ARTIS: Agentic Risk-Aware Test-Time Scaling via Iterative Simulation

Feb 03, 2026
Via arXiv

From Verifiable Dot to Reward Chain: Harnessing Verifiable Reference-based Rewards for Reinforcement Learning of Open-ended Generation

Jan 26, 2026
Via arXiv

Stepwise Reasoning Checkpoint Analysis: A Test Time Scaling Method to Enhance LLMs' Reasoning

May 23, 2025
Via arXiv

ToolACE-R: Tool Learning with Adaptive Self-Refinement

Apr 02, 2025
Via arXiv

Learning to Align Multi-Faceted Evaluation: A Unified and Robust Framework

Feb 26, 2025
Via arXiv

Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge

Feb 18, 2025
Via arXiv

NILE: Internal Consistency Alignment in Large Language Models

Dec 21, 2024
Via arXiv

ToolFlow: Boosting LLM Tool-Calling Through Natural and Coherent Dialogue Synthesis

Oct 24, 2024
Via arXiv

Subtle Errors Matter: Preference Learning via Error-injected Self-editing

Oct 09, 2024
Via arXiv

RevisEval: Improving LLM-as-a-Judge via Response-Adapted References

Oct 07, 2024
Via arXiv