Zhengyin Du

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Sep 10, 2025

Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments

Aug 12, 2025

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Apr 08, 2025

Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?

Apr 01, 2025

LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion

Jan 25, 2025

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Jan 20, 2025

ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use

Jan 07, 2025

TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use

Dec 20, 2024