Zhangyue Yin

Dynamic and Generalizable Process Reward Modeling

Jul 23, 2025

R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning

May 26, 2025

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

May 26, 2025

FamilyTool: A Multi-hop Personalized Tool Use Benchmark

Apr 09, 2025

Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?

Feb 17, 2025

Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework

Jan 26, 2025

VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks

Dec 24, 2024

Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

Dec 18, 2024

Unified Active Retrieval for Retrieval Augmented Generation

Jun 18, 2024

Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models

May 21, 2024