Chengrui Huang

TTPA: Token-level Tool-use Preference Alignment Training Framework with Fine-grained Evaluation

May 26, 2025

Chengrui Huang, Shen Gao, Zhengliang Shi, Dongsheng Wang, Shuo Shang

Abstract: Existing tool-learning methods usually rely on supervised fine-tuning and often overlook fine-grained optimization of internal tool-call details, which limits preference alignment and error discrimination. To overcome these challenges, we propose the Token-level Tool-use Preference Alignment Training Framework (TTPA), a training paradigm for constructing token-level tool-use preference datasets that align LLMs with fine-grained preferences using a novel error-oriented scoring mechanism. TTPA first introduces reversed dataset construction, a method for creating high-quality, multi-turn tool-use datasets by reversing the generation flow. Additionally, we propose Token-level Preference Sampling (TPS) to capture fine-grained preferences by modeling token-level differences during generation. To address biases in scoring, we introduce the Error-oriented Scoring Mechanism (ESM), which quantifies tool-call errors and can be used as a training signal. Extensive experiments on three diverse benchmark datasets demonstrate that TTPA significantly improves tool-use performance while showing strong generalization ability across models and datasets.

* 16 pages, 5 figures
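
The abstract says ESM quantifies tool-call errors and that the result can serve as a training signal, but it gives no formula. As a rough illustration only, the sketch below compares a predicted tool call with a reference call, counts a few error types, and uses the weighted total to pick a preferred and a dispreferred candidate. The `ToolCall` type, the error categories, the `PENALTIES` weights, and every function name here are hypothetical choices for this sketch, not the scoring used in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """Hypothetical representation of one tool call: a tool name plus arguments."""
    name: str
    arguments: dict = field(default_factory=dict)


# Assumed penalty weights per error type; the abstract does not specify any.
PENALTIES = {"wrong_tool": 2.0, "missing_arg": 0.5, "wrong_value": 0.5, "extra_arg": 0.25}


def count_errors(predicted: ToolCall, reference: ToolCall) -> dict:
    """Count each (assumed) error type in `predicted` relative to `reference`."""
    errors = {kind: 0 for kind in PENALTIES}
    if predicted.name != reference.name:
        # Arguments of a different tool are not comparable, so stop here.
        errors["wrong_tool"] = 1
        return errors
    for key, value in reference.arguments.items():
        if key not in predicted.arguments:
            errors["missing_arg"] += 1
        elif predicted.arguments[key] != value:
            errors["wrong_value"] += 1
    errors["extra_arg"] = sum(1 for key in predicted.arguments if key not in reference.arguments)
    return errors


def error_score(predicted: ToolCall, reference: ToolCall) -> float:
    """Weighted error total; lower is better, 0 means an exact match."""
    errors = count_errors(predicted, reference)
    return sum(PENALTIES[kind] * n for kind, n in errors.items())


def preference_pair(candidates: list, reference: ToolCall) -> tuple:
    """Return (preferred, dispreferred): the lowest- and highest-scoring candidates."""
    ranked = sorted(candidates, key=lambda call: error_score(call, reference))
    return ranked[0], ranked[-1]
```

A pair chosen this way could feed a preference-optimization objective. How TTPA actually samples candidates at the token level and combines error types is described in the paper, not here.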

What Affects the Stability of Tool Learning? An Empirical Study on the Robustness of Tool Learning Frameworks

Jul 03, 2024

Chengrui Huang, Zhengliang Shi, Yuntao Wen, Xiuying Chen, Peng Han, Shen Gao, Shuo Shang

Abstract: Tool learning methods have enhanced the ability of large language models (LLMs) to interact with real-world applications. Many existing works fine-tune LLMs or design prompts to enable LLMs to select appropriate tools and correctly invoke them to meet user requirements. However, previous works have observed that the performance of tool learning varies across tasks, datasets, training settings, and algorithms. Without an understanding of the impact of these factors, results can be inconsistent, model deployment inefficient, and tool utilization suboptimal, ultimately hindering the practical integration and scalability of LLMs in real-world scenarios. Therefore, in this paper, we explore the impact of both internal and external factors on the performance of tool learning frameworks. Through extensive experiments on two benchmark datasets, we draw several conclusions for future work, including the observation that LLMs can benefit significantly from increased trial and exploration. We believe our empirical study provides a new perspective for future tool learning research.

* 19 pages, 9 figures
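
One simple way to see why additional trials can help is a back-of-the-envelope calculation: if a single attempt succeeds with probability p and attempts are independent, the chance that at least one of k attempts succeeds is 1 - (1 - p)^k. The sketch below computes this. The independence assumption and the function name `success_within_k` are ours, for illustration only; the paper's experimental protocol may differ.

```python
def success_within_k(p: float, k: int) -> float:
    """Probability that at least one of k independent trials succeeds.

    Assumes every trial succeeds independently with the same probability p,
    which is an idealization rather than a claim about any real LLM.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    # All k trials fail with probability (1 - p) ** k.
    return 1.0 - (1.0 - p) ** k
```

Under this idealized model, a per-trial success rate of 0.5 rises to 0.75 with two trials. Correlated failures, which are common when the same model retries the same prompt, would make the real gain smaller.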

360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System

Apr 08, 2024

Shen Gao, Hao Li, Zhengliang Shi, Chengrui Huang, Quan Tu, Zhiliang Tian, Minlie Huang, Shuo Shang

Abstract: Large language model agents have demonstrated remarkable advancements across various complex tasks. Recent works focus on optimizing the agent team or employing self-reflection to iteratively solve complex tasks. Since these agents are all based on the same LLM, only conducting self-evaluation or removing underperforming agents does not substantively enhance the capability of the agents. We argue that comprehensive evaluation, combined with accumulating experience from evaluation feedback, is an effective approach to improving system performance. In this paper, we propose Reusable Experience Accumulation with 360° Assessment (360°REA), a hierarchical multi-agent framework inspired by corporate organizational practices. The framework employs a novel 360° performance assessment method for multi-perspective performance evaluation with fine-grained assessment. To enhance the capability of agents in addressing complex tasks, we introduce a dual-level experience pool in which agents accumulate experience through fine-grained assessment. Extensive experiments on complex task datasets demonstrate the effectiveness of 360°REA.
