Henger Li

Verification-Guided Context Optimization for Tool Calling via Hierarchical LLMs-as-Editors

Dec 15, 2025

Henger Li, Shuangjie You, Flavio Di Palo, Yiyue Qian, Ayush Jain

Abstract: Tool calling enables large language models (LLMs) to interact with external environments through tool invocation, providing a practical way to overcome the limitations of pretraining. However, the effectiveness of tool use depends heavily on the quality of the associated documentation and knowledge base context. These materials are usually written for human users and are often misaligned with how LLMs interpret information. This problem is even more pronounced in industrial settings, where hundreds of tools with overlapping functionality create challenges in scalability, variability, and ambiguity. We propose Verification-Guided Context Optimization (VGCO), a framework that uses LLMs as editors to automatically refine tool-related documentation and knowledge base context. VGCO works in two stages. First, Evaluation collects real-world failure cases and identifies mismatches between tools and their context. Second, Optimization performs hierarchical editing through offline learning with structure-aware, in-context optimization. The novelty of our LLM editors has three main aspects. First, they use a hierarchical structure that naturally integrates into the tool-calling workflow. Second, they are state-aware, action-specific, and verification-guided, which constrains the search space and enables efficient, targeted improvements. Third, they enable cost-efficient sub-task specialization, either by prompt engineering large editor models or by post-training smaller editor models. Unlike prior work that emphasizes multi-turn reasoning, VGCO focuses on the single-turn, large-scale tool-calling problem and achieves significant improvements in accuracy, robustness, and generalization across LLMs.

* Accepted by AAAI 2026 Workshop on Agentic AI Benchmarks and Applications for Enterprise Tasks

Online Learning with Probing for Sequential User-Centric Selection

Jul 27, 2025

Tianyi Xu, Yiting Chen, Henger Li, Zheyong Bian, Emiliano Dall'Anese, Zizhan Zheng

Abstract: We formalize sequential decision-making with information acquisition as the probing-augmented user-centric selection (PUCS) framework, where a learner first probes a subset of arms to obtain side information on resources and rewards, and then assigns $K$ plays to $M$ arms. PUCS covers applications such as ridesharing, wireless scheduling, and content recommendation, in which both resources and payoffs are initially unknown and probing is costly. For the offline setting with known distributions, we present a greedy probing algorithm with a constant-factor approximation guarantee $\zeta = (e-1)/(2e-1)$. For the online setting with unknown distributions, we introduce OLPA, a stochastic combinatorial bandit algorithm that achieves a regret bound $\mathcal{O}(\sqrt{T} + \ln^{2} T)$. We also prove a lower bound $\Omega(\sqrt{T})$, showing that the upper bound is tight up to logarithmic factors. Experiments on real-world data demonstrate the effectiveness of our solutions.
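
To give a sense of scale for the approximation guarantee quoted above, the short sketch below evaluates $\zeta = (e-1)/(2e-1)$ numerically. The function name `greedy_probing_ratio` is a hypothetical label chosen for illustration; it only evaluates the constant and does not implement the paper's greedy probing algorithm.

```python
import math


def greedy_probing_ratio() -> float:
    """Evaluate zeta = (e - 1) / (2e - 1), the constant quoted in the abstract.

    Hypothetical helper for illustration only; this is not the greedy
    probing algorithm itself, just its stated approximation factor.
    """
    return (math.e - 1) / (2 * math.e - 1)


zeta = greedy_probing_ratio()
print(f"zeta = {zeta:.4f}")  # prints "zeta = 0.3873"
```

Read this way, the guarantee says the greedy probing solution attains at least roughly 38.7% of the optimal offline value.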

Meta Stackelberg Game: Robust Federated Learning against Adaptive and Mixed Poisoning Attacks

Oct 22, 2024

Tao Li, Henger Li, Yunian Pan, Tianyi Xu, Zizhan Zheng, Quanyan Zhu

Abstract: Federated learning (FL) is susceptible to a range of security threats. Although various defense mechanisms have been proposed, they are typically non-adaptive and tailored to specific types of attacks, leaving them insufficient in the face of multiple uncertain, unknown, and adaptive attacks employing diverse strategies. This work formulates adversarial federated learning under a mixture of various attacks as a Bayesian Stackelberg Markov game, based on which we propose the meta-Stackelberg defense composed of pre-training and online adaptation. The gist is to simulate strong attack behavior using reinforcement learning (RL-based attacks) in pre-training and then design meta-RL-based defense to combat diverse and adaptive attacks. We develop an efficient meta-learning approach to solve the game, leading to a robust and adaptive FL defense. Theoretically, our meta-learning algorithm, meta-Stackelberg learning, provably converges to the first-order $\varepsilon$-meta-equilibrium point in $O(\varepsilon^{-2})$ gradient iterations with $O(\varepsilon^{-4})$ samples per iteration. Experiments show that our meta-Stackelberg framework performs superbly against strong model poisoning and backdoor attacks of uncertain and unknown types.

* This work has been submitted to the IEEE for possible publication

A First Order Meta Stackelberg Method for Robust Federated Learning

Jul 16, 2023

Yunian Pan, Tao Li, Henger Li, Tianyi Xu, Zizhan Zheng, Quanyan Zhu

Abstract: Previous research has shown that federated learning (FL) systems are exposed to an array of security risks. Despite the proposal of several defensive strategies, they tend to be non-adaptive and specific to certain types of attacks, rendering them ineffective against unpredictable or adaptive threats. This work models adversarial federated learning as a Bayesian Stackelberg Markov game (BSMG) to capture the defender's incomplete information of various attack types. We propose meta-Stackelberg learning (meta-SL), a provably efficient meta-learning algorithm, to solve the equilibrium strategy in BSMG, leading to an adaptable FL defense. We demonstrate that meta-SL converges to the first-order $\varepsilon$-equilibrium point in $O(\varepsilon^{-2})$ gradient iterations, with $O(\varepsilon^{-4})$ samples needed per iteration, matching the state of the art. Empirical evidence indicates that our meta-Stackelberg framework performs exceptionally well against potent model poisoning and backdoor attacks of an uncertain nature.
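
As a back-of-the-envelope reading of the two rates stated above (an inference from the abstract, not a bound it claims explicitly), multiplying the number of gradient iterations by the samples drawn per iteration gives the implied total sample count:

$$
\underbrace{O(\varepsilon^{-2})}_{\text{gradient iterations}} \times \underbrace{O(\varepsilon^{-4})}_{\text{samples per iteration}} = O(\varepsilon^{-6}).
$$

Under this reading, halving the target accuracy $\varepsilon$ would multiply the total number of samples by about $2^{6} = 64$, ignoring constants.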

* Accepted to ICML 2023 Workshop on The 2nd New Frontiers In Adversarial Machine Learning. Associated technical report arXiv:2306.13273

Learning to Backdoor Federated Learning

Mar 06, 2023

Henger Li, Chen Wu, Senchun Zhu, Zizhan Zheng

Abstract: In a federated learning (FL) system, malicious participants can easily embed backdoors into the aggregated model while maintaining the model's performance on the main task. In response, various defenses, including training-stage aggregation-based defenses and post-training mitigation defenses, have been proposed recently. While these defenses obtain reasonable performance against existing backdoor attacks, which are mainly heuristic-based, we show that they are insufficient in the face of more advanced attacks. In particular, we propose a general reinforcement learning-based backdoor attack framework where the attacker first trains a (non-myopic) attack policy using a simulator built upon its local data and common knowledge of the FL system, which is then applied during actual FL training. Our attack framework is both adaptive and flexible and achieves strong attack performance and durability even under state-of-the-art defenses.
