Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Bo Xu

Strategy-Augmented Planning for Large Language Models via Opponent Exploitation

May 13, 2025

Shuai Xu, Sijia Cui, Yanna Wang, Bo Xu, Qi Wang

Abstract:Efficiently modeling and exploiting opponents is a long-standing challenge in adversarial domains. Large Language Models (LLMs) trained on extensive textual data have recently demonstrated outstanding performance in general tasks, introducing new research directions for opponent modeling. Some studies primarily focus on directly using LLMs to generate decisions based on the elaborate prompt context that incorporates opponent descriptions, while these approaches are limited to scenarios where LLMs possess adequate domain expertise. To address that, we introduce a two-stage Strategy-Augmented Planning (SAP) framework that significantly enhances the opponent exploitation capabilities of LLM-based agents by utilizing a critical component, the Strategy Evaluation Network (SEN). Specifically, in the offline stage, we construct an explicit strategy space and subsequently collect strategy-outcome pair data for training the SEN network. During the online phase, SAP dynamically recognizes the opponent's strategies and greedily exploits them by searching best response strategy on the well-trained SEN, finally translating strategy to a course of actions by carefully designed prompts. Experimental results show that SAP exhibits robust generalization capabilities, allowing it to perform effectively not only against previously encountered opponent strategies but also against novel, unseen strategies. In the MicroRTS environment, SAP achieves a 85.35\% performance improvement over baseline methods and matches the competitiveness of reinforcement learning approaches against state-of-the-art (SOTA) rule-based AI.

* Accepted to IJCNN 2025

Via

Access Paper or Ask Questions

Efficient Tuning of Large Language Models for Knowledge-Grounded Dialogue Generation

Apr 10, 2025

Bo Zhang, Hui Ma, Dailin Li, Jian Ding, Jian Wang, Bo Xu, HongFei Lin

Abstract:Large language models (LLMs) demonstrate remarkable text comprehension and generation capabilities but often lack the ability to utilize up-to-date or domain-specific knowledge not included in their training data. To address this gap, we introduce KEDiT, an efficient method for fine-tuning LLMs for knowledge-grounded dialogue generation. KEDiT operates in two main phases: first, it employs an information bottleneck to compress retrieved knowledge into learnable parameters, retaining essential information while minimizing computational overhead. Second, a lightweight knowledge-aware adapter integrates these compressed knowledge vectors into the LLM during fine-tuning, updating less than 2\% of the model parameters. The experimental results on the Wizard of Wikipedia and a newly constructed PubMed-Dialog dataset demonstrate that KEDiT excels in generating contextually relevant and informative responses, outperforming competitive baselines in automatic, LLM-based, and human evaluations. This approach effectively combines the strengths of pretrained LLMs with the adaptability needed for incorporating dynamic knowledge, presenting a scalable solution for fields such as medicine.

* Accepted at TACL; pre-MIT Press publication version. Code and data are available at https://github.com/zhangbo-nlp/KEDiT

Via

Access Paper or Ask Questions

Unveiling the Capabilities of Large Language Models in Detecting Offensive Language with Annotation Disagreement

Feb 10, 2025

Junyu Lu, Kai Ma, Kaichun Wang, Kelaiti Xiao, Roy Ka-Wei Lee, Bo Xu, Liang Yang, Hongfei Lin

Abstract:LLMs are widely used for offensive language detection due to their advanced capability. However, the challenges posed by human annotation disagreement in real-world datasets remain underexplored. These disagreement samples are difficult to detect due to their ambiguous nature. Additionally, the confidence of LLMs in processing disagreement samples can provide valuable insights into their alignment with human annotators. To address this gap, we systematically evaluate the ability of LLMs to detect offensive language with annotation disagreement. We compare the binary accuracy of multiple LLMs across varying annotation agreement levels and analyze the relationship between LLM confidence and annotation agreement. Furthermore, we investigate the impact of disagreement samples on LLM decision-making during few-shot learning and instruction fine-tuning. Our findings highlight the challenges posed by disagreement samples and offer guidance for improving LLM-based offensive language detection.

* 17 pages, submitted to the ACL 2025

Via

Access Paper or Ask Questions

Episodic Novelty Through Temporal Distance

Jan 26, 2025

Yuhua Jiang, Qihan Liu, Yiqin Yang, Xiaoteng Ma, Dianyu Zhong, Hao Hu, Jun Yang, Bin Liang, Bo Xu, Chongjie Zhang(+1 more)

Abstract:Exploration in sparse reward environments remains a significant challenge in reinforcement learning, particularly in Contextual Markov Decision Processes (CMDPs), where environments differ across episodes. Existing episodic intrinsic motivation methods for CMDPs primarily rely on count-based approaches, which are ineffective in large state spaces, or on similarity-based methods that lack appropriate metrics for state comparison. To address these shortcomings, we propose Episodic Novelty Through Temporal Distance (ETD), a novel approach that introduces temporal distance as a robust metric for state similarity and intrinsic reward computation. By employing contrastive learning, ETD accurately estimates temporal distances and derives intrinsic rewards based on the novelty of states within the current episode. Extensive experiments on various benchmark tasks demonstrate that ETD significantly outperforms state-of-the-art methods, highlighting its effectiveness in enhancing exploration in sparse reward CMDPs.

* ICLR2025

Via

Access Paper or Ask Questions

Triple Path Enhanced Neural Architecture Search for Multimodal Fake News Detection

Jan 24, 2025

Bo Xu, Qiujie Xie, Jiahui Zhou, Linlin Zong

Abstract:Multimodal fake news detection has become one of the most crucial issues on social media platforms. Although existing methods have achieved advanced performance, two main challenges persist: (1) Under-performed multimodal news information fusion due to model architecture solidification, and (2) weak generalization ability on partial-modality contained fake news. To meet these challenges, we propose a novel and flexible triple path enhanced neural architecture search model MUSE. MUSE includes two dynamic paths for detecting partial-modality contained fake news and a static path for exploiting potential multimodal correlations. Experimental results show that MUSE achieves stable performance improvement over the baselines.

* This paper has been accepted into the IEEE International Conference on Acoustics, Speech, and Signal Processing(ICASSP 2024)

Via

Access Paper or Ask Questions

AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model

Jan 19, 2025

Lipeng Ma, Weidong Yang, Yixuan Li, Ben Fei, Mingjie Zhou, Shuhao Li, Sihang Jiang, Bo Xu, Yanghua Xiao

Figure 1 for AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model

Figure 2 for AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model

Figure 3 for AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model

Figure 4 for AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model

Abstract:Automated log analysis is crucial to ensure high availability and reliability of complex systems. The advent of LLMs in NLP has ushered in a new era of language model-driven automated log analysis, garnering significant interest. Within this field, two primary paradigms based on language models for log analysis have become prominent. Small Language Models (SLMs) follow the pre-train and fine-tune paradigm, focusing on the specific log analysis task through fine-tuning on supervised datasets. On the other hand, LLMs following the in-context learning paradigm, analyze logs by providing a few examples in prompt contexts without updating parameters. Despite their respective strengths, we notice that SLMs are more cost-effective but less powerful, whereas LLMs with large parameters are highly powerful but expensive and inefficient. To trade-off between the performance and inference costs of both models in automated log analysis, this paper introduces an adaptive log analysis framework known as AdaptiveLog, which effectively reduces the costs associated with LLM while ensuring superior results. This framework collaborates an LLM and a small language model, strategically allocating the LLM to tackle complex logs while delegating simpler logs to the SLM. Specifically, to efficiently query the LLM, we propose an adaptive selection strategy based on the uncertainty estimation of the SLM, where the LLM is invoked only when the SLM is uncertain. In addition, to enhance the reasoning ability of the LLM in log analysis tasks, we propose a novel prompt strategy by retrieving similar error-prone cases as the reference, enabling the model to leverage past error experiences and learn solutions from these cases. Extensive experiments demonstrate that AdaptiveLog achieves state-of-the-art results across different tasks, elevating the overall accuracy of log analysis while maintaining cost efficiency.

Via

Access Paper or Ask Questions

$β$-DQN: Improving Deep Q-Learning By Evolving the Behavior

Jan 01, 2025

Hongming Zhang, Fengshuo Bai, Chenjun Xiao, Chao Gao, Bo Xu, Martin Müller

Figure 1 for $β$-DQN: Improving Deep Q-Learning By Evolving the Behavior

Figure 2 for $β$-DQN: Improving Deep Q-Learning By Evolving the Behavior

Figure 3 for $β$-DQN: Improving Deep Q-Learning By Evolving the Behavior

Figure 4 for $β$-DQN: Improving Deep Q-Learning By Evolving the Behavior

Abstract:While many sophisticated exploration methods have been proposed, their lack of generality and high computational cost often lead researchers to favor simpler methods like $\epsilon$-greedy. Motivated by this, we introduce $\beta$-DQN, a simple and efficient exploration method that augments the standard DQN with a behavior function $\beta$. This function estimates the probability that each action has been taken at each state. By leveraging $\beta$, we generate a population of diverse policies that balance exploration between state-action coverage and overestimation bias correction. An adaptive meta-controller is designed to select an effective policy for each episode, enabling flexible and explainable exploration. $\beta$-DQN is straightforward to implement and adds minimal computational overhead to the standard DQN. Experiments on both simple and challenging exploration domains show that $\beta$-DQN outperforms existing baseline methods across a wide range of tasks, providing an effective solution for improving exploration in deep reinforcement learning.

Via

Access Paper or Ask Questions

Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation

Dec 19, 2024

Zhenxin Lei, Man Yao, Jiakui Hu, Xinhao Luo, Yanye Lu, Bo Xu, Guoqi Li

Figure 1 for Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation

Figure 2 for Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation

Figure 3 for Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation

Figure 4 for Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation

Abstract:Spiking Neural Networks (SNNs) have a low-power advantage but perform poorly in image segmentation tasks. The reason is that directly converting neural networks with complex architectural designs for segmentation tasks into spiking versions leads to performance degradation and non-convergence. To address this challenge, we first identify the modules in the architecture design that lead to the severe reduction in spike firing, make targeted improvements, and propose Spike2Former architecture. Second, we propose normalized integer spiking neurons to solve the training stability problem of SNNs with complex architectures. We set a new state-of-the-art for SNNs in various semantic segmentation datasets, with a significant improvement of +12.7% mIoU and 5.0 efficiency on ADE20K, +14.3% mIoU and 5.2 efficiency on VOC2012, and +9.1% mIoU and 6.6 efficiency on CityScapes.

* This work has been accepted on Association for the Advancement of Artificial Intelligence 2025

Via

Access Paper or Ask Questions

Efficient 3D Recognition with Event-driven Spike Sparse Convolution

Dec 10, 2024

Xuerui Qiu, Man Yao, Jieyuan Zhang, Yuhong Chou, Ning Qiao, Shibo Zhou, Bo Xu, Guoqi Li

Figure 1 for Efficient 3D Recognition with Event-driven Spike Sparse Convolution

Figure 2 for Efficient 3D Recognition with Event-driven Spike Sparse Convolution

Figure 3 for Efficient 3D Recognition with Event-driven Spike Sparse Convolution

Figure 4 for Efficient 3D Recognition with Event-driven Spike Sparse Convolution

Abstract:Spiking Neural Networks (SNNs) provide an energy-efficient way to extract 3D spatio-temporal features. Point clouds are sparse 3D spatial data, which suggests that SNNs should be well-suited for processing them. However, when applying SNNs to point clouds, they often exhibit limited performance and fewer application scenarios. We attribute this to inappropriate preprocessing and feature extraction methods. To address this issue, we first introduce the Spike Voxel Coding (SVC) scheme, which encodes the 3D point clouds into a sparse spike train space, reducing the storage requirements and saving time on point cloud preprocessing. Then, we propose a Spike Sparse Convolution (SSC) model for efficiently extracting 3D sparse point cloud features. Combining SVC and SSC, we design an efficient 3D SNN backbone (E-3DSNN), which is friendly with neuromorphic hardware. For instance, SSC can be implemented on neuromorphic chips with only minor modifications to the addressing function of vanilla spike convolution. Experiments on ModelNet40, KITTI, and Semantic KITTI datasets demonstrate that E-3DSNN achieves state-of-the-art (SOTA) results with remarkable efficiency. Notably, our E-3DSNN (1.87M) obtained 91.7\% top-1 accuracy on ModelNet40, surpassing the current best SNN baselines (14.3M) by 3.0\%. To our best knowledge, it is the first direct training 3D SNN backbone that can simultaneously handle various 3D computer vision tasks (e.g., classification, detection, and segmentation) with an event-driven nature. Code is available: https://github.com/bollossom/E-3DSNN/.

* Accepted by AAAI 2025

Via

Access Paper or Ask Questions

Learnable Infinite Taylor Gaussian for Dynamic View Rendering

Dec 05, 2024

Bingbing Hu, Yanyan Li, Rui Xie, Bo Xu, Haoye Dong, Junfeng Yao, Gim Hee Lee

Figure 1 for Learnable Infinite Taylor Gaussian for Dynamic View Rendering

Figure 2 for Learnable Infinite Taylor Gaussian for Dynamic View Rendering

Figure 3 for Learnable Infinite Taylor Gaussian for Dynamic View Rendering

Figure 4 for Learnable Infinite Taylor Gaussian for Dynamic View Rendering

Abstract:Capturing the temporal evolution of Gaussian properties such as position, rotation, and scale is a challenging task due to the vast number of time-varying parameters and the limited photometric data available, which generally results in convergence issues, making it difficult to find an optimal solution. While feeding all inputs into an end-to-end neural network can effectively model complex temporal dynamics, this approach lacks explicit supervision and struggles to generate high-quality transformation fields. On the other hand, using time-conditioned polynomial functions to model Gaussian trajectories and orientations provides a more explicit and interpretable solution, but requires significant handcrafted effort and lacks generalizability across diverse scenes. To overcome these limitations, this paper introduces a novel approach based on a learnable infinite Taylor Formula to model the temporal evolution of Gaussians. This method offers both the flexibility of an implicit network-based approach and the interpretability of explicit polynomial functions, allowing for more robust and generalizable modeling of Gaussian dynamics across various dynamic scenes. Extensive experiments on dynamic novel view rendering tasks are conducted on public datasets, demonstrating that the proposed method achieves state-of-the-art performance in this domain. More information is available on our project page(https://ellisonking.github.io/TaylorGaussian).

Via

Access Paper or Ask Questions