Weize Liu

Agentic Critical Training

Mar 09, 2026

Weize Liu, Minghui Liu, Sy-Tuyen Ho, Souradip Chakraborty, Xiyao Wang, Furong Huang

Abstract: Training large language models (LLMs) as autonomous agents often begins with imitation learning, but it only teaches agents what to do without understanding why: agents never contrast successful actions against suboptimal alternatives and thus lack awareness of action quality. Recent approaches attempt to address this by introducing self-reflection supervision derived from contrasts between expert and alternative actions. However, the training paradigm fundamentally remains imitation learning: the model imitates pre-constructed reflection text rather than learning to reason autonomously. We propose Agentic Critical Training (ACT), a reinforcement learning paradigm that trains agents to identify the better action among alternatives. By rewarding whether the model's judgment is correct, ACT drives the model to autonomously develop reasoning about action quality, producing genuine self-reflection rather than imitating it. Across three challenging agent benchmarks, ACT consistently improves agent performance when combined with different post-training methods. It achieves an average improvement of 5.07 points over imitation learning and 4.62 points over reinforcement learning. Compared to approaches that inject reflection capability through knowledge distillation, ACT also demonstrates clear advantages, yielding an average improvement of 2.42 points. Moreover, ACT enables strong out-of-distribution generalization on agentic benchmarks and improves performance on general reasoning benchmarks without any reasoning-specific training data, highlighting the value of our method. These results suggest that ACT is a promising path toward developing more reflective and capable LLM agents.

* Project page: https://attention-is-all-i-need.github.io/ACT/
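
The ACT abstract above says only that the model is rewarded for whether its judgment between candidate actions is correct. A minimal sketch of such a reward is given below. The prompt wording, the `ActionPair` type, the answer-parsing rule, and the assumption that the expert action is the better one are all hypothetical illustrations, not the authors' implementation.

```python
import random
from dataclasses import dataclass


@dataclass
class ActionPair:
    """Hypothetical container: one state with an expert and an alternative action."""
    state: str
    expert_action: str
    alternative_action: str


def build_prompt(pair: ActionPair, rng: random.Random) -> tuple[str, str]:
    """Return (prompt, correct_label).

    The two actions are shown in random order so that position does not
    reveal the answer. Assumption: the expert action is the better one.
    """
    expert_first = rng.random() < 0.5
    if expert_first:
        a, b = pair.expert_action, pair.alternative_action
    else:
        a, b = pair.alternative_action, pair.expert_action
    prompt = (
        f"State: {pair.state}\n"
        f"Action A: {a}\n"
        f"Action B: {b}\n"
        "Which action is better? Reason step by step, then end with 'A' or 'B'."
    )
    return prompt, "A" if expert_first else "B"


def judgment_reward(model_output: str, correct_label: str) -> float:
    """Binary reward: 1.0 if the last token of the output is the correct label.

    Only the final verdict is scored; the reasoning text before it is left
    unconstrained, so no reference reflection text is imitated.
    """
    tokens = model_output.strip().split()
    final = tokens[-1].strip(".'\"").upper() if tokens else ""
    return 1.0 if final == correct_label else 0.0
```

In a reinforcement learning loop, this scalar would be passed to whatever policy-gradient update is in use; the choice of optimizer is outside what the abstract specifies.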

From Misleading Queries to Accurate Answers: A Three-Stage Fine-Tuning Method for LLMs

Apr 15, 2025

Guocong Li, Weize Liu, Yihang Wu, Ping Wang, Shuaihan Huang, Hongxia Xu, Jian Wu

Abstract: Large language models (LLMs) exhibit excellent performance in natural language processing (NLP), but remain highly sensitive to the quality of input queries, especially when these queries contain misleading or inaccurate information. Existing methods focus on correcting the output, but they often overlook the potential of improving the ability of LLMs to detect and correct misleading content in the input itself. In this paper, we propose a novel three-stage fine-tuning method that enhances the ability of LLMs to detect and correct misleading information in the input, further improving response accuracy and reducing hallucinations. Specifically, the three stages include (1) training LLMs to identify misleading information, (2) training LLMs to correct the misleading information using built-in or external knowledge, and (3) training LLMs to generate accurate answers based on the corrected queries. To evaluate our method, we conducted experiments on three datasets for the hallucination detection task and the question answering (QA) task, as well as two datasets containing misleading information that we constructed. The experimental results demonstrate that our method significantly improves the accuracy and factuality of LLM responses, while also enhancing the ability to detect hallucinations and reducing the generation of hallucinations in the output, particularly when the query contains misleading information. We will publicly release our code upon acceptance.

Unraveling Babel: Exploring Multilingual Activation Patterns within Large Language Models

Feb 26, 2024

Weize Liu, Yinlong Xu, Hongxia Xu, Jintai Chen, Xuming Hu, Jian Wu

Abstract: Recently, large language models (LLMs) have achieved tremendous breakthroughs in the field of language processing, yet the mechanisms by which they process multiple languages remain poorly understood. Therefore, in this work we study the multilingual activation patterns of LLMs. By transforming the original LLMs into a Mixture of Experts (MoE) architecture, we analyze the expert activation patterns when processing various languages and demonstrate the connections of these activation patterns at the level of language families. We discover the existence of non-language-specific neurons as well as language-specific activation neurons. Further exploration even shows that merely leveraging high-frequency activation neurons can accelerate inference while maintaining comparable performance. These findings shed light on the LLMs' multilingual processing mechanism, and are of significant importance in guiding the multilingual training and model pruning of LLMs.
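
As a rough illustration of what "high-frequency activation neurons" could mean in practice, the sketch below counts how often each neuron fires over a batch of tokens and keeps the most frequently active ones. The threshold, the list-of-lists input format, and both function names are assumptions made for this example; the abstract does not describe the paper's actual procedure.

```python
def activation_frequency(activations, threshold=0.0):
    """Fraction of tokens on which each neuron's activation exceeds `threshold`.

    `activations` is a hypothetical list of per-token rows, each row holding
    one activation value per neuron.
    """
    n_tokens = len(activations)
    n_neurons = len(activations[0])
    counts = [0] * n_neurons
    for row in activations:
        for j, value in enumerate(row):
            if value > threshold:
                counts[j] += 1
    return [c / n_tokens for c in counts]


def top_k_neurons(freqs, k):
    """Indices of the k most frequently active neurons, most frequent first."""
    return sorted(range(len(freqs)), key=lambda j: freqs[j], reverse=True)[:k]


def shared_neurons(top_by_language):
    """Neurons that appear in the top set of every language (a dict of lists)."""
    sets = [set(indices) for indices in top_by_language.values()]
    return set.intersection(*sets)
```

Running `top_k_neurons` separately on text from each language and intersecting the results with `shared_neurons` gives one simple way to separate candidates for non-language-specific neurons from language-specific ones.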

Generative AI for Controllable Protein Sequence Design: A Survey

Feb 16, 2024

Yiheng Zhu, Zitai Kong, Jialu Wu, Weize Liu, Yuqiang Han, Mingze Yin, Hongxia Xu, Chang-Yu Hsieh, Tingjun Hou

Abstract: The design of novel protein sequences with targeted functionalities underpins a central theme in protein engineering, impacting diverse fields such as drug discovery and enzymatic engineering. However, navigating this vast combinatorial search space remains a severe challenge due to time and financial constraints. This scenario is rapidly evolving as the transformative advancements in AI, particularly in the realm of generative models and optimization algorithms, have been propelling the protein design field towards an unprecedented revolution. In this survey, we systematically review recent advances in generative AI for controllable protein sequence design. To set the stage, we first outline the foundational tasks in protein sequence design in terms of the constraints involved and present key generative models and optimization algorithms. We then offer in-depth reviews of each design task and discuss the pertinent applications. Finally, we identify the unresolved challenges and highlight research opportunities that merit deeper exploration.

* 9 pages

Text2Tree: Aligning Text Representation to the Label Tree Hierarchy for Imbalanced Medical Classification

Nov 28, 2023

Jiahuan Yan, Haojun Gao, Zhang Kai, Weize Liu, Danny Chen, Jian Wu, Jintai Chen

Abstract: Deep learning approaches exhibit promising performance on various text tasks. However, they still struggle with medical text classification, since samples are often extremely imbalanced and scarce. Unlike existing mainstream approaches that focus on supplementary semantics with external medical information, this paper aims to rethink the data challenges in medical texts and presents a novel framework-agnostic algorithm called Text2Tree that only utilizes the internal label hierarchy in training deep learning models. We embed the ICD code tree structure of labels into cascade attention modules for learning hierarchy-aware label representations. Two new learning schemes, Similarity Surrogate Learning (SSL) and Dissimilarity Mixup Learning (DML), are devised to boost text classification by reusing and distinguishing samples of other labels following the label representation hierarchy, respectively. Experiments on authoritative public datasets and real-world medical records show that our approach stably achieves superior performance over classical and advanced imbalanced classification methods.

* EMNLP 2023 Findings. Code: https://github.com/jyansir/Text2Tree

Mind's Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models

Nov 15, 2023

Weize Liu, Guocong Li, Kai Zhang, Bang Du, Qiyuan Chen, Xuming Hu, Hongxia Xu, Jintai Chen, Jian Wu

Abstract: Large language models (LLMs) have achieved remarkable advancements in the field of natural language processing. However, the sheer scale and computational demands of these models present formidable challenges when considering their practical deployment in resource-constrained contexts. While techniques such as chain-of-thought (CoT) distillation have displayed promise in distilling LLMs into small language models (SLMs), there is a risk that distilled SLMs may still carry over flawed reasoning or hallucinations inherited from their LLM counterparts. To address these issues, we propose a twofold methodology: First, we introduce a novel method for distilling the self-evaluation capability inherent in LLMs into SLMs, which aims to mitigate the adverse effects of erroneous reasoning and reduce hallucinations. Second, we advocate for a comprehensive distillation process that incorporates multiple distinct chain-of-thought and self-evaluation paradigms and ensures a more holistic and robust knowledge transfer into SLMs. Experiments on three NLP benchmarks demonstrate that our method significantly improves the performance of distilled SLMs and sheds light on the path towards developing smaller models closely aligned with human cognition.

* 13 pages, 5 figures
