Wanlong Liu

R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning

Apr 03, 2026

Wanlong Liu, Bo Zhang, Chenliang Li, Shaopeng Lai, Yuning Wu, Xuanyu Lei, Ming Yan

Abstract:While deep reasoning with long chain-of-thought has dramatically improved large language models in verifiable domains like mathematics, its effectiveness for open-ended tasks such as writing remains unexplored. In this paper, we conduct a systematic investigation revealing that existing mainstream reasoning models achieve limited gains on open-ended writing tasks. Our further analysis shows that these models lack deep reflection and revision patterns in open-ended writing, resulting in substantially smaller improvements compared to mathematical reasoning tasks. To address this limitation, we introduce R2-Write: an automated framework that synthesizes high-quality thinking trajectories enriched with explicit reflection and revision patterns through iterative writer-judge interaction. To prevent redundant reflections, we design a process reward mechanism that supervises reflection quality during reinforcement learning, improving both performance and token efficiency. Extensive experiments across multiple creative writing and deep-research benchmarks demonstrate significant improvements, validating that explicitly incorporating reflection and revision patterns unlocks deep reasoning capabilities for open-ended writing tasks.

* 31 pages

QFFT, Question-Free Fine-Tuning for Adaptive Reasoning

Jun 15, 2025

Wanlong Liu, Junxiao Xu, Fei Yu, Yukang Lin, Ke Ji, Wenyu Chen, Yan Xu, Yasheng Wang, Lifeng Shang, Benyou Wang

Abstract:Recent advancements in Long Chain-of-Thought (CoT) reasoning models have improved performance on complex tasks, but they suffer from overthinking, which generates redundant reasoning steps, especially for simple questions. This paper revisits the reasoning patterns of Long and Short CoT models, observing that the Short CoT patterns offer concise reasoning efficiently, while the Long CoT patterns excel in challenging scenarios where the Short CoT patterns struggle. To enable models to leverage both patterns, we propose Question-Free Fine-Tuning (QFFT), a fine-tuning approach that removes the input question during training and learns exclusively from Long CoT responses. This approach enables the model to adaptively employ both reasoning patterns: it prioritizes the Short CoT patterns and activates the Long CoT patterns only when necessary. Experiments on various mathematical datasets demonstrate that QFFT reduces average response length by more than 50%, while achieving performance comparable to Supervised Fine-Tuning (SFT). Additionally, QFFT exhibits superior performance compared to SFT in noisy, out-of-domain, and low-resource scenarios.

* 23 pages

Mixed-Precision Graph Neural Quantization for Low Bit Large Language Models

Jan 30, 2025

Wanlong Liu, Yichen Xiao, Dingyi Zeng, Hongyang Zhao, Wenyu Chen, Malu Zhang

Abstract:Post-Training Quantization (PTQ) is pivotal for deploying large language models (LLMs) within resource-limited settings by significantly reducing resource demands. However, existing PTQ strategies underperform at low bit-widths (< 3 bits) due to the significant difference between the quantized and original weights. To enhance the quantization performance at low bit-widths, we introduce a Mixed-precision Graph Neural PTQ (MG-PTQ) approach, employing a graph neural network (GNN) module to capture dependencies among weights and adaptively assign quantization bit-widths. Through the information propagation of the GNN module, our method more effectively captures dependencies among target weights, leading to a more accurate assessment of weight importance and optimized allocation of quantization strategies. Extensive experiments on the WikiText2 and C4 datasets demonstrate that our MG-PTQ method outperforms the previous state-of-the-art PTQ method, GPTQ, setting new benchmarks for quantization performance under low-bit conditions.

* ICASSP 2025

RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions

Dec 31, 2024

Wanlong Liu, Junying Chen, Ke Ji, Li Zhou, Wenyu Chen, Benyou Wang

Abstract:Retrieval-Augmented Generation (RAG) has emerged as a key paradigm for enhancing large language models (LLMs) by incorporating external knowledge. However, current RAG methods face two limitations: (1) they cover only limited RAG scenarios, and (2) they suffer from limited task diversity due to the lack of a general RAG dataset. To address these limitations, we propose RAG-Instruct, a general method for synthesizing diverse and high-quality RAG instruction data based on any source corpus. Our approach leverages (1) five RAG paradigms, which encompass diverse query-document relationships, and (2) instruction simulation, which enhances instruction diversity and quality by utilizing the strengths of existing instruction datasets. Using this method, we construct a 40K instruction dataset from Wikipedia, comprehensively covering diverse RAG scenarios and tasks. Experiments demonstrate that RAG-Instruct effectively enhances LLMs' RAG capabilities, achieving strong zero-shot performance and significantly outperforming various RAG baselines across a diverse set of tasks. RAG-Instruct is publicly available at https://github.com/FreedomIntelligence/RAG-Instruct.

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Dec 25, 2024

Junying Chen, Zhenyang Cai, Ke Ji, Xidong Wang, Wanlong Liu, Rongsheng Wang, Jianye Hou, Benyou Wang

Abstract:The breakthrough of OpenAI o1 highlights the potential of enhancing reasoning to improve LLMs. Yet, most research in reasoning has focused on mathematical tasks, leaving domains like medicine underexplored. The medical domain, though distinct from mathematics, also demands robust reasoning to provide reliable answers, given the high standards of healthcare. However, verifying medical reasoning is challenging, unlike verifying mathematical reasoning. To address this, we propose verifiable medical problems with a medical verifier to check the correctness of model outputs. This verifiable nature enables advancements in medical reasoning through a two-stage approach: (1) using the verifier to guide the search for a complex reasoning trajectory for fine-tuning LLMs, (2) applying reinforcement learning (RL) with verifier-based rewards to enhance complex reasoning further. Finally, we introduce HuatuoGPT-o1, a medical LLM capable of complex reasoning, which outperforms general and medical-specific baselines using only 40K verifiable problems. Experiments show complex reasoning improves medical problem-solving and benefits more from RL. We hope our approach inspires advancements in reasoning across medical and other specialized domains.

A Compressive Memory-based Retrieval Approach for Event Argument Extraction

Sep 14, 2024

Wanlong Liu, Enqi Zhang, Li Zhou, Dingyi Zeng, Shaohuan Cheng, Chen Zhang, Malu Zhang, Wenyu Chen

Abstract:Recent works have demonstrated the effectiveness of retrieval augmentation in the Event Argument Extraction (EAE) task. However, existing retrieval-based EAE methods have two main limitations: (1) input length constraints and (2) the gap between the retriever and the inference model. These issues limit the diversity and quality of the retrieved information. In this paper, we propose a Compressive Memory-based Retrieval (CMR) mechanism for EAE, which addresses the two limitations mentioned above. Our compressive memory, designed as a dynamic matrix that effectively caches retrieved information and supports continuous updates, overcomes the limitations of the input length. Additionally, after pre-loading all candidate demonstrations into the compressive memory, the model further retrieves and filters relevant information from memory based on the input query, bridging the gap between the retriever and the inference model. Extensive experiments show that our method achieves new state-of-the-art performance on three public datasets (RAMS, WikiEvents, ACE05), significantly outperforming existing retrieval-based EAE methods.

* 15 pages

DEGAP: Dual Event-Guided Adaptive Prefixes for Templated-Based Event Argument Extraction Model with Slot Querying

May 22, 2024

Guanghui Wang, Dexi Liu, Qizhi Wan, Xiping Liu, Wanlong Liu

Abstract:Recent advancements in event argument extraction (EAE) involve incorporating beneficial auxiliary information into models during training and inference, such as retrieved instances and event templates. Additionally, some studies introduce learnable prefix vectors to models. These methods face three challenges: (1) insufficient utilization of relevant event instances due to deficiencies in retrieval; (2) neglect of important information provided by relevant event templates; (3) limited benefit from prefixes, which cannot meet the specific informational needs of EAE. In this work, we propose DEGAP, which addresses the above challenges through two simple yet effective components: (1) dual prefixes, where the instance-oriented prefix and template-oriented prefix are trained to learn information from different event instances and templates, respectively, and then provide relevant information as cues to the EAE model without retrieval; (2) an event-guided adaptive gating mechanism, which guides the prefixes based on the target event to fully leverage their advantages. Extensive experiments demonstrate that our method achieves new state-of-the-art performance on four datasets (ACE05, RAMS, WIKIEVENTS, and MLEE). Further analysis verifies the importance of the proposed design and the effectiveness of the main components.

Beyond Single-Event Extraction: Towards Efficient Document-Level Multi-Event Argument Extraction

May 03, 2024

Wanlong Liu, Li Zhou, Dingyi Zeng, Yichen Xiao, Shaohuan Cheng, Chen Zhang, Grandee Lee, Malu Zhang, Wenyu Chen

Abstract:Recent mainstream event argument extraction methods process each event in isolation, resulting in inefficient inference and ignoring the correlations among multiple events. To address these limitations, here we propose a multiple-event argument extraction model DEEIA (Dependency-guided Encoding and Event-specific Information Aggregation), capable of extracting arguments from all events within a document simultaneously. The proposed DEEIA model employs a multi-event prompt mechanism, comprising DE and EIA modules. The DE module is designed to improve the correlation between prompts and their corresponding event contexts, whereas the EIA module provides event-specific information to improve contextual understanding. Extensive experiments show that our method achieves new state-of-the-art performance on four public datasets (RAMS, WikiEvents, MLEE, and ACE05), while significantly reducing inference time compared to the baselines. Further analyses demonstrate the effectiveness of the proposed modules.

Does Mapo Tofu Contain Coffee? Probing LLMs for Food-related Cultural Knowledge

Apr 10, 2024

Li Zhou, Taelin Karidi, Nicolas Garneau, Yong Cao, Wanlong Liu, Wenyu Chen, Daniel Hershcovich

Abstract:Recent studies have highlighted the presence of cultural biases in Large Language Models (LLMs), yet often lack a robust methodology to dissect these phenomena comprehensively. Our work aims to bridge this gap by delving into the Food domain, a universally relevant yet culturally diverse aspect of human life. We introduce FmLAMA, a multilingual dataset centered on food-related cultural facts and variations in food practices. We analyze LLMs across various architectures and configurations, evaluating their performance in both monolingual and multilingual settings. By leveraging templates in six different languages, we investigate how LLMs interact with language-specific and cultural knowledge. Our findings reveal that (1) LLMs demonstrate a pronounced bias towards food knowledge prevalent in the United States; (2) Incorporating relevant cultural context significantly improves LLMs' ability to access cultural knowledge; (3) The efficacy of LLMs in capturing cultural nuances is highly dependent on the interplay between the probing language, the specific model architecture, and the cultural context in question. This research underscores the complexity of integrating cultural understanding into LLMs and emphasizes the importance of culturally diverse datasets to mitigate biases and enhance model performance across different cultural domains.

* 20 pages, 8 figures

MLPs Compass: What is learned when MLPs are combined with PLMs?

Jan 03, 2024

Li Zhou, Wenyu Chen, Yong Cao, Dingyi Zeng, Wanlong Liu, Hong Qu

Abstract:While Transformer-based pre-trained language models (PLMs) and their variants exhibit strong semantic representation capabilities, how to quantify the information gain derived from additional components of PLMs remains an open question. Motivated by recent efforts showing that Multilayer Perceptron (MLP) modules achieve robust structural capture capabilities, even outperforming Graph Neural Networks (GNNs), this paper aims to quantify whether simple MLPs can further enhance the already potent ability of PLMs to capture linguistic information. Specifically, we design a simple yet effective probing framework containing MLP components based on the BERT structure and conduct extensive experiments encompassing 10 probing tasks spanning three distinct linguistic levels. The experimental results demonstrate that MLPs can indeed enhance the comprehension of linguistic structure by PLMs. Our research provides interpretable and valuable insights into crafting variations of PLMs utilizing MLPs for tasks that emphasize diverse linguistic structures.

* Accepted by ICASSP 2024
