Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Wenjun Ke

On the Salience of Low-Probability Tokens for AI-Generated Text Detection: A Multiscale Uncertainty Perspective

Jun 01, 2026

Yikai Guo, Bin Wang, Xilai Fan, Wenjun Ke, Haoran Luo

Abstract:AI-generated text increasingly blends with human writing, raising practical risks such as misinformation, academic misuse, and corpora contamination. While statistical detectors are appealing for efficiency and generalization, they suffer from two key limitations. (i) Boilerplate dominance, boilerplate tokens shared across human and LLM writing can overwhelm discriminative signals. (ii) Brittle point estimates, relying on a single probability score yields unstable decisions under adversarial manipulations. To address these issues, we propose Uncertainty, a multiscale uncertainty estimator that focuses on informative low-probability tokens, which more clearly expose distributional discrepancies. Locally, it alleviates boilerplate dominance by averaging the log-probabilities of low-probability tokens; globally, it reduces brittleness by capturing the distributional shape of this low-probability region via Rényi entropy. We further extend the detector to Uncertainty++ via conditional independent sampling, yielding a more stable uncertainty estimation. Experiments across seven datasets and sixteen LLMs demonstrate high effectiveness, generalization, and robustness. Our code is available at https://github.com/guoyikai2000/Uncertainty-AIGT.

* Accepted by ICML 2026 main conference

Via

Access Paper or Ask Questions

Unlearning of Knowledge Graph Embedding via Preference Optimization

Jul 28, 2025

Jiajun Liu, Wenjun Ke, Peng Wang, Yao He, Ziyu Shang, Guozheng Li, Zijie Xu, Ke Ji

Figure 1 for Unlearning of Knowledge Graph Embedding via Preference Optimization

Figure 2 for Unlearning of Knowledge Graph Embedding via Preference Optimization

Figure 3 for Unlearning of Knowledge Graph Embedding via Preference Optimization

Figure 4 for Unlearning of Knowledge Graph Embedding via Preference Optimization

Abstract:Existing knowledge graphs (KGs) inevitably contain outdated or erroneous knowledge that needs to be removed from knowledge graph embedding (KGE) models. To address this challenge, knowledge unlearning can be applied to eliminate specific information while preserving the integrity of the remaining knowledge in KGs. Existing unlearning methods can generally be categorized into exact unlearning and approximate unlearning. However, exact unlearning requires high training costs while approximate unlearning faces two issues when applied to KGs due to the inherent connectivity of triples: (1) It fails to fully remove targeted information, as forgetting triples can still be inferred from remaining ones. (2) It focuses on local data for specific removal, which weakens the remaining knowledge in the forgetting boundary. To address these issues, we propose GraphDPO, a novel approximate unlearning framework based on direct preference optimization (DPO). Firstly, to effectively remove forgetting triples, we reframe unlearning as a preference optimization problem, where the model is trained by DPO to prefer reconstructed alternatives over the original forgetting triples. This formulation penalizes reliance on forgettable knowledge, mitigating incomplete forgetting caused by KG connectivity. Moreover, we introduce an out-boundary sampling strategy to construct preference pairs with minimal semantic overlap, weakening the connection between forgetting and retained knowledge. Secondly, to preserve boundary knowledge, we introduce a boundary recall mechanism that replays and distills relevant information both within and across time steps. We construct eight unlearning datasets across four popular KGs with varying unlearning rates. Experiments show that GraphDPO outperforms state-of-the-art baselines by up to 10.1% in MRR_Avg and 14.0% in MRR_F1.

Via

Access Paper or Ask Questions

Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification

Jul 12, 2024

Ke Ji, Peng Wang, Wenjun Ke, Guozheng Li, Jiajun Liu, Jingsheng Gao, Ziyu Shang

Figure 1 for Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification

Figure 2 for Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification

Figure 3 for Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification

Figure 4 for Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification

Abstract:Recently, various pre-trained language models (PLMs) have been proposed to prove their impressive performances on a wide range of few-shot tasks. However, limited by the unstructured prior knowledge in PLMs, it is difficult to maintain consistent performance on complex structured scenarios, such as hierarchical text classification (HTC), especially when the downstream data is extremely scarce. The main challenge is how to transfer the unstructured semantic space in PLMs to the downstream domain hierarchy. Unlike previous work on HTC which directly performs multi-label classification or uses graph neural network (GNN) to inject label hierarchy, in this work, we study the HTC problem under a few-shot setting to adapt knowledge in PLMs from an unstructured manner to the downstream hierarchy. Technically, we design a simple yet effective method named Hierarchical Iterative Conditional Random Field (HierICRF) to search the most domain-challenging directions and exquisitely crafts domain-hierarchy adaptation as a hierarchical iterative language modeling problem, and then it encourages the model to make hierarchical consistency self-correction during the inference, thereby achieving knowledge transfer with hierarchical consistency preservation. We perform HierICRF on various architectures, and extensive experiments on two popular HTC datasets demonstrate that prompt with HierICRF significantly boosts the few-shot HTC performance with an average Micro-F1 by 28.80% to 1.50% and Macro-F1 by 36.29% to 1.5% over the previous state-of-the-art (SOTA) baselines under few-shot settings, while remaining SOTA hierarchical consistency performance.

* 9 pages, 2 figures, Accepted by IJCAI2024

Via

Access Paper or Ask Questions

Fast and Continual Knowledge Graph Embedding via Incremental LoRA

Jul 08, 2024

Jiajun Liu, Wenjun Ke, Peng Wang, Jiahao Wang, Jinhua Gao, Ziyu Shang, Guozheng Li, Zijie Xu, Ke Ji, Yining Li

Figure 1 for Fast and Continual Knowledge Graph Embedding via Incremental LoRA

Figure 2 for Fast and Continual Knowledge Graph Embedding via Incremental LoRA

Figure 3 for Fast and Continual Knowledge Graph Embedding via Incremental LoRA

Figure 4 for Fast and Continual Knowledge Graph Embedding via Incremental LoRA

Abstract:Continual Knowledge Graph Embedding (CKGE) aims to efficiently learn new knowledge and simultaneously preserve old knowledge. Dominant approaches primarily focus on alleviating catastrophic forgetting of old knowledge but neglect efficient learning for the emergence of new knowledge. However, in real-world scenarios, knowledge graphs (KGs) are continuously growing, which brings a significant challenge to fine-tuning KGE models efficiently. To address this issue, we propose a fast CKGE framework (\model), incorporating an incremental low-rank adapter (\mec) mechanism to efficiently acquire new knowledge while preserving old knowledge. Specifically, to mitigate catastrophic forgetting, \model\ isolates and allocates new knowledge to specific layers based on the fine-grained influence between old and new KGs. Subsequently, to accelerate fine-tuning, \model\ devises an efficient \mec\ mechanism, which embeds the specific layers into incremental low-rank adapters with fewer training parameters. Moreover, \mec\ introduces adaptive rank allocation, which makes the LoRA aware of the importance of entities and adjusts its rank scale adaptively. We conduct experiments on four public datasets and two new datasets with a larger initial scale. Experimental results demonstrate that \model\ can reduce training time by 34\%-49\% while still achieving competitive link prediction performance against state-of-the-art models on four public datasets (average MRR score of 21.0\% vs. 21.1\%).Meanwhile, on two newly constructed datasets, \model\ saves 51\%-68\% training time and improves link prediction performance by 1.5\%.

* Accepted by IJCAI2024

Via

Access Paper or Ask Questions

Towards Continual Knowledge Graph Embedding via Incremental Distillation

May 07, 2024

Jiajun Liu, Wenjun Ke, Peng Wang, Ziyu Shang, Jinhua Gao, Guozheng Li, Ke Ji, Yanhe Liu

Figure 1 for Towards Continual Knowledge Graph Embedding via Incremental Distillation

Figure 2 for Towards Continual Knowledge Graph Embedding via Incremental Distillation

Figure 3 for Towards Continual Knowledge Graph Embedding via Incremental Distillation

Figure 4 for Towards Continual Knowledge Graph Embedding via Incremental Distillation

Abstract:Traditional knowledge graph embedding (KGE) methods typically require preserving the entire knowledge graph (KG) with significant training costs when new knowledge emerges. To address this issue, the continual knowledge graph embedding (CKGE) task has been proposed to train the KGE model by learning emerging knowledge efficiently while simultaneously preserving decent old knowledge. However, the explicit graph structure in KGs, which is critical for the above goal, has been heavily ignored by existing CKGE methods. On the one hand, existing methods usually learn new triples in a random order, destroying the inner structure of new KGs. On the other hand, old triples are preserved with equal priority, failing to alleviate catastrophic forgetting effectively. In this paper, we propose a competitive method for CKGE based on incremental distillation (IncDE), which considers the full use of the explicit graph structure in KGs. First, to optimize the learning order, we introduce a hierarchical strategy, ranking new triples for layer-by-layer learning. By employing the inter- and intra-hierarchical orders together, new triples are grouped into layers based on the graph structure features. Secondly, to preserve the old knowledge effectively, we devise a novel incremental distillation mechanism, which facilitates the seamless transfer of entity representations from the previous layer to the next one, promoting old knowledge preservation. Finally, we adopt a two-stage training paradigm to avoid the over-corruption of old knowledge influenced by under-trained new knowledge. Experimental results demonstrate the superiority of IncDE over state-of-the-art baselines. Notably, the incremental distillation mechanism contributes to improvements of 0.2%-6.5% in the mean reciprocal rank (MRR) score.

* Accepted by AAAI 2024

Via

Access Paper or Ask Questions

Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction

Apr 27, 2024

Guozheng Li, Peng Wang, Wenjun Ke, Yikai Guo, Ke Ji, Ziyu Shang, Jiajun Liu, Zijie Xu

Figure 1 for Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction

Figure 2 for Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction

Figure 3 for Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction

Figure 4 for Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction

Abstract:Relation extraction (RE) aims to identify relations between entities mentioned in texts. Although large language models (LLMs) have demonstrated impressive in-context learning (ICL) abilities in various tasks, they still suffer from poor performances compared to most supervised fine-tuned RE methods. Utilizing ICL for RE with LLMs encounters two challenges: (1) retrieving good demonstrations from training examples, and (2) enabling LLMs exhibit strong ICL abilities in RE. On the one hand, retrieving good demonstrations is a non-trivial process in RE, which easily results in low relevance regarding entities and relations. On the other hand, ICL with an LLM achieves poor performance in RE while RE is different from language modeling in nature or the LLM is not large enough. In this work, we propose a novel recall-retrieve-reason RE framework that synergizes LLMs with retrieval corpora (training examples) to enable relevant retrieving and reliable in-context reasoning. Specifically, we distill the consistently ontological knowledge from training datasets to let LLMs generate relevant entity pairs grounded by retrieval corpora as valid queries. These entity pairs are then used to retrieve relevant training examples from the retrieval corpora as demonstrations for LLMs to conduct better ICL via instruction tuning. Extensive experiments on different LLMs and RE datasets demonstrate that our method generates relevant and valid entity pairs and boosts ICL abilities of LLMs, achieving competitive or new state-of-the-art performance on sentence-level RE compared to previous supervised fine-tuning methods and ICL-based methods.

* IJCAI 2024

Via

Access Paper or Ask Questions

CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News

Apr 18, 2024

Mengna Zhu, Zijie Xu, Kaisheng Zeng, Kaiming Xiao, Mao Wang, Wenjun Ke, Hongbin Huang

Figure 1 for CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News

Figure 2 for CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News

Figure 3 for CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News

Figure 4 for CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News

Abstract:Extracting structured event knowledge, including event triggers and corresponding arguments, from military texts is fundamental to many applications, such as intelligence analysis and decision assistance. However, event extraction in the military field faces the data scarcity problem, which impedes the research of event extraction models in this domain. To alleviate this problem, we propose CMNEE, a large-scale, document-level open-source Chinese Military News Event Extraction dataset. It contains 17,000 documents and 29,223 events, which are all manually annotated based on a pre-defined schema for the military domain including 8 event types and 11 argument role types. We designed a two-stage, multi-turns annotation strategy to ensure the quality of CMNEE and reproduced several state-of-the-art event extraction models with a systematic evaluation. The experimental results on CMNEE fall shorter than those on other domain datasets obviously, which demonstrates that event extraction for military domain poses unique challenges and requires further research efforts. Our code and data can be obtained from https://github.com/Mzzzhu/CMNEE.

* 13 pages, 7 figures, accepted to LREC-COLING 2024

Via

Access Paper or Ask Questions

Attention Guidance Mechanism for Handwritten Mathematical Expression Recognition

Mar 05, 2024

Yutian Liu, Wenjun Ke, Jianguo Wei

Abstract:Handwritten mathematical expression recognition (HMER) is challenging in image-to-text tasks due to the complex layouts of mathematical expressions and suffers from problems including over-parsing and under-parsing. To solve these, previous HMER methods improve the attention mechanism by utilizing historical alignment information. However, this approach has limitations in addressing under-parsing since it cannot correct the erroneous attention on image areas that should be parsed at subsequent decoding steps. This faulty attention causes the attention module to incorporate future context into the current decoding step, thereby confusing the alignment process. To address this issue, we propose an attention guidance mechanism to explicitly suppress attention weights in irrelevant areas and enhance the appropriate ones, thereby inhibiting access to information outside the intended context. Depending on the type of attention guidance, we devise two complementary approaches to refine attention weights: self-guidance that coordinates attention of multiple heads and neighbor-guidance that integrates attention from adjacent time steps. Experiments show that our method outperforms existing state-of-the-art methods, achieving expression recognition rates of 60.75% / 61.81% / 63.30% on the CROHME 2014/ 2016/ 2019 datasets.

Via

Access Paper or Ask Questions

Unlocking Instructive In-Context Learning with Tabular Prompting for Relational Triple Extraction

Feb 21, 2024

Guozheng Li, Wenjun Ke, Peng Wang, Zijie Xu, Ke Ji, Jiajun Liu, Ziyu Shang, Qiqing Luo

Figure 1 for Unlocking Instructive In-Context Learning with Tabular Prompting for Relational Triple Extraction

Figure 2 for Unlocking Instructive In-Context Learning with Tabular Prompting for Relational Triple Extraction

Figure 3 for Unlocking Instructive In-Context Learning with Tabular Prompting for Relational Triple Extraction

Figure 4 for Unlocking Instructive In-Context Learning with Tabular Prompting for Relational Triple Extraction

Abstract:The in-context learning (ICL) for relational triple extraction (RTE) has achieved promising performance, but still encounters two key challenges: (1) how to design effective prompts and (2) how to select proper demonstrations. Existing methods, however, fail to address these challenges appropriately. On the one hand, they usually recast RTE task to text-to-text prompting formats, which is unnatural and results in a mismatch between the output format at the pre-training time and the inference time for large language models (LLMs). On the other hand, they only utilize surface natural language features and lack consideration of triple semantics in sample selection. These issues are blocking improved performance in ICL for RTE, thus we aim to tackle prompt designing and sample selection challenges simultaneously. To this end, we devise a tabular prompting for RTE (\textsc{TableIE}) which frames RTE task into a table generation task to incorporate explicit structured information into ICL, facilitating conversion of outputs to RTE structures. Then we propose instructive in-context learning (I$^2$CL) which only selects and annotates a few samples considering internal triple semantics in massive unlabeled samples.

* LREC-COLING 2024

Via

Access Paper or Ask Questions

Revisiting Large Language Models as Zero-shot Relation Extractors

Oct 11, 2023

Guozheng Li, Peng Wang, Wenjun Ke

Figure 1 for Revisiting Large Language Models as Zero-shot Relation Extractors

Figure 2 for Revisiting Large Language Models as Zero-shot Relation Extractors

Figure 3 for Revisiting Large Language Models as Zero-shot Relation Extractors

Figure 4 for Revisiting Large Language Models as Zero-shot Relation Extractors

Abstract:Relation extraction (RE) consistently involves a certain degree of labeled or unlabeled data even if under zero-shot setting. Recent studies have shown that large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt, which provides the possibility of extracting relations from text without any data and parameter tuning. This work focuses on the study of exploring LLMs, such as ChatGPT, as zero-shot relation extractors. On the one hand, we analyze the drawbacks of existing RE prompts and attempt to incorporate recent prompt techniques such as chain-of-thought (CoT) to improve zero-shot RE. We propose the summarize-and-ask (\textsc{SumAsk}) prompting, a simple prompt recursively using LLMs to transform RE inputs to the effective question answering (QA) format. On the other hand, we conduct comprehensive experiments on various benchmarks and settings to investigate the capabilities of LLMs on zero-shot RE. Specifically, we have the following findings: (i) \textsc{SumAsk} consistently and significantly improves LLMs performance on different model sizes, benchmarks and settings; (ii) Zero-shot prompting with ChatGPT achieves competitive or superior results compared with zero-shot and fully supervised methods; (iii) LLMs deliver promising performance in extracting overlapping relations; (iv) The performance varies greatly regarding different relations. Different from small language models, LLMs are effective in handling challenge none-of-the-above (NoTA) relation.

* Findings of EMNLP 2023

Via

Access Paper or Ask Questions