Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ningyu Zhang

AnyEdit: Edit Any Knowledge Encoded in Language Models

Feb 08, 2025

Houcheng Jiang, Junfeng Fang, Ningyu Zhang, Guojun Ma, Mingyang Wan, Xiang Wang, Xiangnan He, Tat-seng Chua

Figure 1 for AnyEdit: Edit Any Knowledge Encoded in Language Models

Figure 2 for AnyEdit: Edit Any Knowledge Encoded in Language Models

Figure 3 for AnyEdit: Edit Any Knowledge Encoded in Language Models

Figure 4 for AnyEdit: Edit Any Knowledge Encoded in Language Models

Abstract:Large language models (LLMs) often produce incorrect or outdated information, necessitating efficient and precise knowledge updates. Current model editing methods, however, struggle with long-form knowledge in diverse formats, such as poetry, code snippets, and mathematical derivations. These limitations arise from their reliance on editing a single token's hidden state, a limitation we term "efficacy barrier". To solve this, we propose AnyEdit, a new autoregressive editing paradigm. It decomposes long-form knowledge into sequential chunks and iteratively edits the key token in each chunk, ensuring consistent and accurate outputs. Theoretically, we ground AnyEdit in the Chain Rule of Mutual Information, showing its ability to update any knowledge within LLMs. Empirically, it outperforms strong baselines by 21.5% on benchmarks including UnKEBench, AKEW, and our new EditEverything dataset for long-form diverse-formatted knowledge. Additionally, AnyEdit serves as a plug-and-play framework, enabling current editing methods to update knowledge with arbitrary length and format, significantly advancing the scope and practicality of LLM knowledge editing.

Via

Access Paper or Ask Questions

OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking

Jan 16, 2025

Zekun Xi, Wenbiao Yin, Jizhan Fang, Jialong Wu, Runnan Fang, Ningyu Zhang, Jiang Yong, Pengjun Xie, Fei Huang, Huajun Chen

Abstract:Machine writing with large language models often relies on retrieval-augmented generation. However, these approaches remain confined within the boundaries of the model's predefined scope, limiting the generation of content with rich information. Specifically, vanilla-retrieved information tends to lack depth, utility, and suffers from redundancy, which negatively impacts the quality of generated articles, leading to shallow, repetitive, and unoriginal outputs. To address these issues, we propose OmniThink, a machine writing framework that emulates the human-like process of iterative expansion and reflection. The core idea behind OmniThink is to simulate the cognitive behavior of learners as they progressively deepen their knowledge of the topics. Experimental results demonstrate that OmniThink improves the knowledge density of generated articles without compromising metrics such as coherence and depth. Human evaluations and expert feedback further highlight the potential of OmniThink to address real-world challenges in the generation of long-form articles.

Via

Access Paper or Ask Questions

A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following

Jan 15, 2025

Yin Fang, Xinle Deng, Kangwei Liu, Ningyu Zhang, Jingyang Qian, Penghui Yang, Xiaohui Fan, Huajun Chen

Figure 1 for A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following

Figure 2 for A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following

Figure 3 for A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following

Figure 4 for A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following

Abstract:Large language models excel at interpreting complex natural language instructions, enabling them to perform a wide range of tasks. In the life sciences, single-cell RNA sequencing (scRNA-seq) data serves as the "language of cellular biology", capturing intricate gene expression patterns at the single-cell level. However, interacting with this "language" through conventional tools is often inefficient and unintuitive, posing challenges for researchers. To address these limitations, we present InstructCell, a multi-modal AI copilot that leverages natural language as a medium for more direct and flexible single-cell analysis. We construct a comprehensive multi-modal instruction dataset that pairs text-based instructions with scRNA-seq profiles from diverse tissues and species. Building on this, we develop a multi-modal cell language architecture capable of simultaneously interpreting and processing both modalities. InstructCell empowers researchers to accomplish critical tasks-such as cell type annotation, conditional pseudo-cell generation, and drug sensitivity prediction-using straightforward natural language commands. Extensive evaluations demonstrate that InstructCell consistently meets or exceeds the performance of existing single-cell foundation models, while adapting to diverse experimental conditions. More importantly, InstructCell provides an accessible and intuitive tool for exploring complex single-cell data, lowering technical barriers and enabling deeper biological insights.

* 37 pages; 13 figures; Code: https://github.com/zjunlp/Instructcell, Models: https://huggingface.co/zjunlp/Instructcell-chat, https://huggingface.co/zjunlp/InstructCell-instruct

Via

Access Paper or Ask Questions

OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Dec 28, 2024

Yujie Luo, Xiangyuan Ru, Kangwei Liu, Lin Yuan, Mengshu Sun, Ningyu Zhang, Lei Liang, Zhiqiang Zhang, Jun Zhou, Lanning Wei(+3 more)

Figure 1 for OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Figure 2 for OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Figure 3 for OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Abstract:We introduce OneKE, a dockerized schema-guided knowledge extraction system, which can extract knowledge from the Web and raw PDF Books, and support various domains (science, news, etc.). Specifically, we design OneKE with multiple agents and a configure knowledge base. Different agents perform their respective roles, enabling support for various extraction scenarios. The configure knowledge base facilitates schema configuration, error case debugging and correction, further improving the performance. Empirical evaluations on benchmark datasets demonstrate OneKE's efficacy, while case studies further elucidate its adaptability to diverse tasks across multiple domains, highlighting its potential for broad applications. We have open-sourced the Code at https://github.com/zjunlp/OneKE and released a Video at http://oneke.openkg.cn/demo.mp4.

* Work in progress

Via

Access Paper or Ask Questions

Exploring Model Kinship for Merging Large Language Models

Oct 16, 2024

Yedi Hu, Yunzhi Yao, Ningyu Zhang, Shumin Deng, Huajun Chen

Figure 1 for Exploring Model Kinship for Merging Large Language Models

Figure 2 for Exploring Model Kinship for Merging Large Language Models

Figure 3 for Exploring Model Kinship for Merging Large Language Models

Figure 4 for Exploring Model Kinship for Merging Large Language Models

Abstract:Model merging has become one of the key technologies for enhancing the capabilities and efficiency of Large Language Models (LLMs). However, our understanding of the expected performance gains and principles when merging any two models remains limited. In this work, we introduce model kinship, the degree of similarity or relatedness between LLMs, analogous to biological evolution. With comprehensive empirical analysis, we find that there is a certain relationship between model kinship and the performance gains after model merging, which can help guide our selection of candidate models. Inspired by this, we propose a new model merging strategy: Top-k Greedy Merging with Model Kinship, which can yield better performance on benchmark datasets. Specifically, we discover that using model kinship as a criterion can assist us in continuously performing model merging, alleviating the degradation (local optima) in model evolution, whereas model kinship can serve as a guide to escape these traps. Code is available at https://github.com/zjunlp/ModelKinship.

* Ongoing work

Via

Access Paper or Ask Questions

MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

Oct 15, 2024

Chenxi Wang, Xiang Chen, Ningyu Zhang, Bozhong Tian, Haoming Xu, Shumin Deng, Huajun Chen

Figure 1 for MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

Figure 2 for MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

Figure 3 for MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

Figure 4 for MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

Abstract:Multimodal Large Language Models (MLLMs) frequently exhibit hallucination phenomena, but the underlying reasons remain poorly understood. In this paper, we present an empirical analysis and find that, although MLLMs incorrectly generate the objects in the final output, they are actually able to recognize visual objects in the preceding layers. We speculate that this may be due to the strong knowledge priors of the language model suppressing the visual information, leading to hallucinations. Motivated by this, we propose a novel dynamic correction decoding method for MLLMs (DeCo), which adaptively selects the appropriate preceding layers and proportionally integrates knowledge into the final layer to adjust the output logits. Note that DeCo is model agnostic and can be seamlessly incorporated with various classic decoding strategies and applied to different MLLMs. We evaluate DeCo on widely-used benchmarks, demonstrating that it can reduce hallucination rates by a large margin compared to baselines, highlighting its potential to mitigate hallucinations. Code is available at https://github.com/zjunlp/DeCo.

* Ongoing work

Via

Access Paper or Ask Questions

Locking Down the Finetuned LLMs Safety

Oct 14, 2024

Minjun Zhu, Linyi Yang, Yifan Wei, Ningyu Zhang, Yue Zhang

Figure 1 for Locking Down the Finetuned LLMs Safety

Figure 2 for Locking Down the Finetuned LLMs Safety

Figure 3 for Locking Down the Finetuned LLMs Safety

Figure 4 for Locking Down the Finetuned LLMs Safety

Abstract:Fine-tuning large language models (LLMs) on additional datasets is often necessary to optimize them for specific downstream tasks. However, existing safety alignment measures, which restrict harmful behavior during inference, are insufficient to mitigate safety risks during fine-tuning. Alarmingly, fine-tuning with just 10 toxic sentences can make models comply with harmful instructions. We introduce SafetyLock, a novel alignment intervention method that maintains robust safety post-fine-tuning through efficient and transferable mechanisms. SafetyLock leverages our discovery that fine-tuned models retain similar safety-related activation representations to their base models. This insight enables us to extract what we term the Meta-SafetyLock, a set of safety bias directions representing key activation patterns associated with safe responses in the original model. We can then apply these directions universally to fine-tuned models to enhance their safety. By searching for activation directions across multiple token dimensions, SafetyLock achieves enhanced robustness and transferability. SafetyLock re-aligns fine-tuned models in under 0.01 seconds without additional computational cost. Our experiments demonstrate that SafetyLock can reduce the harmful instruction response rate from 60% to below 1% in toxic fine-tuned models. It surpasses traditional methods in both performance and efficiency, offering a scalable, non-invasive solution for ensuring the safety of customized LLMs. Our analysis across various fine-tuning scenarios confirms SafetyLock's robustness, advocating its integration into safety protocols for aligned LLMs. The code is released at https://github.com/zhu-minjun/SafetyLock.

Via

Access Paper or Ask Questions

Benchmarking Agentic Workflow Generation

Oct 10, 2024

Shuofei Qiao, Runnan Fang, Zhisong Qiu, Xiaobin Wang, Ningyu Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

Figure 1 for Benchmarking Agentic Workflow Generation

Figure 2 for Benchmarking Agentic Workflow Generation

Figure 3 for Benchmarking Agentic Workflow Generation

Figure 4 for Benchmarking Agentic Workflow Generation

Abstract:Large Language Models (LLMs), with their exceptional ability to handle a wide range of tasks, have driven significant advancements in tackling reasoning and planning tasks, wherein decomposing complex problems into executable workflows is a crucial step in this process. Existing workflow evaluation frameworks either focus solely on holistic performance or suffer from limitations such as restricted scenario coverage, simplistic workflow structures, and lax evaluation standards. To this end, we introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures. Additionally, we present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms to accurately quantify the LLM agent's workflow generation capabilities. Through comprehensive evaluations across different types of LLMs, we discover distinct gaps between the sequence planning capabilities and graph planning capabilities of LLM agents, with even GPT-4 exhibiting a gap of around 15%. We also train two open-source models and evaluate their generalization abilities on held-out tasks. Furthermore, we observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference. Code and dataset will be available at https://github.com/zjunlp/WorFBench.

* Work in progress

Via

Access Paper or Ask Questions

Benchmarking Chinese Knowledge Rectification in Large Language Models

Sep 09, 2024

Tianhe Lu, Jizhan Fang, Yunzhi Yao, Xin Xu, Ningyu Zhang, Huajun Chen

Figure 1 for Benchmarking Chinese Knowledge Rectification in Large Language Models

Figure 2 for Benchmarking Chinese Knowledge Rectification in Large Language Models

Figure 3 for Benchmarking Chinese Knowledge Rectification in Large Language Models

Figure 4 for Benchmarking Chinese Knowledge Rectification in Large Language Models

Abstract:While Large Language Models (LLMs) exhibit remarkable generative capabilities, they are not without flaws, particularly in the form of hallucinations. This issue is even more pronounced when LLMs are applied to specific languages and domains. For example, LLMs may generate nonsense information when handling Chinese ancient poetry, proverbs, or idioms, owing to the lack of specific knowledge. To this end, this paper introduces a benchmark for rectifying Chinese knowledge in LLMs via knowledge editing. Specifically, we introduce a new Chinese dataset, CKnowEdit, by collecting seven type of knowledge from various sources, including classical texts, idioms, and content from Baidu Tieba Ruozhiba, thereby accounting for the unique polyphony, antithesis, and logical constructs inherent in the Chinese language. Through the analysis of this dataset, we uncover the challenges faced by current LLMs in mastering Chinese. Furthermore, our evaluation of state-of-the-art knowledge editing techniques on this dataset unveil the substantial scope for advancement in the rectification of Chinese knowledge. Code and dataset are available at https://github.com/zjunlp/EasyEdit.

* Ongoing work; code and dataset are available at https://github.com/zjunlp/EasyEdit

Via

Access Paper or Ask Questions

OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System

Sep 09, 2024

Ningyu Zhang, Zekun Xi, Yujie Luo, Peng Wang, Bozhong Tian, Yunzhi Yao, Jintian Zhang, Shumin Deng, Mengshu Sun, Lei Liang(+4 more)

Figure 1 for OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System

Figure 2 for OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System

Figure 3 for OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System

Figure 4 for OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System

Abstract:Knowledge representation has been a central aim of AI since its inception. Symbolic Knowledge Graphs (KGs) and neural Large Language Models (LLMs) can both represent knowledge. KGs provide highly accurate and explicit knowledge representation, but face scalability issue; while LLMs offer expansive coverage of knowledge, but incur significant training costs and struggle with precise and reliable knowledge manipulation. To this end, we introduce OneEdit, a neural-symbolic prototype system for collaborative knowledge editing using natural language, which facilitates easy-to-use knowledge management with KG and LLM. OneEdit consists of three modules: 1) The Interpreter serves for user interaction with natural language; 2) The Controller manages editing requests from various users, leveraging the KG with rollbacks to handle knowledge conflicts and prevent toxic knowledge attacks; 3) The Editor utilizes the knowledge from the Controller to edit KG and LLM. We conduct experiments on two new datasets with KGs which demonstrate that OneEdit can achieve superior performance.

* LLM+KG@VLDB2024, code is available at https://github.com/zjunlp/OneEdit

Via

Access Paper or Ask Questions