Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Wen Zhang

Zhejiang University

Multi-domain Knowledge Graph Collaborative Pre-training and Prompt Tuning for Diverse Downstream Tasks

May 21, 2024

Yichi Zhang, Binbin Hu, Zhuo Chen, Lingbing Guo, Ziqi Liu, Zhiqiang Zhang, Lei Liang, Huajun Chen, Wen Zhang

Figure 1 for Multi-domain Knowledge Graph Collaborative Pre-training and Prompt Tuning for Diverse Downstream Tasks

Figure 2 for Multi-domain Knowledge Graph Collaborative Pre-training and Prompt Tuning for Diverse Downstream Tasks

Figure 3 for Multi-domain Knowledge Graph Collaborative Pre-training and Prompt Tuning for Diverse Downstream Tasks

Figure 4 for Multi-domain Knowledge Graph Collaborative Pre-training and Prompt Tuning for Diverse Downstream Tasks

Abstract:Knowledge graphs (KGs) provide reliable external knowledge for a wide variety of AI tasks in the form of structured triples. Knowledge graph pre-training (KGP) aims to pre-train neural networks on large-scale KGs and provide unified interfaces to enhance different downstream tasks, which is a key direction for KG management, maintenance, and applications. Existing works often focus on purely research questions in open domains, or they are not open source due to data security and privacy in real scenarios. Meanwhile, existing studies have not explored the training efficiency and transferability of KGP models in depth. To address these problems, We propose a framework MuDoK to achieve multi-domain collaborative pre-training and efficient prefix prompt tuning to serve diverse downstream tasks like recommendation and text understanding. Our design is a plug-and-play prompt learning approach that can be flexibly adapted to different downstream task backbones. In response to the lack of open-source benchmarks, we constructed a new multi-domain KGP benchmark called KPI with two large-scale KGs and six different sub-domain tasks to evaluate our method and open-sourced it for subsequent research. We evaluated our approach based on constructed KPI benchmarks using diverse backbone models in heterogeneous downstream tasks. The experimental results show that our framework brings significant performance gains, along with its generality, efficiency, and transferability.

* Work in progress. Code and data will be open-sourced at https://github.com/zjukg/MuDoK

Via

Access Paper or Ask Questions

Coefficient Decomposition for Spectral Graph Convolution

May 06, 2024

Feng Huang, Wen Zhang

Abstract:Spectral graph convolutional network (SGCN) is a kind of graph neural networks (GNN) based on graph signal filters, and has shown compelling expressivity for modeling graph-structured data. Most SGCNs adopt polynomial filters and learn the coefficients from the training data. Many of them focus on which polynomial basis leads to optimal expressive power and models' architecture is little discussed. In this paper, we propose a general form in terms of spectral graph convolution, where the coefficients of polynomial basis are stored in a third-order tensor. Then, we show that the convolution block in existing SGCNs can be derived by performing a certain coefficient decomposition operation on the coefficient tensor. Based on the generalized view, we develop novel spectral graph convolutions CoDeSGC-CP and -Tucker by tensor decomposition CP and Tucker on the coefficient tensor. Extensive experimental results demonstrate that the proposed convolutions achieve favorable performance improvements.

Via

Access Paper or Ask Questions

MyGO: Discrete Modality Information as Fine-Grained Tokens for Multi-modal Knowledge Graph Completion

Apr 15, 2024

Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Binbin Hu, Ziqi Liu, Huajun Chen, Wen Zhang

Figure 1 for MyGO: Discrete Modality Information as Fine-Grained Tokens for Multi-modal Knowledge Graph Completion

Figure 2 for MyGO: Discrete Modality Information as Fine-Grained Tokens for Multi-modal Knowledge Graph Completion

Figure 3 for MyGO: Discrete Modality Information as Fine-Grained Tokens for Multi-modal Knowledge Graph Completion

Figure 4 for MyGO: Discrete Modality Information as Fine-Grained Tokens for Multi-modal Knowledge Graph Completion

Abstract:Multi-modal knowledge graphs (MMKG) store structured world knowledge containing rich multi-modal descriptive information. To overcome their inherent incompleteness, multi-modal knowledge graph completion (MMKGC) aims to discover unobserved knowledge from given MMKGs, leveraging both structural information from the triples and multi-modal information of the entities. Existing MMKGC methods usually extract multi-modal features with pre-trained models and employ a fusion module to integrate multi-modal features with triple prediction. However, this often results in a coarse handling of multi-modal data, overlooking the nuanced, fine-grained semantic details and their interactions. To tackle this shortfall, we introduce a novel framework MyGO to process, fuse, and augment the fine-grained modality information from MMKGs. MyGO tokenizes multi-modal raw data as fine-grained discrete tokens and learns entity representations with a cross-modal entity encoder. To further augment the multi-modal representations, MyGO incorporates fine-grained contrastive learning to highlight the specificity of the entity representations. Experiments on standard MMKGC benchmarks reveal that our method surpasses 20 of the latest models, underlining its superior performance. Code and data are available at https://github.com/zjukg/MyGO

* Working in progress; Repo is available at https://github.com/zjukg/MyGO

Via

Access Paper or Ask Questions

The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework

Mar 20, 2024

Zhuo Chen, Yin Fang, Yichi Zhang, Lingbing Guo, Jiaoyan Chen, Huajun Chen, Wen Zhang

Figure 1 for The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework

Figure 2 for The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework

Figure 3 for The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework

Figure 4 for The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework

Abstract:The advancement of Multi-modal Pre-training highlights the necessity for a robust Multi-Modal Knowledge Graph (MMKG) representation learning framework. This framework is crucial for integrating structured knowledge into multi-modal Large Language Models (LLMs) at scale, aiming to alleviate issues like knowledge misconceptions and multi-modal hallucinations. In this work, to evaluate models' ability to accurately embed entities within MMKGs, we focus on two widely researched tasks: Multi-modal Knowledge Graph Completion (MKGC) and Multi-modal Entity Alignment (MMEA). Building on this foundation, we propose a novel SNAG method that utilizes a Transformer-based architecture equipped with modality-level noise masking for the robust integration of multi-modal entity features in KGs. By incorporating specific training objectives for both MKGC and MMEA, our approach achieves SOTA performance across a total of ten datasets (three for MKGC and seven for MEMA), demonstrating its robustness and versatility. Besides, SNAG can not only function as a standalone model but also enhance other existing methods, providing stable performance improvements. Our code and data are available at: https://github.com/zjukg/SNAG.

* Ongoing work; 10 pages, 6 Tables, 2 Figures; Repo is available at https://github.com/zjukg/SNAG

Via

Access Paper or Ask Questions

Prompt-fused framework for Inductive Logical Query Answering

Mar 19, 2024

Zezhong Xu, Peng Ye, Lei Liang, Huajun Chen, Wen Zhang

Figure 1 for Prompt-fused framework for Inductive Logical Query Answering

Figure 2 for Prompt-fused framework for Inductive Logical Query Answering

Figure 3 for Prompt-fused framework for Inductive Logical Query Answering

Figure 4 for Prompt-fused framework for Inductive Logical Query Answering

Abstract:Answering logical queries on knowledge graphs (KG) poses a significant challenge for machine reasoning. The primary obstacle in this task stems from the inherent incompleteness of KGs. Existing research has predominantly focused on addressing the issue of missing edges in KGs, thereby neglecting another aspect of incompleteness: the emergence of new entities. Furthermore, most of the existing methods tend to reason over each logical operator separately, rather than comprehensively analyzing the query as a whole during the reasoning process. In this paper, we propose a query-aware prompt-fused framework named Pro-QE, which could incorporate existing query embedding methods and address the embedding of emerging entities through contextual information aggregation. Additionally, a query prompt, which is generated by encoding the symbolic query, is introduced to gather information relevant to the query from a holistic perspective. To evaluate the efficacy of our model in the inductive setting, we introduce two new challenging benchmarks. Experimental results demonstrate that our model successfully handles the issue of unseen entities in logical queries. Furthermore, the ablation study confirms the efficacy of the aggregator and prompt components.

* Accepted by COLING 2024

Via

Access Paper or Ask Questions

InBox: Recommendation with Knowledge Graph using Interest Box Embedding

Mar 19, 2024

Zezhong Xu, Yincen Qu, Wen Zhang, Lei Liang, Huajun Chen

Figure 1 for InBox: Recommendation with Knowledge Graph using Interest Box Embedding

Figure 2 for InBox: Recommendation with Knowledge Graph using Interest Box Embedding

Figure 3 for InBox: Recommendation with Knowledge Graph using Interest Box Embedding

Figure 4 for InBox: Recommendation with Knowledge Graph using Interest Box Embedding

Abstract:Knowledge graphs (KGs) have become vitally important in modern recommender systems, effectively improving performance and interpretability. Fundamentally, recommender systems aim to identify user interests based on historical interactions and recommend suitable items. However, existing works overlook two key challenges: (1) an interest corresponds to a potentially large set of related items, and (2) the lack of explicit, fine-grained exploitation of KG information and interest connectivity. This leads to an inability to reflect distinctions between entities and interests when modeling them in a single way. Additionally, the granularity of concepts in the knowledge graphs used for recommendations tends to be coarse, failing to match the fine-grained nature of user interests. This homogenization limits the precise exploitation of knowledge graph data and interest connectivity. To address these limitations, we introduce a novel embedding-based model called InBox. Specifically, various knowledge graph entities and relations are embedded as points or boxes, while user interests are modeled as boxes encompassing interaction history. Representing interests as boxes enables containing collections of item points related to that interest. We further propose that an interest comprises diverse basic concepts, and box intersection naturally supports concept combination. Across three training steps, InBox significantly outperforms state-of-the-art methods like HAKG and KGIN on recommendation tasks. Further analysis provides meaningful insights into the variable value of different KG data for recommendations. In summary, InBox advances recommender systems through box-based interest and concept modeling for sophisticated knowledge graph exploitation.

* VLDB 2024 under submission

Via

Access Paper or Ask Questions

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Feb 26, 2024

Zhuo Chen, Yichi Zhang, Yin Fang, Yuxia Geng, Lingbing Guo, Xiang Chen, Qian Li, Wen Zhang, Jiaoyan Chen, Yushan Zhu(+5 more)

Figure 1 for Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Figure 2 for Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Figure 3 for Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Figure 4 for Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Abstract:Knowledge Graphs (KGs) play a pivotal role in advancing various AI applications, with the semantic web community's exploration into multi-modal dimensions unlocking new avenues for innovation. In this survey, we carefully review over 300 articles, focusing on KG-aware research in two principal aspects: KG-driven Multi-Modal (KG4MM) learning, where KGs support multi-modal tasks, and Multi-Modal Knowledge Graph (MM4KG), which extends KG studies into the MMKG realm. We begin by defining KGs and MMKGs, then explore their construction progress. Our review includes two primary task categories: KG-aware multi-modal learning tasks, such as Image Classification and Visual Question Answering, and intrinsic MMKG tasks like Multi-modal Knowledge Graph Completion and Entity Alignment, highlighting specific research trajectories. For most of these tasks, we provide definitions, evaluation benchmarks, and additionally outline essential insights for conducting relevant research. Finally, we discuss current challenges and identify emerging trends, such as progress in Large Language Modeling and Multi-modal Pre-training strategies. This survey aims to serve as a comprehensive reference for researchers already involved in or considering delving into KG and multi-modal learning research, offering insights into the evolving landscape of MMKG research and supporting future work.

* Ongoing work; 41 pages (Main Text), 55 pages (Total), 11 Tables, 13 Figures, 619 citations; Paper list is available at https://github.com/zjukg/KG-MM-Survey

Via

Access Paper or Ask Questions

Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion

Feb 22, 2024

Yichi Zhang, Zhuo Chen, Lei Liang, Huajun Chen, Wen Zhang

Figure 1 for Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion

Figure 2 for Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion

Figure 3 for Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion

Figure 4 for Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion

Abstract:Multi-modal knowledge graph completion (MMKGC) aims to predict the missing triples in the multi-modal knowledge graphs by incorporating structural, visual, and textual information of entities into the discriminant models. The information from different modalities will work together to measure the triple plausibility. Existing MMKGC methods overlook the imbalance problem of modality information among entities, resulting in inadequate modal fusion and inefficient utilization of the raw modality information. To address the mentioned problems, we propose Adaptive Multi-modal Fusion and Modality Adversarial Training (AdaMF-MAT) to unleash the power of imbalanced modality information for MMKGC. AdaMF-MAT achieves multi-modal fusion with adaptive modality weights and further generates adversarial samples by modality-adversarial training to enhance the imbalanced modality information. Our approach is a co-design of the MMKGC model and training strategy which can outperform 19 recent MMKGC methods and achieve new state-of-the-art results on three public MMKGC benchmarks. Our code and data have been released at https://github.com/zjukg/AdaMF-MAT.

* Accepted by LREC-COLING 2024

Via

Access Paper or Ask Questions

Know Your Needs Better: Towards Structured Understanding of Marketer Demands with Analogical Reasoning Augmented LLMs

Jan 09, 2024

Junjie Wang, Dan Yang, Binbin Hu, Yue Shen, Ziqi Liu, Wen Zhang, Jinjie Gu, Zhiqiang Zhang

Abstract:In this paper, we explore a new way for user targeting, where non-expert marketers could select their target users solely given demands in natural language form. The key to this issue is how to transform natural languages into practical structured logical languages, i.e., the structured understanding of marketer demands. Considering the impressive natural language processing ability of large language models (LLMs), we try to leverage LLMs to solve this issue. Past research indicates that the reasoning ability of LLMs can be effectively enhanced through chain-of-thought (CoT) prompting. But existing methods still have some limitations: (1) Previous methods either use simple "Let's think step by step" spells or provide fixed examples in demonstrations without considering compatibility between prompts and questions, making LLMs ineffective in some complex reasoning tasks such as structured language transformation. (2) Previous methods are often implemented in closed-source models or excessively large models, which is not suitable in industrial practical scenarios. Based on these, we propose ARALLM (i.e., Analogical Reasoning Augmented Large Language Models) consisting of two modules: Analogical Reasoning based Prompting and Reasoning-Augmented Multi-Task Model Distillation.

* Under review

Via

Access Paper or Ask Questions

A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide Generation

Jan 04, 2024

Yongkang Wang, Xuan Liu, Feng Huang, Zhankun Xiong, Wen Zhang

Abstract:Therapeutic peptides represent a unique class of pharmaceutical agents crucial for the treatment of human diseases. Recently, deep generative models have exhibited remarkable potential for generating therapeutic peptides, but they only utilize sequence or structure information alone, which hinders the performance in generation. In this study, we propose a Multi-Modal Contrastive Diffusion model (MMCD), fusing both sequence and structure modalities in a diffusion framework to co-generate novel peptide sequences and structures. Specifically, MMCD constructs the sequence-modal and structure-modal diffusion models, respectively, and devises a multi-modal contrastive learning strategy with intercontrastive and intra-contrastive in each diffusion timestep, aiming to capture the consistency between two modalities and boost model performance. The inter-contrastive aligns sequences and structures of peptides by maximizing the agreement of their embeddings, while the intra-contrastive differentiates therapeutic and non-therapeutic peptides by maximizing the disagreement of their sequence/structure embeddings simultaneously. The extensive experiments demonstrate that MMCD performs better than other state-of-theart deep generative methods in generating therapeutic peptides across various metrics, including antimicrobial/anticancer score, diversity, and peptide-docking.

* This paper is accepted by AAAI 2024

Via

Access Paper or Ask Questions