Qing Wang

Bridge: A Unified Framework to Knowledge Graph Completion via Language Models and Knowledge Representation

Nov 11, 2024

Qiao Qiao, Yuepei Li, Qing Wang, Kang Zhou, Qi Li

Abstract:Knowledge graph completion (KGC) is the task of inferring missing triples based on existing Knowledge Graphs (KGs). Both structural and semantic information are vital for successful KGC. However, existing methods use either the structural knowledge from KG embeddings or the semantic information from pre-trained language models (PLMs), but not both, leading to suboptimal model performance. Moreover, since PLMs are not trained on KGs, directly using PLMs to encode triples may be inappropriate. To overcome these limitations, we propose a novel framework called Bridge, which jointly encodes structural and semantic information of KGs. Specifically, we strategically encode entities and relations separately by PLMs to better utilize the semantic knowledge of PLMs and enable structured representation learning via a structural learning principle. Furthermore, to bridge the gap between KGs and PLMs, we employ a self-supervised representation learning method called BYOL to fine-tune PLMs with two different views of a triple. Unlike BYOL, which uses augmentation methods to create two semantically similar views of the same image and may thereby alter the semantic information, we strategically separate the triple into two parts to create different views, thus avoiding semantic alteration. Experiments demonstrate that Bridge outperforms the SOTA models on three benchmark datasets.

STEM-POM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing

Nov 01, 2024

Jiaru Zou, Qing Wang, Pratyush Thakur, Nickvash Kani

Abstract:Advances in large language models (LLMs) have spurred research into enhancing their reasoning capabilities, particularly in math-rich STEM documents. While LLMs can generate equations or solve math-related queries, their ability to fully understand and interpret abstract mathematical symbols in long, math-rich documents remains limited. In this paper, we introduce STEM-PoM, a comprehensive benchmark dataset designed to evaluate LLMs' reasoning abilities on math symbols within contextual scientific text. The dataset, sourced from real-world ArXiv documents, contains over 2K math symbols classified as main attributes of variables, constants, operators, and unit descriptors, with additional sub-attributes including scalar/vector/matrix for variables and local/global/discipline-specific labels for both constants and operators. Our extensive experiments show that state-of-the-art LLMs achieve an average of 20-60% accuracy under in-context learning and 50-60% accuracy with fine-tuning, revealing a significant gap in their mathematical reasoning capabilities. STEM-PoM fuels future research of developing advanced Math-AI models that can robustly handle math symbols.

* Accepted to NeurIPS Math-AI 2024

CodePurify: Defend Backdoor Attacks on Neural Code Models via Entropy-based Purification

Oct 26, 2024

Fangwen Mu, Junjie Wang, Zhuohao Yu, Lin Shi, Song Wang, Mingyang Li, Qing Wang

Abstract:Neural code models have found widespread success in tasks pertaining to code intelligence, yet they are vulnerable to backdoor attacks, where an adversary can manipulate the victim model's behavior by inserting triggers into the source code. Recent studies indicate that advanced backdoor attacks can achieve nearly 100% attack success rates on many software engineering tasks. However, effective defense techniques against such attacks remain insufficiently explored. In this study, we propose CodePurify, a novel defense against backdoor attacks on code models through entropy-based purification. Entropy-based purification involves the process of precisely detecting and eliminating the possible triggers in the source code while preserving its semantic information. Within this process, CodePurify first develops a confidence-driven entropy-based measurement to determine whether a code snippet is poisoned and, if so, locates the triggers. Subsequently, it purifies the code by substituting the triggers with benign tokens using a masked language model. We extensively evaluate CodePurify against four advanced backdoor attacks across three representative tasks and two popular code models. The results show that CodePurify significantly outperforms four commonly used defense baselines, improving average defense performance by at least 40%, 40%, and 12% across the three tasks, respectively. These findings highlight the potential of CodePurify to serve as a robust defense against backdoor attacks on neural code models.
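
The detect-and-locate step described in the abstract can be illustrated with a toy sketch. This is not CodePurify's actual measurement: `toy_model`, the trigger token `rb`, and the rule "mask each token and look for the largest entropy gain" are all assumptions made for illustration. The idea shown is that a trigger forces an unusually confident (low-entropy) prediction, so masking it should raise the entropy sharply.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def toy_model(tokens):
    """Hypothetical stand-in classifier returning [p_benign, p_target].
    It is 'backdoored': the token 'rb' forces a confident target label."""
    if "rb" in tokens:
        return [0.01, 0.99]
    return [0.55, 0.45]

def locate_trigger(tokens, model, mask="<mask>"):
    """Mask one token at a time and return (index, gain) for the token
    whose masking raises the prediction entropy the most."""
    base = entropy(model(tokens))
    gains = []
    for i in range(len(tokens)):
        masked = tokens[:i] + [mask] + tokens[i + 1:]
        gains.append(entropy(model(masked)) - base)
    best = max(range(len(tokens)), key=gains.__getitem__)
    return best, gains[best]

tokens = ["def", "f", "(", "x", ")", ":", "rb", "return", "x"]
idx, gain = locate_trigger(tokens, toy_model)
print(tokens[idx], round(gain, 3))
```

In a real pipeline the located token would then be replaced by a benign token proposed by a masked language model, as the abstract describes; that step is omitted here.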

Optimal Partial Graph Matching

Oct 23, 2024

Gathika Ratnayaka, James Nichols, Qing Wang

Abstract:Partial graph matching addresses the limitations of traditional graph matching by allowing some nodes to remain unmatched, making it applicable to more complex scenarios. However, this flexibility introduces additional complexity, as both the subset of nodes to match and the optimal mapping must be determined. While recent studies have explored deep learning techniques for partial graph matching, a significant limitation remains: the absence of an optimization objective that fully captures the problem's intrinsic nature while enabling efficient solutions. In this paper, we propose a novel optimization framework for partial graph matching, inspired by optimal partial transport. Our approach formulates an objective that enables partial assignments while incorporating matching biases, using weighted total variation as the divergence function to guarantee optimal partial assignments. We employ the Hungarian algorithm to achieve efficient, exact solutions with cubic time complexity. Our contributions are threefold: (i) we introduce a robust optimization objective that balances matched and unmatched nodes; (ii) we establish a connection between partial graph matching and the linear sum assignment problem, enabling efficient solutions; (iii) we propose a deep graph matching architecture with a novel partial matching loss, providing an end-to-end solution. The empirical evaluations on standard graph matching benchmarks demonstrate the efficacy of the proposed approach.
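
The link between partial matching and the linear sum assignment problem can be sketched with a standard padding construction. This is a minimal illustration, not the paper's objective: the constant unmatched penalty `rho` and the dummy-node padding are assumptions, and the weighted total variation formulation from the abstract is not reproduced. Each real node may pair with a dummy at cost `rho`, so a pair (i, j) is kept only when matching it is cheaper than leaving both nodes unmatched.

```python
def hungarian(cost):
    """Solve the square linear sum assignment problem in O(n^3).
    Returns assign, where assign[i] is the column matched to row i."""
    n = len(cost)
    INF = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (n + 1)   # column potentials
    p = [0] * (n + 1)     # p[j] = row matched to column j (1-indexed)
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:    # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assign = [0] * n
    for j in range(1, n + 1):
        assign[p[j] - 1] = j - 1
    return assign

def partial_match(cost, rho):
    """Pad an n x m cost matrix with dummy nodes (assumed penalty rho per
    unmatched node) and return only the real (row, column) pairs."""
    n, m = len(cost), len(cost[0])
    size = n + m
    padded = [[0.0] * size for _ in range(size)]  # dummy-dummy pairs cost 0
    for i in range(n):
        for j in range(m):
            padded[i][j] = cost[i][j]
        for j in range(m, size):
            padded[i][j] = rho    # real row i left unmatched
    for i in range(n, size):
        for j in range(m):
            padded[i][j] = rho    # real column j left unmatched
    assign = hungarian(padded)
    return [(i, assign[i]) for i in range(n) if assign[i] < m]

cost = [[1.0, 9.0, 9.0], [9.0, 2.0, 9.0], [9.0, 9.0, 9.0]]
print(partial_match(cost, rho=3.0))
```

With `rho=3.0`, the third row and third column stay unmatched because pairing them costs 9.0, more than the combined penalty of 6.0 for leaving both out.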

Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning

Oct 21, 2024

Hanlin Yang, Jian Yao, Weiming Liu, Qing Wang, Hanmin Qin, Hansheng Kong, Kirk Tang, Jiechao Xiong, Chao Yu, Kai Li (+6 more)

Abstract:Recovering a spectrum of diverse policies from a set of expert trajectories is an important research topic in imitation learning. After determining a latent style for a trajectory, previous methods for recovering diverse policies usually employ a vanilla behavioral cloning learning objective conditioned on the latent style, treating each state-action pair in the trajectory with equal importance. Based on the observation that in many scenarios behavioral styles are highly relevant to only a subset of state-action pairs, this paper presents a new principled method for diverse policy recovery. In particular, after inferring or assigning a latent style for a trajectory, we enhance vanilla behavioral cloning by incorporating a weighting mechanism based on pointwise mutual information. This additional weighting reflects the significance of each state-action pair's contribution to learning the style, thus allowing our method to focus on the state-action pairs most representative of that style. We provide theoretical justifications for our new objective, and extensive empirical evaluations confirm the effectiveness of our method in recovering diverse policies from expert data.

* 18 pages, 6 figures
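
The pointwise mutual information weighting mentioned in the abstract above can be illustrated with a count-based estimate. This is only a sketch under stated assumptions: the paper's estimator and the way the weight enters the behavioral cloning loss are not given here, and `pmi_weights`, the style labels, and the toy demonstrations are hypothetical. It uses the standard definition PMI(z; (s, a)) = log p(z, s, a) / (p(z) p(s, a)).

```python
import math
from collections import Counter

def pmi_weights(trajectories):
    """trajectories: list of (style, [(state, action), ...]).
    Returns {(style, (state, action)): PMI} estimated from raw counts."""
    joint = Counter()         # count of (style, pair)
    pair_counts = Counter()   # count of pair over all styles
    style_counts = Counter()  # count of pairs observed under each style
    total = 0
    for style, pairs in trajectories:
        for sa in pairs:
            joint[(style, sa)] += 1
            pair_counts[sa] += 1
            style_counts[style] += 1
            total += 1
    return {
        (z, sa): math.log(c * total / (style_counts[z] * pair_counts[sa]))
        for (z, sa), c in joint.items()
    }

# Hypothetical demonstrations: both styles share ("far", "move").
demos = [
    ("aggressive", [("near", "attack"), ("far", "move")]),
    ("cautious", [("near", "retreat"), ("far", "move")]),
]
weights = pmi_weights(demos)
for key, value in sorted(weights.items()):
    print(key, round(value, 3))
```

The pair shared by both styles gets a PMI of zero, while the style-specific pairs get a positive value, which matches the intuition that only some state-action pairs are informative about style.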

Towards Bridging Generalization and Expressivity of Graph Neural Networks

Oct 14, 2024

Shouheng Li, Floris Geerts, Dongwoo Kim, Qing Wang

Abstract:Expressivity and generalization are two critical aspects of graph neural networks (GNNs). While significant progress has been made in studying the expressivity of GNNs, much less is known about their generalization capabilities, particularly when dealing with the inherent complexity of graph-structured data. In this work, we address the intricate relationship between expressivity and generalization in GNNs. Theoretical studies conjecture a trade-off between the two: highly expressive models risk overfitting, while those focused on generalization may sacrifice expressivity. However, empirical evidence often contradicts this assumption, with expressive GNNs frequently demonstrating strong generalization. We explore this contradiction by introducing a novel framework that connects GNN generalization to the variance in graph structures they can capture. This leads us to propose a $k$-variance margin-based generalization bound that characterizes the structural properties of graph embeddings in terms of their upper-bounded expressive power. Our analysis does not rely on specific GNN architectures, making it broadly applicable across GNN models. We further uncover a trade-off between intra-class concentration and inter-class separation, both of which are crucial for effective generalization. Through case studies and experiments on real-world datasets, we demonstrate that our theoretical findings align with empirical results, offering a deeper understanding of how expressivity can enhance GNN generalization.

* 17 pages, 2 figures, 2 tables

From Prohibition to Adoption: How Hong Kong Universities Are Navigating ChatGPT in Academic Workflows

Oct 02, 2024

Junjun Huang, Jifan Wu, Qing Wang, Kemeng Yuan, Jiefeng Li, Di Lu

Abstract:This paper compares the period when Hong Kong universities banned ChatGPT with the current period, in which it has become integrated into academic processes. Prompted by concerns about integrity and ethical issues in technology, institutions have adapted by moving towards a middle ground, adopting AI literacy and responsibility policies. This study examines new paradigms that have been developed to help realize these benefits while preventing negative effects on academia. Keywords: ChatGPT, Academic Integrity, AI Literacy, Ethical AI Use, Generative AI in Education, University Policy, AI Integration in Academia, Higher Education and Technology

See then Tell: Enhancing Key Information Extraction with Vision Grounding

Sep 29, 2024

Shuhang Liu, Zhenrong Zhang, Pengfei Hu, Jiefeng Ma, Jun Du, Qing Wang, Jianshu Zhang, Chenyu Liu

Abstract:In the digital era, the ability to understand visually rich documents that integrate text, complex layouts, and imagery is critical. Traditional Key Information Extraction (KIE) methods primarily rely on Optical Character Recognition (OCR), which often introduces significant latency, computational overhead, and errors. Current advanced image-to-text approaches, which bypass OCR, typically yield plain text outputs without corresponding vision grounding. In this paper, we introduce STNet (See then Tell Net), a novel end-to-end model designed to deliver precise answers with relevant vision grounding. Distinctively, STNet utilizes a unique <see> token to observe pertinent image areas, aided by a decoder that interprets physical coordinates linked to this token. Positioned at the outset of the answer text, the <see> token allows the model to first see--observing the regions of the image related to the input question--and then tell--providing articulated textual responses. To enhance the model's seeing capabilities, we collect extensive structured table recognition datasets. Leveraging the advanced text processing prowess of GPT-4, we develop the TVG (TableQA with Vision Grounding) dataset, which not only provides text-based Question Answering (QA) pairs but also incorporates precise vision grounding for these pairs. Our approach demonstrates substantial advancements in KIE performance, achieving state-of-the-art results on publicly available datasets such as CORD, SROIE, and DocVQA. The code will also be made publicly available.

Investigating Context-Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style

Sep 17, 2024

Yuepei Li, Kang Zhou, Qiao Qiao, Bach Nguyen, Qing Wang, Qi Li

Abstract:Retrieval-augmented generation (RAG) improves Large Language Models (LLMs) by incorporating external information into the response generation process. However, how context-faithful LLMs are and what factors influence LLMs' context-faithfulness remain largely unexplored. In this study, we investigate the impact of memory strength and evidence presentation on LLMs' receptiveness to external evidence. We introduce a method to quantify the memory strength of LLMs by measuring the divergence in LLMs' responses to different paraphrases of the same question, a factor not considered in previous work. We also generate evidence in various styles to evaluate how evidence style affects receptiveness. Two datasets are used for evaluation: Natural Questions (NQ) with popular questions and PopQA featuring long-tail questions. Our results show that for questions with high memory strength, LLMs are more likely to rely on internal memory, particularly for larger LLMs such as GPT-4. On the other hand, presenting paraphrased evidence significantly increases LLMs' receptiveness compared to simple repetition or adding details.
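
One way to picture "divergence in responses to different paraphrases" is the entropy of the answers a model gives across paraphrases of one question. The abstract does not spell out the divergence measure, so the entropy over normalized answers used here is an assumption, as are the function name and the sample answers. Under this reading, lower entropy would indicate stronger memory.

```python
import math
from collections import Counter

def answer_entropy(answers):
    """Shannon entropy (in nats) of the answers given to paraphrases of
    one question, after trimming whitespace and lowercasing."""
    counts = Counter(a.strip().lower() for a in answers)
    n = len(answers)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# Hypothetical model answers to three paraphrases of the same question.
consistent = ["Paris", "paris", "Paris "]
scattered = ["Paris", "Lyon", "Nice"]
print(answer_entropy(consistent), answer_entropy(scattered))
```

Exact string matching is a crude notion of answer equivalence; a real study would need a more careful comparison of free-form responses.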

Speaker Contrastive Learning for Source Speaker Tracing

Sep 16, 2024

Qing Wang, Hongmei Guo, Jian Kang, Mengjie Du, Jie Li, Xiao-Lei Zhang, Lei Xie

Abstract:As a form of biometric authentication technology, the security of speaker verification (SV) systems is of utmost importance. However, SV systems are inherently vulnerable to various types of attacks that can compromise their accuracy and reliability. One such attack is voice conversion, which modifies a person's speech to sound like another person by altering various vocal characteristics. This poses a significant threat to SV systems. To address this challenge, the Source Speaker Tracing Challenge (SSTC) in IEEE SLT2024 aims to identify the source speaker information in manipulated speech signals. Specifically, SSTC focuses on source speaker verification against voice conversion to determine whether two converted speech samples originate from the same source speaker. In this study, we propose a speaker contrastive learning-based approach for source speaker tracing to learn the latent source speaker information in converted speech. To learn a more source-speaker-related representation, we employ speaker contrastive loss during the training of the embedding extractor. This speaker contrastive loss helps identify the true source speaker embedding among several distractor speaker embeddings, enabling the embedding extractor to learn the source speaker information that may remain in the converted speech. Experiments demonstrate that our proposed speaker contrastive learning system achieves the lowest EER of 16.788% on the challenge test set, securing first place in the challenge.

* 7 pages, 2 figures, accepted by SLT
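
The idea of picking out the true source speaker embedding among distractors, as described in the abstract above, resembles an InfoNCE-style objective. The abstract does not give the exact loss, so the cosine similarity, the `temperature` value, and the toy two-dimensional embeddings below are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def contrastive_loss(anchor, positive, distractors, temperature=0.1):
    """InfoNCE-style loss: negative log-softmax score of the true source
    speaker embedding (positive) against the distractor embeddings."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, d) / temperature for d in distractors]
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_norm - logits[0]

# Hypothetical embeddings: converted speech, its source, two distractors.
converted = [1.0, 0.0]
source = [0.9, 0.1]
others = [[0.0, 1.0], [-1.0, 0.0]]
loss_good = contrastive_loss(converted, source, others)
loss_bad = contrastive_loss(converted, others[0], [source, others[1]])
print(round(loss_good, 4), round(loss_bad, 4))
```

The loss is small when the converted-speech embedding sits closest to its true source and large when a distractor is treated as the source, which is the signal an embedding extractor would be trained on.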
