Shu Wu

Bi-Level Graph Structure Learning for Next POI Recommendation

Nov 02, 2024

Liang Wang, Shu Wu, Qiang Liu, Yanqiao Zhu, Xiang Tao, Mengdi Zhang

Abstract: Next point-of-interest (POI) recommendation aims to predict a user's next destination based on sequential check-in history and a set of POI candidates. Graph neural networks (GNNs) have demonstrated a remarkable capability in this endeavor by exploiting the extensive global collaborative signals present among POIs. However, most of the existing graph-based approaches construct graph structures based on pre-defined heuristics, failing to consider inherent hierarchical structures of POI features such as geographical locations and visiting peaks, or suffering from noisy and incomplete structures in graphs. To address the aforementioned issues, this paper presents a novel Bi-level Graph Structure Learning (BiGSL) method for next POI recommendation. BiGSL first learns a hierarchical graph structure to capture the fine-to-coarse connectivity between POIs and prototypes, and then uses a pairwise learning module to dynamically infer relationships between POI pairs and prototype pairs. Based on the learned bi-level graphs, our model then employs a multi-relational graph network that considers both POI- and prototype-level neighbors, resulting in improved POI representations. Our bi-level structure learning scheme is more robust to data noise and incompleteness, and improves the exploration ability for recommendation by alleviating sparsity issues. Experimental results on three real-world datasets demonstrate the superiority of our model over existing state-of-the-art methods, with a significant improvement in recommendation accuracy and exploration performance.

* IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 11, pp. 5695-5708, Nov. 2024
* Accepted by IEEE Transactions on Knowledge and Data Engineering

Beyond Filtering: Adaptive Image-Text Quality Enhancement for MLLM Pretraining

Oct 21, 2024

Han Huang, Yuqi Huo, Zijia Zhao, Haoyu Lu, Shu Wu, Bingning Wang, Qiang Liu, Weipeng Chen, Liang Wang

Abstract: Multimodal large language models (MLLMs) have made significant strides by integrating visual and textual modalities. A critical factor in training MLLMs is the quality of image-text pairs within multimodal pretraining datasets. However, de facto filter-based data quality enhancement paradigms often discard a substantial portion of high-quality image data due to inadequate semantic alignment between images and texts, leading to inefficiencies in data utilization and scalability. In this paper, we propose the Adaptive Image-Text Quality Enhancer (AITQE), a model that dynamically assesses and enhances the quality of image-text pairs. AITQE employs a text rewriting mechanism for low-quality pairs and incorporates a negative sample learning strategy to improve evaluative capabilities by integrating deliberately selected low-quality samples during training. Unlike prior approaches that significantly alter text distributions, our method minimally adjusts text to preserve data volume while enhancing quality. Experimental results demonstrate that AITQE surpasses existing methods on various benchmarks, effectively leveraging raw data and scaling efficiently with increasing data volumes. We hope our work will inspire future work. The code and model are available at: https://github.com/hanhuang22/AITQE.

Uncovering Overfitting in Large Language Model Editing

Oct 10, 2024

Mengqi Zhang, Xiaotian Ye, Qiang Liu, Pengjie Ren, Shu Wu, Zhumin Chen

Abstract: Knowledge editing has been proposed as an effective method for updating and correcting the internal knowledge of Large Language Models (LLMs). However, existing editing methods often struggle with complex tasks, such as multi-hop reasoning. In this paper, we identify and investigate the phenomenon of Editing Overfit, where edited models assign disproportionately high probabilities to the edit target, hindering the generalization of new knowledge in complex scenarios. We attribute this issue to the current editing paradigm, which places excessive emphasis on the direct correspondence between the input prompt and the edit target for each edit sample. To further explore this issue, we introduce a new benchmark, EVOKE (EValuation of Editing Overfit in Knowledge Editing), along with fine-grained evaluation metrics. Through comprehensive experiments and analysis, we demonstrate that Editing Overfit is prevalent in current editing methods and that common overfitting mitigation strategies are of limited effectiveness in knowledge editing. To overcome this, inspired by LLMs' knowledge recall mechanisms, we propose a new plug-and-play strategy called Learn to Inference (LTI), which introduces a Multi-stage Inference Constraint module to guide the edited models in recalling new knowledge similarly to how unedited LLMs leverage knowledge through in-context learning. Extensive experimental results across a wide range of tasks validate the effectiveness of LTI in mitigating Editing Overfit.

Beyond Efficiency: Molecular Data Pruning for Enhanced Generalization

Sep 02, 2024

Dingshuo Chen, Zhixun Li, Yuyan Ni, Guibin Zhang, Ding Wang, Qiang Liu, Shu Wu, Jeffrey Xu Yu, Liang Wang

Abstract: With the emergence of various molecular tasks and massive datasets, how to perform efficient training has become an urgent yet under-explored issue in the area. Data pruning (DP), as an oft-stated approach to saving training burdens, filters out less influential samples to form a coreset for training. However, the increasing reliance on pretrained models for molecular tasks renders traditional in-domain DP methods incompatible. Therefore, we propose a Molecular data Pruning framework for enhanced Generalization (MolPeg), which focuses on the source-free data pruning scenario, where data pruning is applied with pretrained models. By maintaining two models with different updating paces during training, we introduce a novel scoring function to measure the informativeness of samples based on the loss discrepancy. As a plug-and-play framework, MolPeg realizes the perception of both source and target domains and consistently outperforms existing DP methods across four downstream tasks. Remarkably, it can surpass the performance obtained from full-dataset training, even when pruning up to 60-70% of the data on the HIV and PCBA datasets. Our work suggests that the discovery of effective data-pruning metrics could provide a viable path to both enhanced efficiency and superior generalization in transfer learning.

* 20 pages, under review
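
The scoring idea in the MolPeg abstract above (two models updated at different paces, samples ranked by loss discrepancy) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the slow model is modeled as an exponential moving average of the fast one (`ema_update`), the score is the absolute loss difference (`discrepancy_scores`), and the highest-scoring fraction is kept (`keep_top_fraction`). All three function names are hypothetical.

```python
def ema_update(reference, online, momentum=0.9):
    """Assumed slow-paced update: reference parameters trail the online ones."""
    return [momentum * r + (1.0 - momentum) * o for r, o in zip(reference, online)]


def discrepancy_scores(online_losses, reference_losses):
    """Assumed score: absolute per-sample loss gap between the two models."""
    return [abs(a - b) for a, b in zip(online_losses, reference_losses)]


def keep_top_fraction(scores, keep_ratio):
    """Return the (sorted) indices of the highest-scoring samples."""
    k = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy per-sample losses from the fast (online) and slow (reference) model.
online_losses = [0.9, 0.2, 0.5, 0.1]
reference_losses = [0.3, 0.25, 0.5, 0.6]

scores = discrepancy_scores(online_losses, reference_losses)
kept = keep_top_fraction(scores, keep_ratio=0.5)
print(kept)
```

In this toy run, the samples on which the two models disagree most are retained, while samples with nearly identical losses under both models are pruned.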

Enhancing Multi-hop Reasoning through Knowledge Erasure in Large Language Model Editing

Aug 22, 2024

Mengqi Zhang, Bowen Fang, Qiang Liu, Pengjie Ren, Shu Wu, Zhumin Chen, Liang Wang

Abstract: Large language models (LLMs) face challenges with internal knowledge inaccuracies and outdated information. Knowledge editing has emerged as a pivotal approach to mitigate these issues. Although current knowledge editing techniques exhibit promising performance in single-hop reasoning tasks, they show limitations when applied to multi-hop reasoning. Drawing on cognitive neuroscience and the operational mechanisms of LLMs, we hypothesize that the residual single-hop knowledge after editing causes edited models to revert to their original answers when processing multi-hop questions, thereby undermining their performance in multi-hop reasoning tasks. To validate this hypothesis, we conduct a series of experiments that empirically confirm our assumptions. Building on the validated hypothesis, we propose a novel knowledge editing method that incorporates a Knowledge Erasure mechanism for Large language model Editing (KELE). Specifically, we design an erasure function for residual knowledge and an injection function for new knowledge. Through joint optimization, we derive the optimal recall vector, which is subsequently utilized within a rank-one editing framework to update the parameters of targeted model layers. Extensive experiments on GPT-J and GPT-2 XL demonstrate that KELE substantially enhances the multi-hop reasoning capability of edited LLMs.
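
As an illustration of what a rank-one parameter update does, the sketch below applies the simplest such update to a small weight matrix: given a key vector `key` and a desired output `value`, it adds a rank-one term so that the updated matrix maps `key` exactly to `value`. This is a generic textbook form, not the KELE update itself; it omits the joint erasure and injection optimization that produces the recall vector, as well as any weighting by key statistics that a full editing method may use. The names `matvec` and `rank_one_update` are hypothetical.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rank_one_update(W, key, value):
    """Return W + (value - W key) key^T / (key^T key), so that W_new key = value."""
    residual = [v - p for v, p in zip(value, matvec(W, key))]
    norm_sq = sum(k * k for k in key)
    return [
        [w + r * k / norm_sq for w, k in zip(row, key)]
        for row, r in zip(W, residual)
    ]


W = [[1.0, 0.0], [0.0, 1.0]]   # toy layer weights
key = [1.0, 2.0]               # stands in for the representation of the edited subject
value = [3.0, -1.0]            # stands in for the target (recall) vector

W_new = rank_one_update(W, key, value)
print(matvec(W_new, key))
```

Because the added term is proportional to `key`, inputs orthogonal to `key` are mapped exactly as before, which is why rank-one edits are considered local.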

DIVE: Subgraph Disagreement for Graph Out-of-Distribution Generalization

Aug 08, 2024

Xin Sun, Liang Wang, Qiang Liu, Shu Wu, Zilei Wang

Abstract: This paper addresses the challenge of out-of-distribution (OOD) generalization in graph machine learning, a field rapidly advancing yet grappling with the discrepancy between source and target data distributions. Traditional graph learning algorithms, based on the assumption of uniform distribution between training and test data, falter in real-world scenarios where this assumption fails, resulting in suboptimal performance. A principal factor contributing to this suboptimal performance is the inherent simplicity bias of neural networks trained through Stochastic Gradient Descent (SGD), which prefer simpler features over more complex yet equally or more predictive ones. This bias leads to a reliance on spurious correlations, adversely affecting OOD performance in various tasks such as image recognition, natural language understanding, and graph classification. Current methodologies, including subgraph-mixup and information bottleneck approaches, have achieved partial success but struggle to overcome simplicity bias, often reinforcing spurious correlations. To tackle this, we propose DIVE, training a collection of models to focus on all label-predictive subgraphs by encouraging the models to foster divergence on the subgraph mask, which circumvents the limitation of a model solely focusing on the subgraph corresponding to simple structural patterns. Specifically, we employ a regularizer to punish overlap in extracted subgraphs across models, thereby encouraging different models to concentrate on distinct structural patterns. Model selection for robust OOD performance is achieved through validation accuracy. Tested across four datasets from the GOOD benchmark and one dataset from the DrugOOD benchmark, our approach demonstrates significant improvement over existing methods, effectively addressing the simplicity bias and enhancing generalization in graph machine learning.
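
One simple way to picture a regularizer that punishes overlap between the subgraphs extracted by different models is sketched below. It assumes each model emits a soft mask over the same edges with values in [0, 1], and it penalizes the average elementwise product over all model pairs. The exact form used by DIVE is not given in the abstract, so this formula and the name `overlap_penalty` are illustrative assumptions only.

```python
from itertools import combinations


def overlap_penalty(masks):
    """Mean pairwise overlap between soft edge masks (one mask per model)."""
    pairs = list(combinations(masks, 2))
    total = 0.0
    for mask_a, mask_b in pairs:
        # Elementwise product is large only where both models select an edge.
        total += sum(a * b for a, b in zip(mask_a, mask_b)) / len(mask_a)
    return total / len(pairs)


# Two models attending to different edges versus the same edges.
disjoint = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
identical = [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]

print(overlap_penalty(disjoint), overlap_penalty(identical))
```

Adding such a term to the training loss would push the models toward selecting different edges, since the penalty vanishes only when no edge is selected by two models at once.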

Navigating the Noisy Crowd: Finding Key Information for Claim Verification

Jul 17, 2024

Haisong Gong, Huanhuan Ma, Qiang Liu, Shu Wu, Liang Wang

Abstract: Claim verification is a task that involves assessing the truthfulness of a given claim based on multiple evidence pieces. Using large language models (LLMs) for claim verification is a promising approach. However, simply feeding all the evidence pieces to an LLM and asking if the claim is factual does not yield good results. The challenge lies in the noisy nature of both the evidence and the claim: evidence passages typically contain irrelevant information, with the key facts hidden within the context, while claims often convey multiple aspects simultaneously. To navigate this "noisy crowd" of information, we propose EACon (Evidence Abstraction and Claim Deconstruction), a framework designed to find key information within evidence and verify each aspect of a claim separately. EACon first finds keywords from the claim and employs fuzzy matching to select relevant keywords for each raw evidence piece. These keywords serve as a guide to extract and summarize critical information into abstracted evidence. Subsequently, EACon deconstructs the original claim into subclaims, which are then verified against both abstracted and raw evidence individually. We evaluate EACon using two open-source LLMs on two challenging datasets. Results demonstrate that EACon consistently and substantially improves LLMs' performance in claim verification.
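
The keyword-selection step described in the abstract (fuzzy matching of claim keywords against each evidence piece) might look roughly like the sketch below. It uses Python's standard `difflib.get_close_matches` as the fuzzy matcher; the abstract does not say which matcher or threshold EACon uses, so the matcher, the `cutoff` value, the tokenization, and the function name `select_keywords` are all assumptions for illustration.

```python
import difflib
import re


def select_keywords(claim_keywords, evidence, cutoff=0.8):
    """Keep the claim keywords that approximately match some evidence token."""
    tokens = re.findall(r"[a-z0-9]+", evidence.lower())
    selected = []
    for keyword in claim_keywords:
        # A non-empty result means at least one token is similar enough.
        if difflib.get_close_matches(keyword.lower(), tokens, n=1, cutoff=cutoff):
            selected.append(keyword)
    return selected


claim_keywords = ["Einstein", "relativity", "Nobel"]
# Evidence with misspellings, which exact matching would miss.
evidence = "Albert Einsten published the theory of general relativty in 1915."

matched = select_keywords(claim_keywords, evidence)
print(matched)
```

The selected keywords could then be passed to the LLM as a guide for summarizing that evidence piece, while keywords with no close match are left out for it.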

Learning Domain-Invariant Features for Out-of-Context News Detection

Jun 11, 2024

Yimeng Gu, Mengqi Zhang, Ignacio Castro, Shu Wu, Gareth Tyson

Abstract: Multimodal out-of-context news is a common type of misinformation on online media platforms. This involves posting a caption alongside an invalid out-of-context news image. Reflecting its importance, researchers have developed models to detect such misinformation. However, a common limitation of these models is that they only consider the scenario where pre-labeled data is available for each domain, failing to address out-of-context news detection on unlabeled domains (e.g., unverified news on new topics or agencies). In this work, we therefore focus on domain adaptive out-of-context news detection. In order to effectively adapt the detection model to unlabeled news topics or agencies, we propose ConDA-TTA (Contrastive Domain Adaptation with Test-Time Adaptation), which applies contrastive learning and maximum mean discrepancy (MMD) to learn domain-invariant features. In addition, it leverages target domain statistics at test time to further assist domain adaptation. Experimental results show that our approach outperforms baselines in 5 out of 7 domain adaptation settings on two public datasets, by as much as 2.93% in F1 and 2.08% in accuracy.
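
For readers unfamiliar with maximum mean discrepancy, the sketch below computes the standard biased estimate of squared MMD between two sets of feature vectors using an RBF kernel. Minimizing this quantity between source-domain and target-domain features is a common way to encourage domain-invariant representations. The kernel choice, the `gamma` value, and the toy feature vectors are assumptions for illustration and are not taken from ConDA-TTA.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of squared MMD between two samples."""
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )


source_feats = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
near_feats = [[0.1, 0.0], [0.0, 0.2], [0.2, 0.1]]   # similar distribution
far_feats = [[3.0, 3.1], [3.2, 2.9], [3.1, 3.0]]    # shifted distribution

print(mmd_squared(source_feats, near_feats), mmd_squared(source_feats, far_feats))
```

The estimate is close to zero when the two feature sets are distributed alike and grows as they drift apart, so it can serve as a training penalty on the gap between domains.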

Interpretable Multimodal Out-of-context Detection with Soft Logic Regularization

Jun 07, 2024

Huanhuan Ma, Jinghao Zhang, Qiang Liu, Shu Wu, Liang Wang

Abstract: The rapid spread of information through mobile devices and media has led to the widespread dissemination of false or deceptive news, causing significant concerns in society. Among different types of misinformation, image repurposing, also known as out-of-context misinformation, remains highly prevalent and effective. However, current approaches for detecting out-of-context misinformation often lack interpretability and offer limited explanations. In this study, we propose a logic regularization approach for out-of-context detection called LOGRAN (LOGic Regularization for out-of-context ANalysis). The primary objective of LOGRAN is to decompose the out-of-context detection at the phrase level. By employing latent variables for phrase-level predictions, the final prediction of the image-caption pair can be aggregated using logical rules. The latent variables also provide an explanation for how the final result is derived, making this fine-grained detection method inherently explanatory. We evaluate the performance of LOGRAN on the NewsCLIPpings dataset, showcasing competitive overall results. Visualized examples also reveal faithful phrase-level predictions of out-of-context images, accompanied by explanations. This highlights the effectiveness of our approach in addressing out-of-context detection and enhancing interpretability.
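
To make the aggregation of phrase-level predictions by logical rules concrete, the sketch below uses one common soft-logic relaxation (the product t-norm): a caption is treated as out-of-context if at least one of its phrases is, so the pair-level probability is a soft OR over the phrase-level probabilities. The specific rule and relaxation used by LOGRAN are not stated in the abstract; the rule, the relaxation, and the names `soft_or` and `soft_and` are assumptions for illustration.

```python
import math


def soft_and(probs):
    """Product t-norm: soft conjunction of independent probabilities."""
    return math.prod(probs)


def soft_or(probs):
    """Soft disjunction via De Morgan: 1 - AND(not p_i)."""
    return 1.0 - soft_and(1.0 - p for p in probs)


# Hypothetical phrase-level probabilities that each phrase mismatches the image.
phrase_probs = [0.05, 0.10, 0.90]
pair_prob = soft_or(phrase_probs)
print(round(pair_prob, 4))
```

The phrase with the highest probability dominates the aggregate, which is what lets the phrase-level values double as an explanation of the final decision.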

* ICASSP 2024 lecture paper

Semantic Evolvement Enhanced Graph Autoencoder for Rumor Detection

Apr 24, 2024

Xiang Tao, Liang Wang, Qiang Liu, Shu Wu

Abstract: Due to the rapid spread of rumors on social media, rumor detection has become an extremely important challenge. Recently, numerous rumor detection models which utilize textual information and the propagation structure of events have been proposed. However, these methods overlook the importance of the semantic evolvement information of events in the propagation process, which is often difficult to learn in supervised training paradigms and traditional rumor detection methods. To address this issue, we propose a novel semantic evolvement enhanced Graph Autoencoder for Rumor Detection (GARD) model in this paper. The model learns semantic evolvement information of events by capturing local semantic changes and global semantic evolvement information through specific graph autoencoder and reconstruction strategies. By combining semantic evolvement information and propagation structure information, the model achieves a comprehensive understanding of event propagation and performs accurate and robust detection, while also detecting rumors earlier by capturing semantic evolvement information in the early stages. Moreover, in order to enhance the model's ability to learn the distinct patterns of rumors and non-rumors, we introduce a uniformity regularizer to further improve the model's performance. Experimental results on three public benchmark datasets confirm the superiority of our GARD method over the state-of-the-art approaches in both overall performance and early rumor detection.
