Jianzhong Qi

Bayesian-guided Label Mapping for Visual Reprogramming

Oct 31, 2024

Chengyi Cai, Zesheng Ye, Lei Feng, Jianzhong Qi, Feng Liu

Abstract: Visual reprogramming (VR) leverages the intrinsic capabilities of pretrained vision models by adapting their input or output interfaces to solve downstream tasks whose labels (i.e., downstream labels) might be totally different from the labels associated with the pretrained models (i.e., pretrained labels). When adapting the output interface, label mapping methods transform the pretrained labels to downstream labels by establishing a gradient-free one-to-one correspondence between the two sets of labels. However, in this paper, we reveal that one-to-one mappings may overlook the complex relationship between pretrained and downstream labels. Motivated by this observation, we propose a Bayesian-guided Label Mapping (BLM) method. BLM constructs an iteratively-updated probabilistic label mapping matrix, with each element quantifying a pairwise relationship between pretrained and downstream labels. The assignment of values to the constructed matrix is guided by Bayesian conditional probability, considering the joint distribution of the downstream labels and the labels predicted by the pretrained model on downstream samples. Experiments conducted on both pretrained vision models (e.g., ResNeXt) and vision-language models (e.g., CLIP) demonstrate the superior performance of BLM over existing label mapping methods. The success of BLM also offers a probabilistic lens through which to understand and analyze the effectiveness of VR. Our code is available at https://github.com/tmlr-group/BayesianLM.
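
The idea of a probabilistic label mapping matrix can be illustrated with a minimal sketch. It is not the authors' implementation (see their repository for that): it assumes the matrix is estimated once from co-occurrence counts with Laplace smoothing, whereas the abstract describes an iteratively updated matrix. The function names, the smoothing parameter, and the toy data are all hypothetical.

```python
from collections import Counter


def estimate_label_mapping(pretrained_preds, downstream_labels,
                           n_pretrained, n_downstream, smoothing=1.0):
    """Estimate matrix[p][d] ~ P(downstream label d | pretrained prediction p).

    Uses co-occurrence counts with add-`smoothing` (Laplace) smoothing,
    which is an assumption of this sketch.
    """
    counts = Counter(zip(pretrained_preds, downstream_labels))
    matrix = []
    for p in range(n_pretrained):
        row = [counts[(p, d)] + smoothing for d in range(n_downstream)]
        total = sum(row)
        matrix.append([value / total for value in row])
    return matrix


def map_probabilities(pretrained_probs, matrix):
    """Turn a distribution over pretrained labels into one over downstream labels."""
    n_downstream = len(matrix[0])
    return [
        sum(prob * matrix[p][d] for p, prob in enumerate(pretrained_probs))
        for d in range(n_downstream)
    ]


# Hypothetical toy data: 3 pretrained labels, 2 downstream labels.
pretrained_preds = [0, 0, 1, 2, 2, 2]    # argmax predictions of the frozen model
downstream_labels = [0, 0, 0, 1, 1, 1]   # ground-truth downstream labels
mapping = estimate_label_mapping(pretrained_preds, downstream_labels, 3, 2)
downstream_probs = map_probabilities([0.1, 0.1, 0.8], mapping)
```

Unlike a one-to-one mapping, every pretrained label here contributes to every downstream label, weighted by how often the two co-occur on downstream samples.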

Accurate and Regret-aware Numerical Problem Solver for Tabular Question Answering

Oct 10, 2024

Yuxiang Wang, Jianzhong Qi, Junhao Gan

Abstract: Question answering on free-form tables (a.k.a. TableQA) is a challenging task because of the flexible structure and the complex schema of tables. Recent studies use Large Language Models (LLMs) for this task, exploiting their capability in understanding the questions and tabular data, which are typically given in natural language and contain many textual fields, respectively. While this approach has shown promising results, it overlooks the challenges brought by numerical values, which are common in tabular data and which LLMs are known to struggle with. We aim to address this issue and answer numerical questions. We propose a model named TabLaP that uses LLMs as a planner rather than an answer generator, exploiting LLMs' capability in multi-step reasoning while leaving the actual numerical calculations to a Python interpreter for accurate calculation. Recognizing the inaccurate nature of LLMs, we further make a first attempt to quantify the trustworthiness of the answers produced by TabLaP, such that users can use TabLaP in a regret-aware manner. Experimental results on two benchmark datasets show that TabLaP is substantially more accurate than the state-of-the-art models, improving the answer accuracy by 5.7% and 5.8% on the two datasets, respectively.
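
The planner-plus-interpreter split described above can be sketched as follows. This is an illustrative assumption, not TabLaP itself: the plan format, the `run_plan` and `evaluate` helpers, and the toy table are hypothetical, and the plan is hard-coded where a real system would obtain it from an LLM. The trustworthiness estimation is not shown.

```python
import ast
import operator

# Only basic arithmetic is allowed when evaluating a plan's expression.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def evaluate(expression, variables):
    """Evaluate an arithmetic expression over named variables without eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval"))


def run_plan(plan, table):
    """Bind each variable to a filtered column sum, then compute the expression."""
    variables = {
        name: sum(row[column] for row in table if row[key] == value)
        for name, (column, key, value) in plan["bindings"].items()
    }
    return evaluate(plan["expression"], variables)


# Hypothetical table and a plan standing in for the LLM planner's output.
table = [
    {"region": "North", "year": 2022, "revenue": 120.0},
    {"region": "North", "year": 2023, "revenue": 150.0},
    {"region": "South", "year": 2022, "revenue": 80.0},
    {"region": "South", "year": 2023, "revenue": 90.0},
]
plan = {
    "bindings": {"rev_2022": ("revenue", "year", 2022),
                 "rev_2023": ("revenue", "year", 2023)},
    "expression": "(rev_2023 - rev_2022) / rev_2022 * 100",
}
growth_percent = run_plan(plan, table)  # revenue growth from 2022 to 2023
```

The language model only has to produce the plan; the arithmetic itself is done by the interpreter, so the numeric result does not depend on the model's ability to calculate.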

Federated Graph Learning for Cross-Domain Recommendation

Oct 10, 2024

Ziqi Yang, Zhaopeng Peng, Zihui Wang, Jianzhong Qi, Chaochao Chen, Weike Pan, Chenglu Wen, Cheng Wang, Xiaoliang Fan

Abstract: Cross-domain recommendation (CDR) offers a promising solution to the data sparsity problem by enabling knowledge transfer across source and target domains. However, many recent CDR models overlook crucial issues such as privacy as well as the risk of negative transfer (which negatively impacts model performance), especially in multi-domain settings. To address these challenges, we propose FedGCDR, a novel federated graph learning framework that securely and effectively leverages positive knowledge from multiple source domains. First, we design a positive knowledge transfer module that ensures privacy during inter-domain knowledge transmission. This module employs differential privacy-based knowledge extraction combined with a feature mapping mechanism, transforming source domain embeddings from federated graph attention networks into reliable domain knowledge. Second, we design a knowledge activation module to filter out potentially harmful or conflicting knowledge from source domains, addressing the issue of negative transfer. This module enhances target domain training by expanding the graph of the target domain to generate reliable domain attentions and fine-tunes the target model for improved negative knowledge filtering and more accurate predictions. We conduct extensive experiments on 16 popular domains of the Amazon dataset, demonstrating that FedGCDR significantly outperforms state-of-the-art methods.

* Accepted by NeurIPS'24

Factual Dialogue Summarization via Learning from Large Language Models

Jun 20, 2024

Rongxin Zhu, Jey Han Lau, Jianzhong Qi

Abstract: Factual consistency is an important quality in dialogue summarization. Large language model (LLM)-based automatic text summarization models generate more factually consistent summaries compared to those by smaller pretrained language models, but they face deployment challenges in real-world applications due to privacy or resource constraints. In this paper, we investigate the use of symbolic knowledge distillation to improve the factual consistency of smaller pretrained models for dialogue summarization. We employ zero-shot learning to extract symbolic knowledge from LLMs, generating both factually consistent (positive) and inconsistent (negative) summaries. We then apply two contrastive learning objectives on these summaries to enhance smaller summarization models. Experiments with BART, PEGASUS, and Flan-T5 indicate that our approach surpasses strong baselines that rely on complex data augmentation strategies. Our approach achieves better factual consistency while maintaining coherence, fluency, and relevance, as confirmed by various automatic evaluation metrics. We also provide access to the data and code to facilitate future research.
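
The abstract does not specify the two contrastive objectives, so the sketch below shows only one common form, a margin-based ranking loss, as an assumption. It takes model scores (for example, length-normalised log-likelihoods) for positive and negative summaries and penalises every pair in which the positive does not beat the negative by at least a margin. The function name and the scores are hypothetical.

```python
def margin_contrastive_loss(pos_scores, neg_scores, margin=1.0):
    """Mean hinge loss over all (positive, negative) pairs.

    A pair contributes zero loss once the positive score exceeds the
    negative score by at least `margin`.
    """
    losses = [max(0.0, margin - (p - n))
              for p in pos_scores for n in neg_scores]
    return sum(losses) / len(losses)


# Hypothetical scores assigned by a small summarization model.
positive_scores = [-0.8, -1.0]   # factually consistent summaries
negative_scores = [-1.5, -2.5]   # factually inconsistent summaries
loss = margin_contrastive_loss(positive_scores, negative_scores, margin=1.0)
```

Minimising such a loss pushes the model to score consistent summaries above inconsistent ones for the same dialogue.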

Sample-specific Masks for Visual Reprogramming-based Prompting

Jun 05, 2024

Chengyi Cai, Zesheng Ye, Lei Feng, Jianzhong Qi, Feng Liu

Abstract: Visual reprogramming (VR) is a prompting technique that aims to re-purpose a pre-trained model (e.g., a classifier on ImageNet) to target tasks (e.g., medical data prediction) by learning a small-scale pattern added into input images instead of tuning considerable parameters within the model. The location of the pattern within input samples is usually determined by a pre-defined mask shared across all samples. In this paper, we show that the shared mask potentially limits VR's generalization and increases its approximation error due to the lack of sample-level adaptation. Motivated by this finding, we design a new framework for VR called sample-specific multi-channel masks (SMM). Specifically, SMM employs a lightweight ConvNet and patch-wise interpolation to generate sample-specific three-channel masks instead of a shared and pre-defined mask. Since we generate different masks for individual samples, SMM is theoretically shown to reduce approximation error for the target tasks compared with existing state-of-the-art VR methods. We also empirically demonstrate its performance gain on both ResNet and ViT. The success of SMM further highlights the broader applicability of VR in leveraging the latent knowledge of pre-trained models for various target tasks. Our code is available at https://github.com/tmlr-group/SMM.
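
A rough sketch of how a sample-specific mask could gate a learned pattern is given below. It is not the SMM code (see the authors' repository): it assumes nearest-neighbour upsampling as the patch-wise interpolation, handles a single channel (the same step would be repeated per channel for a three-channel mask), and takes the coarse mask as given where SMM would produce it with a lightweight ConvNet. All names and values are hypothetical.

```python
def upsample_patchwise(coarse_mask, patch):
    """Expand each coarse mask value to fill a patch x patch block."""
    height = len(coarse_mask) * patch
    width = len(coarse_mask[0]) * patch
    return [[coarse_mask[i // patch][j // patch] for j in range(width)]
            for i in range(height)]


def apply_reprogramming(image, pattern, mask):
    """Add the mask-weighted pattern to the image, element by element."""
    return [[image[i][j] + mask[i][j] * pattern[i][j]
             for j in range(len(image[0]))]
            for i in range(len(image))]


# Hypothetical single-channel 4x4 example.
coarse_mask = [[0.0, 1.0],
               [0.5, 0.0]]                 # stand-in for a mask generator's output
mask = upsample_patchwise(coarse_mask, patch=2)
image = [[0.2] * 4 for _ in range(4)]      # one input sample
pattern = [[1.0] * 4 for _ in range(4)]    # learned pattern shared by all samples
reprogrammed = apply_reprogramming(image, pattern, mask)
```

Because the mask is computed per sample, two images can receive the same pattern at different locations and strengths, which a single shared mask cannot do.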

Spatial-temporal Forecasting for Regions without Observations

Jan 19, 2024

Xinyu Su, Jianzhong Qi, Egemen Tanin, Yanchuan Chang, Majid Sarvi

Abstract: Spatial-temporal forecasting plays an important role in many real-world applications, such as traffic forecasting, air pollutant forecasting, and crowd-flow forecasting. State-of-the-art spatial-temporal forecasting models take data-driven approaches and rely heavily on data availability. Such models suffer from accuracy issues when data is incomplete, which is common in reality due to the heavy costs of deploying and maintaining sensors for data collection. A few recent studies attempted to address the issue of incomplete data. They typically assume some data availability in a region of interest either for a short period or at a few locations. In this paper, we further study spatial-temporal forecasting for a region of interest without any historical observations, to address scenarios such as unbalanced region development, progressive deployment of sensors or lack of open data. We propose a model named STSM for the task. The model takes a contrastive learning-based approach to learn spatial-temporal patterns from adjacent regions that have recorded data. Our key insight is to learn from the locations that resemble those in the region of interest, and we propose a selective masking strategy to enable the learning. As a result, our model outperforms adapted state-of-the-art models, reducing errors consistently over both traffic and air pollutant forecasting tasks. The source code is available at https://github.com/suzy0223/STSM.

* Accepted by EDBT2024

Urban Region Representation Learning with Attentive Fusion

Dec 07, 2023

Fengze Sun, Jianzhong Qi, Yanchuan Chang, Xiaoliang Fan, Shanika Karunasekera, Egemen Tanin

Abstract: An increasing number of related urban data sources have brought forth novel opportunities for learning urban region representations, i.e., embeddings. The embeddings describe latent features of urban regions and enable discovering similar regions for urban planning applications. Existing methods learn an embedding for a region from each type of region feature data, and subsequently fuse all learned embeddings of a region to generate a unified region embedding. However, these studies often overlook the significance of the fusion process. The typical fusion methods rely on simple aggregation, such as summation and concatenation, thereby disregarding correlations within the fused region embeddings. To address this limitation, we propose a novel model named HAFusion. Our model is powered by a dual-feature attentive fusion module named DAFusion, which fuses embeddings from different region features to learn higher-order correlations between the regions as well as between the different types of region features. DAFusion is generic: it can be integrated into existing models to enhance their fusion process. Further, motivated by the effective fusion capability of an attentive module, we propose a hybrid attentive feature learning module named HALearning to enhance the embedding learning from each individual type of region features. Extensive experiments on three real-world datasets demonstrate that our model HAFusion outperforms state-of-the-art methods across three different prediction tasks. Using our learned region embeddings leads to consistent improvements of up to 31% in prediction accuracy.

Fake News Detection Through Graph-based Neural Networks: A Survey

Jul 24, 2023

Shuzhi Gong, Richard O. Sinnott, Jianzhong Qi, Cecile Paris

Abstract: The popularity of online social networks has enabled rapid dissemination of information. People can now share and consume information much more rapidly than ever before. However, low-quality information, or information that is accidentally or deliberately fake, can also spread rapidly. This can lead to considerable negative impacts on society. Identifying, labelling and debunking online misinformation as early as possible has become an increasingly urgent problem. Many methods have been proposed to detect fake news, including many deep learning and graph-based approaches. In recent years, graph-based methods have yielded strong results, as they can closely model the social context and propagation process of online news. In this paper, we present a systematic review of fake news detection studies based on graph-based and deep learning-based techniques. We classify existing graph-based methods into knowledge-driven methods, propagation-based methods, and heterogeneous social context-based methods, depending on how a graph structure is constructed to model news-related information flows. We further discuss the challenges and open problems in graph-based fake news detection and identify future research directions.

* 18 pages, 3 tables, 7 figures

AutoAlign: Fully Automatic and Effective Knowledge Graph Alignment enabled by Large Language Models

Jul 18, 2023

Rui Zhang, Yixin Su, Bayu Distiawan Trisedya, Xiaoyan Zhao, Min Yang, Hong Cheng, Jianzhong Qi

Abstract: The task of entity alignment between knowledge graphs (KGs) aims to identify every pair of entities from two different KGs that represent the same entity. Many machine learning-based methods have been proposed for this task. However, to the best of our knowledge, existing methods all require manually crafted seed alignments, which are expensive to obtain. In this paper, we propose the first fully automatic alignment method named AutoAlign, which does not require any manually crafted seed alignments. Specifically, for predicate embeddings, AutoAlign constructs a predicate-proximity-graph with the help of large language models to automatically capture the similarity between predicates across two KGs. For entity embeddings, AutoAlign first computes the entity embeddings of each KG independently using TransE, and then shifts the two KGs' entity embeddings into the same vector space by computing the similarity between entities based on their attributes. Thus, both predicate alignment and entity alignment can be done without manually crafted seed alignments. AutoAlign is not only fully automatic, but also highly effective. Experiments using real-world KGs show that AutoAlign improves the performance of entity alignment significantly compared to state-of-the-art methods.
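
For readers unfamiliar with TransE, which the abstract names as the per-KG embedding method, the sketch below shows its standard scoring idea: a triple (head, relation, tail) is plausible when head + relation lies close to tail. The margin loss is the usual TransE training objective; the function names and toy vectors are hypothetical, and nothing here reproduces AutoAlign's predicate-proximity graph or attribute-based alignment.

```python
import math


def transe_distance(head, relation, tail):
    """Euclidean distance between (head + relation) and tail; lower is more plausible."""
    return math.sqrt(sum((h + r - t) ** 2
                         for h, r, t in zip(head, relation, tail)))


def margin_ranking_loss(positive_distance, negative_distance, margin=1.0):
    """Zero once the corrupted triple is at least `margin` farther than the true one."""
    return max(0.0, margin + positive_distance - negative_distance)


# Hypothetical 2-dimensional embeddings.
head = [1.0, 0.0]
relation = [0.0, 1.0]
true_tail = [1.0, 1.0]       # head + relation lands exactly here
corrupt_tail = [4.0, 5.0]    # randomly substituted tail entity

positive = transe_distance(head, relation, true_tail)
negative = transe_distance(head, relation, corrupt_tail)
loss = margin_ranking_loss(positive, negative)
```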

* 14 pages, 5 figures, 4 tables. arXiv admin note: substantial text overlap with arXiv:2210.08540

Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization

May 26, 2023

Rongxin Zhu, Jianzhong Qi, Jey Han Lau

Abstract: A series of datasets and models have been proposed for summaries generated for well-formatted documents such as news articles. Dialogue summaries, however, have been underexplored. In this paper, we present the first dataset with fine-grained factual error annotations named DIASUMFACT. We define fine-grained factual error detection as a sentence-level multi-label classification problem, and we evaluate two state-of-the-art (SOTA) models on our dataset. Both models yield sub-optimal results, with a macro-averaged F1 score of around 0.25 over 6 error classes. We further propose an unsupervised model ENDERANKER via candidate ranking using pretrained encoder-decoder models. Our model performs on par with the SOTA models while requiring fewer resources. These observations confirm the challenges in detecting factual errors in dialogue summaries and call for further studies, for which our dataset and results offer a solid foundation.
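
The macro-averaged F1 score mentioned above can be computed for a sentence-level multi-label task as in the following sketch. The class names and annotations are placeholders, not DIASUMFACT's error classes or data, and treating a class with no gold or predicted instances as F1 = 0 is an assumption of this sketch.

```python
def macro_f1(gold, predicted, classes):
    """Average per-class F1 over `classes`, weighting every class equally.

    `gold` and `predicted` are lists of label sets, one set per sentence.
    """
    scores = []
    for label in classes:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if label in g and label in p)
        fp = sum(1 for g, p in pairs if label not in g and label in p)
        fn = sum(1 for g, p in pairs if label in g and label not in p)
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Hypothetical annotations for four summary sentences and three error classes.
gold = [{"A"}, {"A", "B"}, set(), {"C"}]
predicted = [{"A"}, {"B"}, {"C"}, set()]
score = macro_f1(gold, predicted, classes=["A", "B", "C"])
```

Because every class counts equally, a rare error class that the model never detects pulls the macro average down as much as a frequent one would.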

* Accepted in ACL 2023
