Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yonghao Liu

TRACE: Discovering Task-Specific Parameter via Adaptation-Aware Probing for Continual Fine-Tuning

May 29, 2026

Xiaosong Han, Ke Chen, Xindi Dai, Di Liang, Minlong Peng, Wei Pang, Fausto Giunchiglia, Xiaoyue Feng, Yonghao Liu, Renchu Guan

Abstract:In real-world deployment, LLMs are often adapted continually across tasks to keep LLMs up-to-date in production, where new fine-tuning should preserve previously learned skills. However, indiscriminately mixing tasks can dilute task specialization, while sequential fine-tuning (full-parameter or low rank adaptation) often causes catastrophic forgetting due to destructive overwriting. Replay-based continual tuning and maintaining separate task-specific adapters can mitigate forgetting, but introduce additional compute, storage, and management overhead. Recognizing the redundancy of LLM parameters for any single task, we reframe continual task adaptation as task-specific parameter discovery via adaptation-aware probing: a short warm-start probe exposes a task's adaptation trace, enabling us to identify and isolate the small subset of parameters essential for each task to mitigate catastrophic forgetting. Building on this view, we introduce TRACE, a novel approach for discovering Task-specific paRameters via Adaptation-aware probing for Continual finE-tuning. We perform a short warm-start fine-tune to derive task-specific core parameters by comparing the warm-started and pre-trained models. Core parameters are identified via two strategies: importance scoring (L$_2$ norm and Fisher Information) and specificity analysis (cosine similarity of parameter updates). In continual fine-tuning settings, only the active task's core parameters are updated while others remain frozen, preserving prior knowledge. We conduct extensive experiments across multiple standard benchmarks to demonstrate the superior performance of our proposed method. Additionally, we validate the generalization of our method through a cross-model and scale transferability study, demonstrating a "small-to-large" paradigm that guides the fine-tuning of large-scale models under resource constraints.

* KDD2026

Via

Access Paper or Ask Questions

Advancing Graph Few-Shot Learning via In-Context Learning

May 23, 2026

Renchu Guan, Yajun Wang, Chunli Guo, Bowen Cao, Fausto Giunchiglia, Wei Pang, Yonghao Liu, Xiaoyue Feng

Abstract:Graph few-shot learning, which aims to classify nodes from novel classes with only a few labeled examples, is a widely studied problem in graph learning. However, existing methods often face two key limitations. First, the predominant graph few-shot learning paradigm relies on supervised tasks, failing to leverage the vast number of unlabeled nodes in the graph. Second, many approaches require complex task adaptation or fine-tuning during inference, limiting their efficiency and applicability. Inspired by the powerful in-context learning capabilities of large language models, we propose a novel model named VISION for adVancIng graph few-Shot learning via In-cOntext LearNing to address these challenges. Our model reframes graph few-shot learning as a fine-tuning-free sequence reasoning problem. At its core is a context-aware network that initializes nodes with role embeddings and employs a dual-context fusion module to synergistically integrate local topological structures and global task-level dependencies. This allows our model to dynamically generate class-aware representations for the query set conditioned on the support set context in a single forward pass. To effectively train our model, we introduce an unsupervised task generator that creates structure-adaptive features and constructs diverse pseudo-tasks from abundant unlabeled data. Our method unifies unsupervised meta-learning with graph in-context learning, achieving efficient inference. Extensive experiments on multiple benchmark datasets demonstrate the superiority of our model. Our public code can be found

* KDD26

Via

Access Paper or Ask Questions

Dual-level Mixup for Graph Few-shot Learning with Fewer Tasks

Feb 19, 2025

Yonghao Liu, Mengyu Li, Fausto Giunchiglia, Lan Huang, Ximing Li, Xiaoyue Feng, Renchu Guan

Figure 1 for Dual-level Mixup for Graph Few-shot Learning with Fewer Tasks

Figure 2 for Dual-level Mixup for Graph Few-shot Learning with Fewer Tasks

Figure 3 for Dual-level Mixup for Graph Few-shot Learning with Fewer Tasks

Figure 4 for Dual-level Mixup for Graph Few-shot Learning with Fewer Tasks

Abstract:Graph neural networks have been demonstrated as a powerful paradigm for effectively learning graph-structured data on the web and mining content from it.Current leading graph models require a large number of labeled samples for training, which unavoidably leads to overfitting in few-shot scenarios. Recent research has sought to alleviate this issue by simultaneously leveraging graph learning and meta-learning paradigms. However, these graph meta-learning models assume the availability of numerous meta-training tasks to learn transferable meta-knowledge. Such assumption may not be feasible in the real world due to the difficulty of constructing tasks and the substantial costs involved. Therefore, we propose a SiMple yet effectIve approach for graph few-shot Learning with fEwer tasks, named SMILE. We introduce a dual-level mixup strategy, encompassing both within-task and across-task mixup, to simultaneously enrich the available nodes and tasks in meta-learning. Moreover, we explicitly leverage the prior information provided by the node degrees in the graph to encode expressive node representations. Theoretically, we demonstrate that SMILE can enhance the model generalization ability. Empirically, SMILE consistently outperforms other competitive models by a large margin across all evaluated datasets with in-domain and cross-domain settings. Our anonymous code can be found here.

* WWW25

Via

Access Paper or Ask Questions

Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning

Jan 16, 2025

Yonghao Liu, Mengyu Li, Wei Pang, Fausto Giunchiglia, Lan Huang, Xiaoyue Feng, Renchu Guan

Figure 1 for Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning

Figure 2 for Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning

Figure 3 for Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning

Figure 4 for Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning

Abstract:Short text classification, as a research subtopic in natural language processing, is more challenging due to its semantic sparsity and insufficient labeled samples in practical scenarios. We propose a novel model named MI-DELIGHT for short text classification in this work. Specifically, it first performs multi-source information (i.e., statistical information, linguistic information, and factual information) exploration to alleviate the sparsity issues. Then, the graph learning approach is adopted to learn the representation of short texts, which are presented in graph forms. Moreover, we introduce a dual-level (i.e., instance-level and cluster-level) contrastive learning auxiliary task to effectively capture different-grained contrastive information within massive unlabeled data. Meanwhile, previous models merely perform the main task and auxiliary tasks in parallel, without considering the relationship among tasks. Therefore, we introduce a hierarchical architecture to explicitly model the correlations between tasks. We conduct extensive experiments across various benchmark datasets, demonstrating that MI-DELIGHT significantly surpasses previous competitive models. It even outperforms popular large language models on several datasets.

* AAAI2025

Via

Access Paper or Ask Questions

A Simple Graph Contrastive Learning Framework for Short Text Classification

Jan 16, 2025

Yonghao Liu, Fausto Giunchiglia, Lan Huang, Ximing Li, Xiaoyue Feng, Renchu Guan

Figure 1 for A Simple Graph Contrastive Learning Framework for Short Text Classification

Figure 2 for A Simple Graph Contrastive Learning Framework for Short Text Classification

Figure 3 for A Simple Graph Contrastive Learning Framework for Short Text Classification

Figure 4 for A Simple Graph Contrastive Learning Framework for Short Text Classification

Abstract:Short text classification has gained significant attention in the information age due to its prevalence and real-world applications. Recent advancements in graph learning combined with contrastive learning have shown promising results in addressing the challenges of semantic sparsity and limited labeled data in short text classification. However, existing models have certain limitations. They rely on explicit data augmentation techniques to generate contrastive views, resulting in semantic corruption and noise. Additionally, these models only focus on learning the intrinsic consistency between the generated views, neglecting valuable discriminative information from other potential views. To address these issues, we propose a Simple graph contrastive learning framework for Short Text Classification (SimSTC). Our approach involves performing graph learning on multiple text-related component graphs to obtain multi-view text embeddings. Subsequently, we directly apply contrastive learning on these embeddings. Notably, our method eliminates the need for data augmentation operations to generate contrastive views while still leveraging the benefits of multi-view contrastive learning. Despite its simplicity, our model achieves outstanding performance, surpassing large language models on various datasets.

* AAAI2025

Via

Access Paper or Ask Questions

Enhancing Unsupervised Graph Few-shot Learning via Set Functions and Optimal Transport

Jan 10, 2025

Yonghao Liu, Fausto Giunchiglia, Ximing Li, Lan Huang, Xiaoyue Feng, Renchu Guan

Figure 1 for Enhancing Unsupervised Graph Few-shot Learning via Set Functions and Optimal Transport

Figure 2 for Enhancing Unsupervised Graph Few-shot Learning via Set Functions and Optimal Transport

Figure 3 for Enhancing Unsupervised Graph Few-shot Learning via Set Functions and Optimal Transport

Figure 4 for Enhancing Unsupervised Graph Few-shot Learning via Set Functions and Optimal Transport

Abstract:Graph few-shot learning has garnered significant attention for its ability to rapidly adapt to downstream tasks with limited labeled data, sparking considerable interest among researchers. Recent advancements in graph few-shot learning models have exhibited superior performance across diverse applications. Despite their successes, several limitations still exist. First, existing models in the meta-training phase predominantly focus on instance-level features within tasks, neglecting crucial set-level features essential for distinguishing between different categories. Second, these models often utilize query sets directly on classifiers trained with support sets containing only a few labeled examples, overlooking potential distribution shifts between these sets and leading to suboptimal performance. Finally, previous models typically require necessitate abundant labeled data from base classes to extract transferable knowledge, which is typically infeasible in real-world scenarios. To address these issues, we propose a novel model named STAR, which leverages Set funcTions and optimAl tRansport for enhancing unsupervised graph few-shot learning. Specifically, STAR utilizes expressive set functions to obtain set-level features in an unsupervised manner and employs optimal transport principles to align the distributions of support and query sets, thereby mitigating distribution shift effects. Theoretical analysis demonstrates that STAR can capture more task-relevant information and enhance generalization capabilities. Empirically, extensive experiments across multiple datasets validate the effectiveness of STAR. Our code can be found here.

* KDD2025

Via

Access Paper or Ask Questions

Meta-GPS++: Enhancing Graph Meta-Learning with Contrastive Learning and Self-Training

Jul 20, 2024

Yonghao Liu, Mengyu Li, Ximing Li, Lan Huang, Fausto Giunchiglia, Yanchun Liang, Xiaoyue Feng, Renchu Guan

Figure 1 for Meta-GPS++: Enhancing Graph Meta-Learning with Contrastive Learning and Self-Training

Figure 2 for Meta-GPS++: Enhancing Graph Meta-Learning with Contrastive Learning and Self-Training

Figure 3 for Meta-GPS++: Enhancing Graph Meta-Learning with Contrastive Learning and Self-Training

Figure 4 for Meta-GPS++: Enhancing Graph Meta-Learning with Contrastive Learning and Self-Training

Abstract:Node classification is an essential problem in graph learning. However, many models typically obtain unsatisfactory performance when applied to few-shot scenarios. Some studies have attempted to combine meta-learning with graph neural networks to solve few-shot node classification on graphs. Despite their promising performance, some limitations remain. First, they employ the node encoding mechanism of homophilic graphs to learn node embeddings, even in heterophilic graphs. Second, existing models based on meta-learning ignore the interference of randomness in the learning process. Third, they are trained using only limited labeled nodes within the specific task, without explicitly utilizing numerous unlabeled nodes. Finally, they treat almost all sampled tasks equally without customizing them for their uniqueness. To address these issues, we propose a novel framework for few-shot node classification called Meta-GPS++. Specifically, we first adopt an efficient method to learn discriminative node representations on homophilic and heterophilic graphs. Then, we leverage a prototype-based approach to initialize parameters and contrastive learning for regularizing the distribution of node embeddings. Moreover, we apply self-training to extract valuable information from unlabeled nodes. Additionally, we adopt S$^2$ (scaling & shifting) transformation to learn transferable knowledge from diverse tasks. The results on real-world datasets show the superiority of Meta-GPS++. Our code is available here.

* ACM Transactions on Knowledge Discovery from Data (TKDD)

Via

Access Paper or Ask Questions

Resolving Word Vagueness with Scenario-guided Adapter for Natural Language Inference

May 21, 2024

Yonghao Liu, Mengyu Li, Di Liang, Ximing Li, Fausto Giunchiglia, Lan Huang, Xiaoyue Feng, Renchu Guan

Figure 1 for Resolving Word Vagueness with Scenario-guided Adapter for Natural Language Inference

Figure 2 for Resolving Word Vagueness with Scenario-guided Adapter for Natural Language Inference

Figure 3 for Resolving Word Vagueness with Scenario-guided Adapter for Natural Language Inference

Figure 4 for Resolving Word Vagueness with Scenario-guided Adapter for Natural Language Inference

Abstract:Natural Language Inference (NLI) is a crucial task in natural language processing that involves determining the relationship between two sentences, typically referred to as the premise and the hypothesis. However, traditional NLI models solely rely on the semantic information inherent in independent sentences and lack relevant situational visual information, which can hinder a complete understanding of the intended meaning of the sentences due to the ambiguity and vagueness of language. To address this challenge, we propose an innovative ScenaFuse adapter that simultaneously integrates large-scale pre-trained linguistic knowledge and relevant visual information for NLI tasks. Specifically, we first design an image-sentence interaction module to incorporate visuals into the attention mechanism of the pre-trained model, allowing the two modalities to interact comprehensively. Furthermore, we introduce an image-sentence fusion module that can adaptively integrate visual information from images and semantic information from sentences. By incorporating relevant visual information and leveraging linguistic knowledge, our approach bridges the gap between language and vision, leading to improved understanding and inference capabilities in NLI tasks. Extensive benchmark experiments demonstrate that our proposed ScenaFuse, a scenario-guided approach, consistently boosts NLI performance.

* IJCAI24

Via

Access Paper or Ask Questions

Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification

May 19, 2024

Mengyu Li, Yonghao Liu, Fausto Giunchiglia, Xiaoyue Feng, Renchu Guan

Figure 1 for Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification

Figure 2 for Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification

Figure 3 for Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification

Figure 4 for Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification

Abstract:Text classification is a crucial and fundamental task in natural language processing. Compared with the previous learning paradigm of pre-training and fine-tuning by cross entropy loss, the recently proposed supervised contrastive learning approach has received tremendous attention due to its powerful feature learning capability and robustness. Although several studies have incorporated this technique for text classification, some limitations remain. First, many text datasets are imbalanced, and the learning mechanism of supervised contrastive learning is sensitive to data imbalance, which may harm the model performance. Moreover, these models leverage separate classification branch with cross entropy and supervised contrastive learning branch without explicit mutual guidance. To this end, we propose a novel model named SharpReCL for imbalanced text classification tasks. First, we obtain the prototype vector of each class in the balanced classification branch to act as a representation of each class. Then, by further explicitly leveraging the prototype vectors, we construct a proper and sufficient target sample set with the same size for each class to perform the supervised contrastive learning procedure. The empirical results show the effectiveness of our model, which even outperforms popular large language models across several datasets.

* 12 pages, 9 figures

Via

Access Paper or Ask Questions

Time-aware Multiway Adaptive Fusion Network for Temporal Knowledge Graph Question Answering

Mar 14, 2023

Yonghao Liu, Di Liang, Fang Fang, Sirui Wang, Wei Wu, Rui Jiang

Figure 1 for Time-aware Multiway Adaptive Fusion Network for Temporal Knowledge Graph Question Answering

Figure 2 for Time-aware Multiway Adaptive Fusion Network for Temporal Knowledge Graph Question Answering

Figure 3 for Time-aware Multiway Adaptive Fusion Network for Temporal Knowledge Graph Question Answering

Figure 4 for Time-aware Multiway Adaptive Fusion Network for Temporal Knowledge Graph Question Answering

Abstract:Knowledge graphs (KGs) have received increasing attention due to its wide applications on natural language processing. However, its use case on temporal question answering (QA) has not been well-explored. Most of existing methods are developed based on pre-trained language models, which might not be capable to learn \emph{temporal-specific} presentations of entities in terms of temporal KGQA task. To alleviate this problem, we propose a novel \textbf{T}ime-aware \textbf{M}ultiway \textbf{A}daptive (\textbf{TMA}) fusion network. Inspired by the step-by-step reasoning behavior of humans. For each given question, TMA first extracts the relevant concepts from the KG, and then feeds them into a multiway adaptive module to produce a \emph{temporal-specific} representation of the question. This representation can be incorporated with the pre-trained KG embedding to generate the final prediction. Empirical results verify that the proposed model achieves better performance than the state-of-the-art models in the benchmark dataset. Notably, the Hits@1 and Hits@10 results of TMA on the CronQuestions dataset's complex questions are absolutely improved by 24\% and 10\% compared to the best-performing baseline. Furthermore, we also show that TMA employing an adaptive fusion mechanism can provide interpretability by analyzing the proportion of information in question representations.

* ICASSP 2023
* ICASSP 2023

Via

Access Paper or Ask Questions