Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Liang Du

Symmetry Nonnegative Matrix Factorization Algorithm Based on Self-paced Learning

Oct 20, 2024

Lei Wang, Liang Du, Peng Zhou, Peng Wu

Figure 1 for Symmetry Nonnegative Matrix Factorization Algorithm Based on Self-paced Learning

Figure 2 for Symmetry Nonnegative Matrix Factorization Algorithm Based on Self-paced Learning

Figure 3 for Symmetry Nonnegative Matrix Factorization Algorithm Based on Self-paced Learning

Figure 4 for Symmetry Nonnegative Matrix Factorization Algorithm Based on Self-paced Learning

Abstract:A symmetric nonnegative matrix factorization algorithm based on self-paced learning was proposed to improve the clustering performance of the model. It could make the model better distinguish normal samples from abnormal samples in an error-driven way. A weight variable that could measure the degree of difficulty to all samples was assigned in this method, and the variable was constrained by adopting both hard-weighting and soft-weighting strategies to ensure the rationality of the model. Cluster analysis was carried out on multiple data sets such as images and texts, and the experimental results showed the effectiveness of the proposed algorithm.

* Journal of Zhengzhou University(Natural Science Edition),2022,54 (05), 43-48
* in Chinese language

Via

Access Paper or Ask Questions

Unsupervised feature selection algorithm framework based on neighborhood interval disturbance fusion

Oct 20, 2024

Xiaolin Lv, Liang Du, Peng Zhou, Peng Wu

Abstract:Feature selection technology is a key technology of data dimensionality reduction. Becauseof the lack of label information of collected data samples, unsupervised feature selection has attracted more attention. The universality and stability of many unsupervised feature selection algorithms are very low and greatly affected by the dataset structure. For this reason, many researchers have been keen to improve the stability of the algorithm. This paper attempts to preprocess the data set and use an interval method to approximate the data set, experimentally verifying the advantages and disadvantages of the new interval data set. This paper deals with these data sets from the global perspective and proposes a new algorithm-unsupervised feature selection algorithm based on neighborhood interval disturbance fusion(NIDF). This method can realize the joint learning of the final score of the feature and the approximate data interval. By comparing with the original unsupervised feature selection methods and several existing feature selection frameworks, the superiority of the proposed model is verified.

* Journal of Nanjing University of Science and Technology, 2021, 45(04), 420-428
* in Chinese language

Via

Access Paper or Ask Questions

SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture

Oct 10, 2024

Jiayi Han, Liang Du, Hongwei Du, Xiangguo Zhou, Yiwen Wu, Weibo Zheng, Donghong Han

Figure 1 for SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture

Figure 2 for SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture

Figure 3 for SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture

Figure 4 for SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture

Abstract:Although many efforts have been made, it is still a challenge to balance the training budget, downstream performance, and the general capabilities of the LLMs in many applications. Training the whole model for downstream tasks is expensive, and could easily result in catastrophic forgetting. By introducing parameter-efficient fine-tuning (PEFT), the training cost could be reduced, but it still suffers from forgetting, and limits the learning on the downstream tasks. To efficiently fine-tune the LLMs with less limitation to their downstream performance while mitigating the forgetting of general capabilities, we propose a novel mixture of expert (MoE) framework based on Soft LoRA and Identity Mixture (SLIM), that allows dynamic routing between LoRA adapters and skipping connection, enables the suppression of forgetting. We adopt weight-yielding with sliding clustering for better out-of-domain distinguish to enhance the routing. We also propose to convert the mixture of low-rank adapters to the model merging formulation and introduce fast dynamic merging of LoRA adapters to keep the general capabilities of the base model. Extensive experiments demonstrate that the proposed SLIM is comparable to the state-of-the-art PEFT approaches on the downstream tasks while achieving the leading performance in mitigating catastrophic forgetting.

* 11 pages, 6 figures, 4 tables

Via

Access Paper or Ask Questions

Dynamic Neural Dowker Network: Approximating Persistent Homology in Dynamic Directed Graphs

Aug 17, 2024

Hao Li, Hao Jiang, Jiajun Fan, Dongsheng Ye, Liang Du

Abstract:Persistent homology, a fundamental technique within Topological Data Analysis (TDA), captures structural and shape characteristics of graphs, yet encounters computational difficulties when applied to dynamic directed graphs. This paper introduces the Dynamic Neural Dowker Network (DNDN), a novel framework specifically designed to approximate the results of dynamic Dowker filtration, aiming to capture the high-order topological features of dynamic directed graphs. Our approach creatively uses line graph transformations to produce both source and sink line graphs, highlighting the shared neighbor structures that Dowker complexes focus on. The DNDN incorporates a Source-Sink Line Graph Neural Network (SSLGNN) layer to effectively capture the neighborhood relationships among dynamic edges. Additionally, we introduce an innovative duality edge fusion mechanism, ensuring that the results for both the sink and source line graphs adhere to the duality principle intrinsic to Dowker complexes. Our approach is validated through comprehensive experiments on real-world datasets, demonstrating DNDN's capability not only to effectively approximate dynamic Dowker filtration results but also to perform exceptionally in dynamic graph classification tasks.

* KDD 2024

Via

Access Paper or Ask Questions

Stable Heterogeneous Treatment Effect Estimation across Out-of-Distribution Populations

Jul 03, 2024

Yuling Zhang, Anpeng Wu, Kun Kuang, Liang Du, Zixun Sun, Zhi Wang

Figure 1 for Stable Heterogeneous Treatment Effect Estimation across Out-of-Distribution Populations

Figure 2 for Stable Heterogeneous Treatment Effect Estimation across Out-of-Distribution Populations

Figure 3 for Stable Heterogeneous Treatment Effect Estimation across Out-of-Distribution Populations

Figure 4 for Stable Heterogeneous Treatment Effect Estimation across Out-of-Distribution Populations

Abstract:Heterogeneous treatment effect (HTE) estimation is vital for understanding the change of treatment effect across individuals or subgroups. Most existing HTE estimation methods focus on addressing selection bias induced by imbalanced distributions of confounders between treated and control units, but ignore distribution shifts across populations. Thereby, their applicability has been limited to the in-distribution (ID) population, which shares a similar distribution with the training dataset. In real-world applications, where population distributions are subject to continuous changes, there is an urgent need for stable HTE estimation across out-of-distribution (OOD) populations, which, however, remains an open problem. As pioneers in resolving this problem, we propose a novel Stable Balanced Representation Learning with Hierarchical-Attention Paradigm (SBRL-HAP) framework, which consists of 1) Balancing Regularizer for eliminating selection bias, 2) Independence Regularizer for addressing the distribution shift issue, 3) Hierarchical-Attention Paradigm for coordination between balance and independence. In this way, SBRL-HAP regresses counterfactual outcomes using ID data, while ensuring the resulting HTE estimation can be successfully generalized to out-of-distribution scenarios, thereby enhancing the model's applicability in real-world settings. Extensive experiments conducted on synthetic and real-world datasets demonstrate the effectiveness of our SBRL-HAP in achieving stable HTE estimation across OOD populations, with an average 10% reduction in the error metric PEHE and 11% decrease in the ATE bias, compared to the SOTA methods.

* Accepted by ICDE'2024

Via

Access Paper or Ask Questions

Fast Asymmetric Factorization for Large Scale Multiple Kernel Clustering

May 26, 2024

Yan Chen, Liang Du, Lei Duan

Abstract:Kernel methods are extensively employed for nonlinear data clustering, yet their effectiveness heavily relies on selecting suitable kernels and associated parameters, posing challenges in advance determination. In response, Multiple Kernel Clustering (MKC) has emerged as a solution, allowing the fusion of information from multiple base kernels for clustering. However, both early fusion and late fusion methods for large-scale MKC encounter challenges in memory and time constraints, necessitating simultaneous optimization of both aspects. To address this issue, we propose Efficient Multiple Kernel Concept Factorization (EMKCF), which constructs a new sparse kernel matrix inspired by local regression to achieve memory efficiency. EMKCF learns consensus and individual representations by extending orthogonal concept factorization to handle multiple kernels for time efficiency. Experimental results demonstrate the efficiency and effectiveness of EMKCF on benchmark datasets compared to state-of-the-art methods. The proposed method offers a straightforward, scalable, and effective solution for large-scale MKC tasks.

Via

Access Paper or Ask Questions

Supervisory Prompt Training

Mar 26, 2024

Jean Ghislain Billa, Min Oh, Liang Du

Figure 1 for Supervisory Prompt Training

Figure 2 for Supervisory Prompt Training

Figure 3 for Supervisory Prompt Training

Figure 4 for Supervisory Prompt Training

Abstract:The performance of Large Language Models (LLMs) relies heavily on the quality of prompts, which are often manually engineered and task-specific, making them costly and non-scalable. We propose a novel approach, Supervisory Prompt Training (SPT). SPT automates the generation of highly effective prompts using a dual LLM system. In this system, one LLM, the generator, performs a task while the other, the corrector, provides feedback and generates improved prompts. In contrast to earlier techniques, both the generator and corrector collaboratively and continuously improve their prompts over time. We also introduce the concept of \textit{impact scores} to measure the sentence-level effectiveness of the prompts. Our method was tested on four benchmarks, testing the level of hallucinations in LLMs. Notably, we were able to increase the accuracy of GPT-4 on GSM8K from 65.8\% to 94.1\% (28.3\% increase). SPT advances LLMs by refining prompts to enhance performance and reduce hallucinations, offering an efficient and scalable alternative to traditional model fine-tuning.

Via

Access Paper or Ask Questions

Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning

Oct 07, 2023

Murong Yue, Jie Zhao, Min Zhang, Liang Du, Ziyu Yao

Figure 1 for Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning

Figure 2 for Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning

Figure 3 for Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning

Figure 4 for Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning

Abstract:Large language models (LLMs) such as GPT-4 have exhibited remarkable performance in a variety of tasks, but this strong performance often comes with the high expense of using paid API services. In this paper, we are motivated to study building an LLM cascade to save the cost of using LLMs, particularly for performing reasoning (e.g., mathematical, causal) tasks. Our cascade pipeline follows the intuition that simpler questions can be addressed by a weaker but more affordable LLM, whereas only the challenging questions necessitate the stronger and more expensive LLM. To realize this decision-making, we consider the "answer consistency" of the weaker LLM as a signal of the question difficulty and propose several methods for the answer sampling and consistency checking, including one leveraging a mixture of two thought representations (i.e., Chain-of-Thought and Program-of-Thought). Through experiments on six reasoning benchmark datasets, with GPT-3.5-turbo and GPT-4 being the weaker and stronger LLMs, respectively, we demonstrate that our proposed LLM cascades can achieve performance comparable to using solely the stronger LLM but require only 40% of its cost.

Via

Access Paper or Ask Questions

REFORM: Removing False Correlation in Multi-level Interaction for CTR Prediction

Sep 26, 2023

Songli Wu, Liang Du, Jia-Qi Yang, Yuai Wang, De-Chuan Zhan, Shuang Zhao, Zixun Sun

Abstract:Click-through rate (CTR) prediction is a critical task in online advertising and recommendation systems, as accurate predictions are essential for user targeting and personalized recommendations. Most recent cutting-edge methods primarily focus on investigating complex implicit and explicit feature interactions. However, these methods neglect the issue of false correlations caused by confounding factors or selection bias. This problem is further magnified by the complexity and redundancy of these interactions. We propose a CTR prediction framework that removes false correlation in multi-level feature interaction, termed REFORM. The proposed REFORM framework exploits a wide range of multi-level high-order feature representations via a two-stream stacked recurrent structure while eliminating false correlations. The framework has two key components: I. The multi-level stacked recurrent (MSR) structure enables the model to efficiently capture diverse nonlinear interactions from feature spaces of different levels, and the richer representations lead to enhanced CTR prediction accuracy. II. The false correlation elimination (FCE) module further leverages Laplacian kernel mapping and sample reweighting methods to eliminate false correlations concealed within the multi-level features, allowing the model to focus on the true causal effects. Extensive experiments based on four challenging CTR datasets and our production dataset demonstrate that the proposed REFORM model achieves state-of-the-art performance. Codes, models and our dataset will be released at https://github.com/yansuoyuli/REFORM.

* 9 pages, 5 figures

Via

Access Paper or Ask Questions

BatchPrompt: Accomplish more with less

Sep 05, 2023

Jianzhe Lin, Maurice Diesendruck, Liang Du, Robin Abraham

Figure 1 for BatchPrompt: Accomplish more with less

Figure 2 for BatchPrompt: Accomplish more with less

Figure 3 for BatchPrompt: Accomplish more with less

Figure 4 for BatchPrompt: Accomplish more with less

Abstract:As the ever-increasing token limits of large language models (LLMs) have enabled long context as input, prompting with single data samples might no longer an efficient way. A straightforward strategy improving efficiency is to batch data within the token limit (e.g., 8k for gpt-3.5-turbo; 32k for GPT-4), which we call BatchPrompt. We have two initial observations for prompting with batched data. First, we find that prompting with batched data in longer contexts will inevitably lead to worse performance, compared to single-data prompting. Second, the performance of the language model is significantly correlated with the positions and order of the batched data, due to the corresponding change in decoder context. To retain efficiency and overcome performance loss, we propose Batch Permutation and Ensembling (BPE), and a novel Self-reflection-guided EArly Stopping (SEAS) technique. Our comprehensive experimental evaluation demonstrates that BPE can boost the performance of BatchPrompt with a striking margin on a range of popular NLP tasks, including question answering (Boolq), textual entailment (RTE), and duplicate questions identification (QQP). These performances are even competitive with/higher than single-data prompting(SinglePrompt), while BatchPrompt requires much fewer LLM calls and input tokens (For SinglePrompt v.s. BatchPrompt with batch size 32, using just 9%-16% the number of LLM calls, Boolq accuracy 90.6% to 90.9% with 27.4% tokens, QQP accuracy 87.2% to 88.4% with 18.6% tokens, RTE accuracy 91.5% to 91.1% with 30.8% tokens). To the best of our knowledge, this is the first work to technically improve prompting efficiency of large language models. We hope our simple yet effective approach will shed light on the future research of large language models. The code will be released.

* 20 pages, 5 figures

Via

Access Paper or Ask Questions