Abstract: Fine-tuning large language models (LLMs) is difficult due to their enormous size. Recent Fourier-domain methods show potential for reducing fine-tuning costs. We propose a block circulant matrix-based fine-tuning method with a stable training heuristic that leverages the properties of circulant matrices and one-dimensional Fourier transforms to reduce storage and computation costs. Experiments show that our method uses $14\times$ fewer parameters than VeRA, $16\times$ fewer than LoRA, and requires $32\times$ fewer FLOPs than FourierFT, while maintaining comparable or better task performance. Our approach presents a promising frequency-domain way to fine-tune large models on downstream tasks.
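The key computational idea is that a block circulant weight can be applied with 1D FFTs instead of a dense matrix multiply. The sketch below is illustrative only; the function name, array shapes, and the NumPy/SciPy verification are my own choices, not the paper's implementation, and assume each $b \times b$ block is parameterized by its first column.

```python
import numpy as np
from scipy.linalg import circulant

def block_circulant_matvec(C, x, b):
    """Multiply a block circulant matrix by a vector using 1D FFTs.

    C : (p, q, b) array -- C[i, j] is the defining (first-column) vector
        of the b x b circulant block in block-row i, block-column j.
    x : (q * b,) input vector, viewed as q chunks of length b.
    Returns y of shape (p * b,).
    """
    p, q, _ = C.shape
    X = np.fft.fft(x.reshape(q, b), axis=-1)   # 1D FFT of each input chunk
    Cf = np.fft.fft(C, axis=-1)                # 1D FFT of each block's defining vector
    # Elementwise product in the frequency domain, summed over block columns
    Y = (Cf * X[None, :, :]).sum(axis=1)
    return np.fft.ifft(Y, axis=-1).real.reshape(p * b)

# Sanity check against an explicitly constructed dense block circulant matrix
p, q, b = 2, 3, 4
rng = np.random.default_rng(0)
C = rng.standard_normal((p, q, b))
x = rng.standard_normal(q * b)
dense = np.block([[circulant(C[i, j]) for j in range(q)] for i in range(p)])
assert np.allclose(dense @ x, block_circulant_matvec(C, x, b))
```

Storing only the $p \times q \times b$ defining vectors rather than the full $pb \times qb$ matrix is what yields the parameter and FLOP savings described above.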
Abstract: Foundation models have achieved tremendous success across different domains. However, their enormous computation and storage costs make them difficult to fine-tune and less practical to deploy. Recent studies show that training in the Fourier domain can be an effective fine-tuning method in terms of both model performance and number of trainable parameters. In this work, we propose to further reduce complexity by factorizing the weight update as a product of interleaved circulant and diagonal matrices. In addition, we handle non-square fine-tuning weights by partitioning the circulant matrix into blocks. Our method avoids constructing the weight change matrix explicitly and uses the one-dimensional fast Fourier transform (FFT) instead of the 2D FFT. Experimental results show that our method achieves similar or better performance across various tasks with far fewer floating-point operations (FLOPs) and trainable parameters.
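To make the factorization concrete, the hedged sketch below applies a product of interleaved circulant and diagonal factors to an input vector, each circulant factor via the 1D FFT. The number of factors, their ordering, and the parameter shapes here are illustrative assumptions rather than the paper's exact parameterization.

```python
import numpy as np

def interleaved_circ_diag_apply(x, circ_vecs, diag_vecs):
    """Apply y = D_k * circ(c_k) * ... * D_1 * circ(c_1) * x.

    circ_vecs : list of length-n arrays, each defining a circulant factor
                by its first column.
    diag_vecs : list of length-n arrays, each the diagonal of a scaling factor.
    Each circulant factor costs O(n log n) via the 1D FFT; each diagonal
    factor is an elementwise scaling, so no n x n matrix is ever built.
    """
    y = x
    for c, d in zip(circ_vecs, diag_vecs):
        y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(y)).real  # circulant factor
        y = d * y                                             # diagonal factor
    return y

# Example usage with two interleaved circulant/diagonal pairs
n = 8
rng = np.random.default_rng(1)
x = rng.standard_normal(n)
circ_vecs = [rng.standard_normal(n) for _ in range(2)]
diag_vecs = [rng.standard_normal(n) for _ in range(2)]
y = interleaved_circ_diag_apply(x, circ_vecs, diag_vecs)
```

Because only the defining vectors of the circulant factors and the diagonals are stored and trained, the parameter count grows linearly in $n$ rather than quadratically.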