Abstract: On-device learning has emerged as a promising direction for AI development, particularly because of its potential to reduce latency and mitigate the privacy risks associated with device-server communication, while improving energy efficiency. Despite these advantages, significant memory and computational constraints remain major challenges for its deployment. Drawing on previous studies of low-rank decomposition methods that address the activation-memory bottleneck in backpropagation, we propose a novel shortcut approach as an alternative. Our analysis and experiments demonstrate that our method reduces activation memory usage by up to $120.09\times$ compared to vanilla training, while also reducing overall training FLOPs by up to $1.86\times$ when evaluated on traditional benchmarks.
Abstract: The Internet of Things and Deep Learning are synergistically and exponentially growing industrial fields, with a strong call for their unification into a common framework called Edge AI. While on-device inference is a well-explored topic in recent research, backpropagation remains an open challenge due to its prohibitive computational and memory costs relative to the extreme resource constraints of embedded devices. Drawing on tensor decomposition research, we tackle the main bottleneck of backpropagation, namely the memory footprint of activation map storage. We investigate and compare the effects of activation compression using Singular Value Decomposition (SVD) and its tensor variant, High-Order Singular Value Decomposition (HOSVD). The application of low-rank decomposition yields considerable memory savings while preserving the features essential for learning, and also offers theoretical convergence guarantees. Experimental results obtained on mainstream architectures and tasks demonstrate Pareto-superiority over other state-of-the-art solutions in terms of the trade-off between generalization and memory footprint.
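To make the compression scheme concrete, the following is a minimal illustrative sketch of truncated-SVD activation compression, not the paper's implementation: the rank `r`, the `(batch*H*W, channels)` flattening, and all function names are assumptions chosen for the example.

```python
# Minimal sketch of truncated-SVD activation compression (illustrative only;
# the rank `r` and the (batch*H*W, channels) flattening are assumptions).
import numpy as np

def compress(activation: np.ndarray, r: int):
    """Store an activation map as two low-rank factors instead of the full matrix."""
    U, S, Vt = np.linalg.svd(activation, full_matrices=False)
    return U[:, :r] * S[:r], Vt[:r]          # shapes: (n, r) and (r, m)

def decompress(US: np.ndarray, Vt: np.ndarray) -> np.ndarray:
    """Reconstruct the (approximate) activation when it is needed for the backward pass."""
    return US @ Vt

# Example: a conv feature map flattened to (batch*H*W, channels).
act = np.random.randn(4 * 16 * 16, 256).astype(np.float32)
US, Vt = compress(act, r=8)
ratio = act.nbytes / (US.nbytes + Vt.nbytes)
err = np.linalg.norm(act - decompress(US, Vt)) / np.linalg.norm(act)
print(f"memory saving ~{ratio:.1f}x, relative reconstruction error {err:.3f}")
```

HOSVD extends the same idea by truncating each mode of the 4D activation tensor directly, rather than flattening it to a matrix first.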