Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ayman A. Ameen

LoC-LIC: Low Complexity Learned Image Coding Using Hierarchical Feature Transforms

Apr 30, 2025

Ayman A. Ameen, Thomas Richter, André Kaup

Abstract:Current learned image compression models typically exhibit high complexity, which demands significant computational resources. To overcome these challenges, we propose an innovative approach that employs hierarchical feature extraction transforms to significantly reduce complexity while preserving bit rate reduction efficiency. Our novel architecture achieves this by using fewer channels for high spatial resolution inputs/feature maps. On the other hand, feature maps with a large number of channels have reduced spatial dimensions, thereby cutting down on computational load without sacrificing performance. This strategy effectively reduces the forward pass complexity from $1256 \, \text{kMAC/Pixel}$ to just $270 \, \text{kMAC/Pixel}$. As a result, the reduced complexity model can open the way for learned image compression models to operate efficiently across various devices and pave the way for the development of new architectures in image compression technology.

Via

Access Paper or Ask Questions

Compact Latent Representation for Image Compression (CLRIC)

Feb 20, 2025

Ayman A. Ameen, Thomas Richter, André Kaup

Abstract:Current image compression models often require separate models for each quality level, making them resource-intensive in terms of both training and storage. To address these limitations, we propose an innovative approach that utilizes latent variables from pre-existing trained models (such as the Stable Diffusion Variational Autoencoder) for perceptual image compression. Our method eliminates the need for distinct models dedicated to different quality levels. We employ overfitted learnable functions to compress the latent representation from the target model at any desired quality level. These overfitted functions operate in the latent space, ensuring low computational complexity, around $25.5$ MAC/pixel for a forward pass on images with dimensions $(1363 \times 2048)$ pixels. This approach efficiently utilizes resources during both training and decoding. Our method achieves comparable perceptual quality to state-of-the-art learned image compression models while being both model-agnostic and resolution-agnostic. This opens up new possibilities for the development of innovative image compression methods.

Via

Access Paper or Ask Questions