Abstract: Motion compensation is a key component of video codecs. Conventional codecs (HEVC and VVC) have carefully refined this coding step, with a strong focus on sub-pixel motion compensation. Learned codecs, on the other hand, achieve sub-pixel motion compensation through simple bilinear filtering. This paper proposes to improve motion compensation in learned codecs by drawing inspiration from conventional codecs. It is shown that more advanced interpolation filters, block-based motion information and finite motion accuracy lead to better compression performance and lower decoding complexity. Experimental results are provided on the Cool-chic video codec, where we demonstrate a rate decrease of more than 10% and a reduction of motion-related decoding complexity from 391 MAC per pixel to 214 MAC per pixel. All contributions are made open-source at https://github.com/Orange-OpenSource/Cool-Chic
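To make the baseline concrete, the following is a minimal sketch of sub-pixel motion compensation through bilinear filtering, together with a motion-vector quantizer illustrating finite motion accuracy. It is not the Cool-chic implementation: the function names, the edge-clamping policy, the (dy, dx) vector layout and the quarter-pixel default are all assumptions made for illustration.

```python
import math


def quantize_mv(component, steps_per_pixel=4):
    """Round one motion-vector component to a finite accuracy.

    steps_per_pixel=4 means quarter-pixel accuracy (an assumed default).
    """
    return round(component * steps_per_pixel) / steps_per_pixel


def bilinear_sample(frame, y, x):
    """Sample a 2-D frame (list of rows) at the fractional position (y, x)."""
    height, width = len(frame), len(frame[0])
    # Clamp the position so that the four neighbours always exist
    # (assumed edge policy: replicate the border).
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    fy, fx = y - y0, x - x0
    # Interpolate horizontally on both rows, then vertically.
    top = (1 - fx) * frame[y0][x0] + fx * frame[y0][x1]
    bottom = (1 - fx) * frame[y1][x0] + fx * frame[y1][x1]
    return (1 - fy) * top + fy * bottom


def motion_compensate(reference, flow):
    """Predict a frame from a reference and a per-pixel motion field.

    flow[i][j] is a (dy, dx) displacement: pixel (i, j) of the prediction
    is fetched from the reference at (i + dy, j + dx).
    """
    height, width = len(reference), len(reference[0])
    return [
        [
            bilinear_sample(reference, i + flow[i][j][0], j + flow[i][j][1])
            for j in range(width)
        ]
        for i in range(height)
    ]
```

Replacing `bilinear_sample` with a longer interpolation filter, or sharing one vector across a block of pixels, would be the natural places to experiment with the ideas named in the abstract.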
Abstract: Neural image compression, based on auto-encoders and overfitted representations, relies on a latent representation of the coded signal. This representation needs to be compact and uses low-resolution feature maps. In the decoding process, these latents are upsampled and filtered using stacks of convolution filters and nonlinear elements to recover the decoded image. The upsampling process is therefore crucial in the design of a neural coding scheme, and is of particular importance for overfitted codecs, where the network parameters, including the upsampling filters, are part of the representation. This paper improves the upsampling process in order to reduce its complexity and limit its number of parameters. A new upsampling structure is presented, and its benefits are illustrated within the Cool-Chic overfitted image coding framework. The proposed approach offers a rate reduction of 4.7%. The code is provided.
Abstract: Overfitted image codecs offer compelling compression performance and low decoder complexity, through the overfitting of a lightweight decoder for each image. Such codecs include Cool-chic, which presents image coding performance on par with VVC while requiring around 2000 multiplications per decoded pixel. This paper proposes to decrease Cool-chic encoding and decoding complexity. The encoding complexity is reduced by shortening Cool-chic training, up to the point where no overfitting is performed at all. It is also shown that a tiny neural decoder with 300 multiplications per pixel still outperforms HEVC. A near real-time CPU implementation of this decoder is made available at https://orange-opensource.github.io/Cool-Chic/.
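As a rough illustration of how a "multiplications per decoded pixel" figure can be derived, the sketch below counts the multiplications of a fully connected network evaluated once per pixel. The layer sizes are hypothetical and do not correspond to any published Cool-chic configuration; the real decoder also contains other modules (such as upsampling and entropy modelling) that this count ignores.

```python
def multiplications_per_pixel(layer_sizes):
    """Count multiplications for one forward pass of a dense network.

    layer_sizes lists the width of each layer, input first. A dense layer
    mapping n_in features to n_out features costs n_in * n_out products.
    """
    return sum(
        n_in * n_out for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    )


# Hypothetical per-pixel network: 7 inputs, two hidden layers of 16, 3 outputs.
example_cost = multiplications_per_pixel([7, 16, 16, 3])
```

With these assumed sizes the count is 7*16 + 16*16 + 16*3 = 416 multiplications per pixel, which shows how quickly the hidden-layer width dominates the budget.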
Abstract: This paper summarises the design of the Cool-Chic candidate for the Challenge on Learned Image Compression. This candidate aims to demonstrate that neural coding methods can lead to low-complexity, lightweight image decoders while still offering competitive performance. The approach is based on the previously published Cool-Chic overfitted lightweight neural networks, further adapted to the human subjective viewing targeted in this challenge.
Abstract: We propose a reduced-complexity neural image codec which overfits the decoder parameters to each input image. While autoencoders perform up to a million multiplications per decoded pixel, the proposed approach only requires 2300 multiplications per pixel. Despite its low complexity, the method rivals autoencoder performance and surpasses HEVC performance under various coding conditions. Additional lightweight modules and an improved training process provide a 14% rate reduction with respect to previous overfitted codecs, while offering a similar complexity. This work is made open-source at https://orange-opensource.github.io/Cool-Chic/
Abstract: We introduce COOL-CHIC, a Coordinate-based Low Complexity Hierarchical Image Codec. It is a learned alternative to autoencoders with approximately 2000 parameters and 2500 multiplications per decoded pixel. Despite its low complexity, COOL-CHIC offers compression performance close to modern conventional MPEG codecs such as HEVC and VVC. This method is inspired by coordinate-based neural representations, where an image is represented as a learned function which maps pixel coordinates to RGB values. The parameters of the mapping function are then sent using entropy coding. At the receiver side, the compressed image is obtained by evaluating the mapping function for all pixel coordinates. The COOL-CHIC implementation is available upon request.
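The receiver-side step described above, evaluating a mapping function at every pixel coordinate, can be sketched as follows. This illustrates the generic coordinate-based representation that inspired the method, not the COOL-CHIC architecture itself. The network sizes, the random weights standing in for learned parameters, and the coordinate normalisation are all assumptions.

```python
import random


def make_mlp(layer_sizes, seed=0):
    """Build a small dense network with random weights and zero biases.

    In a real codec these parameters would be learned and entropy coded;
    random values are used here only so that the sketch runs.
    """
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def count_parameters(layers):
    """Number of scalars that would have to be transmitted."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


def mlp_forward(layers, inputs):
    """Evaluate the mapping function for one coordinate pair."""
    values = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        values = [
            sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, biases)
        ]
        if index < len(layers) - 1:
            values = [max(0.0, v) for v in values]  # ReLU on hidden layers
    return values


def decode_image(layers, height, width):
    """Evaluate the mapping at every pixel; returns rows of [R, G, B] lists."""
    image = []
    for i in range(height):
        row = []
        for j in range(width):
            # Normalise coordinates to [0, 1] (an assumed convention).
            coords = (i / max(height - 1, 1), j / max(width - 1, 1))
            row.append(mlp_forward(layers, coords))
        image.append(row)
    return image
```

For example, `decode_image(make_mlp([2, 8, 3]), 4, 5)` evaluates a 51-parameter mapping on a 4 by 5 grid; the image size never changes the number of parameters to transmit, only the number of evaluations.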
Abstract: This paper presents the AIVC submission to the CLIC 2022 video track. AIVC is a fully-learned video codec based on conditional autoencoders. The flexibility of the AIVC models is leveraged to implement rate allocation and frame structure competition to select the optimal coding configuration per-sequence. This competition yields compelling compression performance, offering a rate reduction of 26% compared with the absence of competition.
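A per-sequence competition of this kind can be sketched as picking, among candidate coding configurations, the one with the lowest cost. The abstract does not state the selection criterion, so the Lagrangian rate-distortion cost used below is an assumption, as are the dictionary keys and function names.

```python
def rd_cost(rate, distortion, lmbda):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed criterion)."""
    return distortion + lmbda * rate


def select_configuration(candidates, lmbda):
    """Return the candidate configuration with the lowest rate-distortion cost.

    Each candidate is a dict with hypothetical keys "name", "rate" and
    "distortion", obtained by actually coding the sequence with that setup.
    """
    return min(
        candidates,
        key=lambda c: rd_cost(c["rate"], c["distortion"], lmbda),
    )
```

The winning configuration depends on the rate constraint: a larger `lmbda` penalises rate more heavily and favours cheaper candidates.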