In this paper, we consider a problem of self-supervised learning for small-scale datasets based on contrastive loss between multiple views of the data, which demonstrates the state-of-the-art performance in classification task. Despite the reported results, such factors as the complexity of training requiring complex architectures, the needed number of views produced by data augmentation, and their impact on the classification accuracy are understudied problems. To establish the role of these factors, we consider an architecture of contrastive loss system such as SimCLR, where baseline model is replaced by geometrically invariant "hand-crafted" network ScatNet with small trainable adapter network and argue that the number of parameters of the whole system and the number of views can be considerably reduced while practically preserving the same classification accuracy. In addition, we investigate the impact of regularization strategies using pretext task learning based on an estimation of parameters of augmentation transform such as rotation and jigsaw permutation for both traditional baseline models and ScatNet based models. Finally, we demonstrate that the proposed architecture with pretext task learning regularization achieves the state-of-the-art classification performance with a smaller number of trainable parameters and with reduced number of views.
We propose a scheme for multi-layer representation of images. The problem is first treated from an information-theoretic viewpoint where we analyze the behavior of different sources of information under a multi-layer data compression framework and compare it with a single-stage (shallow) structure. We then consider the image data as the source of information and link the proposed representation scheme to the problem of multi-layer dictionary learning for visual data. For the current work we focus on the problem of image compression for a special class of images where we report a considerable performance boost in terms of PSNR at high compression ratios in comparison with the JPEG2000 codec.