Abstract: Privacy-preserving action recognition (PPAR) enables machines to understand human activities in videos without revealing sensitive visual content. Among the various strategies for PPAR, encryption-based methods achieve strong privacy protection while maintaining high recognition performance. However, these methods suffer a catastrophic decrease in recognition performance and visual quality when the encrypted videos are compressed; that is, they are not compression-friendly. To address these issues, in this paper, we propose the first compression-friendly encryption method for PPAR, called CFE-PPAR. In CFE-PPAR, videos encrypted with secret keys can be directly recognized by a video transformer whose parameters are transformed with the same keys as those used for video encryption. Experiments on the UCF101 and HMDB51 datasets verify that CFE-PPAR outperforms previous methods under Motion-JPEG and H.264 compression.
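The core idea of recognizing encrypted inputs with key-transformed model parameters can be illustrated with a minimal sketch. The code below is not CFE-PPAR itself; it assumes a simple linear patch-embedding layer and uses key-seeded pixel shuffling as a stand-in for the paper's encryption, showing that permuting a flattened patch with a secret key and permuting the embedding weights with the same key leaves the resulting embedding unchanged. All names and sizes here are illustrative assumptions.

```python
import numpy as np

SECRET_KEY = 42   # assumed secret key; any seed works
d, D = 48, 16     # flattened patch size (e.g. 4x4x3) and embedding dim (assumed)

# Key-derived permutation shared by encryption and parameter transformation
perm = np.random.default_rng(SECRET_KEY).permutation(d)

x = np.random.default_rng(0).standard_normal(d)       # one flattened patch
W = np.random.default_rng(1).standard_normal((D, d))  # patch-embedding weights

x_enc = x[perm]      # "encrypted" patch: pixel shuffling with the key
W_enc = W[:, perm]   # weights transformed with the same key

# The transformed model embeds the encrypted patch to the same vector
# that the original model produces for the plain patch.
assert np.allclose(W @ x, W_enc @ x_enc)
```

Because the permutation cancels inside the inner product, the transformed model can process encrypted patches directly, without ever decrypting them.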
Abstract: We propose a low-rank adaptation method for training privacy-preserving vision transformer (ViT) models that efficiently freezes pre-trained ViT model weights. In the proposed method, trainable rank decomposition matrices are injected into each layer of the ViT architecture, and moreover, the patch embedding layer is not frozen, unlike in conventional low-rank adaptation methods. The proposed method allows us not only to reduce the number of trainable parameters but also to maintain almost the same accuracy as that of full fine-tuning.
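The low-rank adaptation described above can be sketched for a single linear layer. This is an illustrative sketch, not the paper's implementation: it assumes a standard LoRA parameterization in which the pre-trained weight W0 is frozen and only the rank-r factors A and B are trained (with B initialized to zero so the update starts at zero); the dimensions and rank are assumed values typical of a ViT-Base layer.

```python
import numpy as np

d_in, d_out, r = 768, 768, 4  # ViT hidden size and LoRA rank (assumed)

rng = np.random.default_rng(0)
W0 = rng.standard_normal((d_out, d_in))  # frozen pre-trained weight
A = rng.standard_normal((r, d_in))       # trainable rank-decomposition factor
B = np.zeros((d_out, r))                 # trainable; zero-init => no change at start

def lora_forward(x):
    # Frozen path plus low-rank update; only A and B would receive gradients.
    return W0 @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
# At initialization the adapted layer matches the pre-trained layer exactly.
assert np.allclose(lora_forward(x), W0 @ x)

full_params = W0.size
lora_params = A.size + B.size
print(f"trainable params per layer: {lora_params} vs {full_params} "
      f"({lora_params / full_params:.1%})")
```

With rank 4 this trains roughly 1% of the layer's parameters; in the proposed method the patch embedding layer would additionally be left trainable, unlike in conventional LoRA setups.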