Recent studies show that widely used deep neural networks (DNNs) are vulnerable to carefully crafted adversarial examples, it inevitably brings some security challenges. However, the attack characteristic of adversarial examples can be taken advantage to do privacy-preserving image research. In this paper, we make use of Reversible Image Transformation to construct reversible adversarial examples, which are still misclassified by DNNs that are utilized by illegal organizations to steal privacy of image content that we upload to the cloud or social platforms. Most importantly, the proposed method can recover original images from downloaded reversible adversarial examples with no distortion. The experimental results show that the attack success rate of the reversible adversarial examples obtained by this method can reach more than 95 % on MNIST and more than 60 % on ImageNet.
The video super-resolution (VSR) task aims to restore a high-resolution video frame by using its corresponding low-resolution frame and multiple neighboring frames. At present, many deep learning-based VSR methods rely on optical flow to perform frame alignment. The final recovery results will be greatly affected by the accuracy of optical flow. However, optical flow estimation cannot be completely accurate, and there are always some errors. In this paper, we propose a novel deformable non-local network (DNLN) which is non-flow-based. Specifically, we apply the improved deformable convolution in our alignment module to achieve adaptive frame alignment at the feature level. Furthermore, we utilize a non-local module to capture the global correlation between the reference frame and aligned neighboring frame, and simultaneously enhance desired fine details in the aligned frame. To reconstruct the final high-quality HR video frames, we use residual in residual dense blocks to take full advantage of the hierarchical features. Experimental results on several datasets demonstrate that the proposed DNLN can achieve state of the art performance on video super-resolution task.
As one of the most popular linear subspace learning methods, the Linear Discriminant Analysis (LDA) method has been widely studied in machine learning community and applied to many scientific applications. Traditional LDA minimizes the ratio of squared L2-norms, which is sensitive to outliers. In recent research, many L1-norm based robust Principle Component Analysis methods were proposed to improve the robustness to outliers. However, due to the difficulty of L1-norm ratio optimization, so far there is no existing work to utilize sparsity-inducing norms for LDA objective. In this paper, we propose a novel robust linear discriminant analysis method based on the L1,2-norm ratio minimization. Minimizing the L1,2-norm ratio is a much more challenging problem than the traditional methods, and there is no existing optimization algorithm to solve such non-smooth terms ratio problem. We derive a new efficient algorithm to solve this challenging problem, and provide a theoretical analysis on the convergence of our algorithm. The proposed algorithm is easy to implement, and converges fast in practice. Extensive experiments on both synthetic data and nine real benchmark data sets show the effectiveness of the proposed robust LDA method.
Deep neural networks (DNNs) have demonstrated their outstanding performance in many fields such as image classification and speech recognition. However, DNNs image classifiers are susceptible to interference from adversarial examples, which ultimately leads to incorrect classification output of neural network models. Based on this, this paper proposes a method based on War (WebPcompression and resize) to detect adversarial examples. The method takes WebP compression as the core, firstly performs WebP compression on the input image, and then appropriately resizes the compressed image, so that the label of the adversarial example changes, thereby detecting the existence of the adversarial image. The experimental results show that the proposed method can effectively resist IFGSM, DeepFool and C&W attacks, and the recognition accuracy is improved by more than 10% compared with the HGD method, the detection success rate of adversarial examples is 5% higher than that of the Feature Squeezing method. The method in this paper can effectively reduce the small noise disturbance in the adversarial image, and accurately detect the adversarial example according to the change of the samplelabelwhileensuringtheaccuracyoftheoriginalsampleidentification.
Principal Component Analysis (PCA) is one of the most important methods to handle high dimensional data. However, most of the studies on PCA aim to minimize the loss after projection, which usually measures the Euclidean distance, though in some fields, angle distance is known to be more important and critical for analysis. In this paper, we propose a method by adding constraints on factors to unify the Euclidean distance and angle distance. However, due to the nonconvexity of the objective and constraints, the optimized solution is not easy to obtain. We propose an alternating linearized minimization method to solve it with provable convergence rate and guarantee. Experiments on synthetic data and real-world datasets have validated the effectiveness of our method and demonstrated its advantages over state-of-art clustering methods.
We participated the Task 1: Lesion Segmentation. The paper describes our algorithm and the final result of validation set for the ISIC Challenge 2018 - Skin Lesion Analysis Towards Melanoma Detection.
Scene text detection is a challenging problem in computer vision. In this paper, we propose a novel text detection network based on prevalent object detection frameworks. In order to obtain stronger semantic feature, we adopt ResNet as feature extraction layers and exploit multi-level feature by combining hierarchical convolutional networks. A vertical proposal mechanism is utilized to avoid proposal classification, while regression layer remains working to improve localization accuracy. Our approach evaluated on ICDAR2013 dataset achieves F-measure of 0.91, which outperforms previous state-of-the-art results in scene text detection.
In this paper, we propose a novel method called Rotational Region CNN (R2CNN) for detecting arbitrary-oriented texts in natural scene images. The framework is based on Faster R-CNN [1] architecture. First, we use the Region Proposal Network (RPN) to generate axis-aligned bounding boxes that enclose the texts with different orientations. Second, for each axis-aligned text box proposed by RPN, we extract its pooled features with different pooled sizes and the concatenated features are used to simultaneously predict the text/non-text score, axis-aligned box and inclined minimum area box. At last, we use an inclined non-maximum suppression to get the detection results. Our approach achieves competitive results on text detection benchmarks: ICDAR 2015 and ICDAR 2013.
Recently, realistic image generation using deep neural networks has become a hot topic in machine learning and computer vision. Images can be generated at the pixel level by learning from a large collection of images. Learning to generate colorful cartoon images from black-and-white sketches is not only an interesting research problem, but also a potential application in digital entertainment. In this paper, we investigate the sketch-to-image synthesis problem by using conditional generative adversarial networks (cGAN). We propose the auto-painter model which can automatically generate compatible colors for a sketch. The new model is not only capable of painting hand-draw sketch with proper colors, but also allowing users to indicate preferred colors. Experimental results on two sketch datasets show that the auto-painter performs better that existing image-to-image methods.