Filtering out unrealistic images from trained generative adversarial networks (GANs) has attracted considerable attention recently. Two density ratio based subsampling methods---Discriminator Rejection Sampling (DRS) and Metropolis-Hastings GAN (MH-GAN)---were recently proposed, and their effectiveness in improving GANs was demonstrated on multiple datasets. However, DRS and MH-GAN are based on discriminator based density ratio estimation (DRE) methods, so they may not work well if the discriminator in the trained GAN is far from optimal. Moreover, they do not apply to some GANs (e.g., MMD-GAN). In this paper, we propose a novel Softplus (SP) loss for DRE. Based on it, we develop a sample-based DRE method in a feature space learned by a specially designed and pre-trained ResNet-34 (DRE-F-SP). We derive the rate of convergence of a density ratio model trained under the SP loss. Then, we propose three different density ratio subsampling methods (DRE-F-SP+RS, DRE-F-SP+MH, and DRE-F-SP+SIR) for GANs based on DRE-F-SP. Our subsampling methods do not rely on the optimality of the discriminator and are suitable for all types of GANs. We empirically show our subsampling approach can substantially outperform DRS and MH-GAN on a synthetic dataset and the CIFAR-10 dataset, using multiple GANs.
In this letter, as a proof of concept, we propose a deep learning-based approach to attack the chaos-based image encryption algorithm in \cite{guan2005chaos}. The proposed method first projects the chaos-based encrypted images into the low-dimensional feature space, where essential information of plain images has been largely preserved. With the low-dimensional features, a deconvolutional generator is utilized to regenerate perceptually similar decrypted images to approximate the plain images in the high-dimensional space. Compared with conventional image encryption attack algorithms, the proposed method does not require to manually analyze and infer keys in a time-consuming way. Instead, we directly attack the chaos-based encryption algorithms in a key-independent manner. Moreover, the proposed method can be trained end-to-end. Given the chaos-based encrypted images, a well-trained decryption model is able to automatically reconstruct plain images with high fidelity. In the experiments, we successfully attack the chaos-based algorithm \cite{guan2005chaos} and the decrypted images are visually similar to their ground truth plain images. Experimental results on both static-key and dynamic-key scenarios verify the efficacy of the proposed method.
Objective: The aim of this study is to develop an efficient and reliable epileptic seizure prediction system using intracranial EEG (iEEG) data, especially for people with drug-resistant epilepsy. The prediction procedure should yield accurate results in a fast enough fashion to alert patients of impending seizures. Methods: We quantitatively analyze the human iEEG data to obtain insights into how the human brain behaves before and between epileptic seizures. We then introduce an efficient pre-processing method for reducing the data size and converting the time-series iEEG data into an image-like format that can be used as inputs to convolutional neural networks (CNNs). Further, we propose a seizure prediction algorithm that uses cooperative multi-scale CNNs for automatic feature learning of iEEG data. Results: 1) iEEG channels contain complementary information and excluding individual channels is not advisable to retain the spatial information needed for accurate prediction of epileptic seizures. 2) The traditional PCA is not a reliable method for iEEG data reduction in seizure prediction. 3) Hand-crafted iEEG features may not be suitable for reliable seizure prediction performance as the iEEG data varies between patients and over time for the same patient. 4) Seizure prediction results show that our algorithm outperforms existing methods by achieving an average sensitivity of 87.85% and AUC score of 0.84. Conclusion: Understanding how the human brain behaves before seizure attacks and far from them facilitates better designs of epileptic seizure predictors. Significance: Accurate seizure prediction algorithms can warn patients about the next seizure attack so they could avoid dangerous activities. Medications could then be administered to abort the impending seizure and minimize the risk of injury.
With the widespread applications of deep convolutional neural networks (DCNNs), it becomes increasingly important for DCNNs not only to make accurate predictions but also to explain how they make their decisions. In this work, we propose a CHannel-wise disentangled InterPretation (CHIP) model to give the visual interpretation to the predictions of DCNNs. The proposed model distills the class-discriminative importance of channels in networks by utilizing the sparse regularization. Here, we first introduce the network perturbation technique to learn the model. The proposed model is capable to not only distill the global perspective knowledge from networks but also present the class-discriminative visual interpretation for specific predictions of networks. It is noteworthy that the proposed model is able to interpret different layers of networks without re-training. By combining the distilled interpretation knowledge in different layers, we further propose the Refined CHIP visual interpretation that is both high-resolution and class-discriminative. Experimental results on the standard dataset demonstrate that the proposed model provides promising visual interpretation for the predictions of networks in image classification task compared with existing visual interpretation methods. Besides, the proposed method outperforms related approaches in the application of ILSVRC 2015 weakly-supervised localization task.
Previous transfer learning methods based on deep network assume the knowledge should be transferred between the same hidden layers of the source domain and the target domains. This assumption doesn't always hold true, especially when the data from the two domains are heterogeneous with different resolutions. In such case, the most suitable numbers of layers for the source domain data and the target domain data would differ. As a result, the high level knowledge from the source domain would be transferred to the wrong layer of target domain. Based on this observation, "where to transfer" proposed in this paper should be a novel research frontier. We propose a new mathematic model named DT-LET to solve this heterogeneous transfer learning problem. In order to select the best matching of layers to transfer knowledge, we define specific loss function to estimate the corresponding relationship between high-level features of data in the source domain and the target domain. To verify this proposed cross-layer model, experiments for two cross-domain recognition/classification tasks are conducted, and the achieved superior results demonstrate the necessity of layer correspondence searching.
In this paper, we present a joint compression and classification approach of EEG and EMG signals using a deep learning approach. Specifically, we build our system based on the deep autoencoder architecture which is designed not only to extract discriminant features in the multimodal data representation but also to reconstruct the data from the latent representation using encoder-decoder layers. Since autoencoder can be seen as a compression approach, we extend it to handle multimodal data at the encoder layer, reconstructed and retrieved at the decoder layer. We show through experimental results, that exploiting both multimodal data intercorellation and intracorellation 1) Significantly reduces signal distortion particularly for high compression levels 2) Achieves better accuracy in classifying EEG and EMG signals recorded and labeled according to the sentiments of the volunteer.
In this paper, we present a Convolutional Neural Network (CNN) regression approach for real-time 2-D/3-D registration. Different from optimization-based methods, which iteratively optimize the transformation parameters over a scalar-valued metric function representing the quality of the registration, the proposed method exploits the information embedded in the appearances of the Digitally Reconstructed Radiograph and X-ray images, and employs CNN regressors to directly estimate the transformation parameters. The CNN regressors are trained for local zones and applied in a hierarchical manner to break down the complex regression task into simpler sub-tasks that can be learned separately. Our experiment results demonstrate the advantage of the proposed method in computational efficiency with negligible degradation of registration accuracy compared to intensity-based methods.
Digital images nowadays have various styles of appearance, in the aspects of color tones, contrast, vignetting, and etc. These 'picture styles' are directly related to the scene radiance, image pipeline of the camera, and post processing functions. Due to the complexity and nonlinearity of these causes, popular gradient-based image descriptors won't be invariant to different picture styles, which will decline the performance of object recognition. Given that images shared online or created by individual users are taken with a wide range of devices and may be processed by various post processing functions, to find a robust object recognition system is useful and challenging. In this paper, we present the first study on the influence of picture styles for object recognition, and propose an adaptive approach based on the kernel view of gradient descriptors and multiple kernel learning, without estimating or specifying the styles of images used in training and testing. We conduct experiments on Domain Adaptation data set and Oxford Flower data set. The experiments also include several variants of the flower data set by processing the images with popular photo effects. The results demonstrate that our proposed method improve from standard descriptors in all cases.