Magnetic resonance imaging (MRI) is a powerful imaging modality that revolutionizes medicine and biology. The imaging speed of high-dimensional MRI is often limited, which constrains its practical utility. Recently, low-rank tensor models have been exploited to enable fast MR imaging with sparse sampling. Most existing methods use some pre-defined sampling design, and active sensing has not been explored for low-rank tensor imaging. In this paper, we introduce an active low-rank tensor model for fast MR imaging.We propose an active sampling method based on a Query-by-Committee model, making use of the benefits of low-rank tensor structure. Numerical experiments on a 3-D MRI data set demonstrate the effectiveness of the proposed method.
Deep neural networks (DNNs) are inherently vulnerable to well-designed input samples called adversarial examples. The adversary can easily fool DNNs by adding slight perturbations to the input. In this paper, we propose a novel black-box adversarial example attack named GreedyFool, which synthesizes adversarial examples based on the differential evolution and the greedy approximation. The differential evolution is utilized to evaluate the effects of perturbed pixels on the confidence of the DNNs-based classifier. The greedy approximation is an approximate optimization algorithm to automatically get adversarial perturbations. Existing works synthesize the adversarial examples by leveraging simple metrics to penalize the perturbations, which lack sufficient consideration of the human visual system (HVS), resulting in noticeable artifacts. In order to sufficient imperceptibility, we launch a lot of investigations into the HVS and design an integrated metric considering just noticeable distortion (JND), Weber-Fechner law, texture masking and channel modulation, which is proven to be a better metric to measure the perceptual distance between the benign examples and the adversarial ones. The experimental results demonstrate that the GreedyFool has several remarkable properties including black-box, 100% success rate, flexibility, automation and can synthesize the more imperceptible adversarial examples than the state-of-the-art pixel-wise methods.
Recent approaches have achieved great success in image generation from structured inputs, e.g., semantic segmentation, scene graph or layout. Although these methods allow specification of objects and their locations at image-level, they lack the fidelity and semantic control to specify visual appearance of these objects at an instance-level. To address this limitation, we propose a new image generation method that enables instance-level attribute control. Specifically, the input to our attribute-guided generative model is a tuple that contains: (1) object bounding boxes, (2) object categories and (3) an (optional) set of attributes for each object. The output is a generated image where the requested objects are in the desired locations and have prescribed attributes. Several losses work collaboratively to encourage accurate, consistent and diverse image generation. Experiments on Visual Genome dataset demonstrate our model's capacity to control object-level attributes in generated images, and validate plausibility of disentangled object-attribute representation in the image generation from layout task. Also, the generated images from our model have higher resolution, object classification accuracy and consistency, as compared to the previous state-of-the-art.
Few-shot Learning (FSL) which aims to learn from few labeled training data is becoming a popular research topic, due to the expensive labeling cost in many real-world applications. One kind of successful FSL method learns to compare the testing (query) image and training (support) image by simply concatenating the features of two images and feeding it into the neural network. However, with few labeled data in each class, the neural network has difficulty in learning or comparing the local features of two images. Such simple image-level comparison may cause serious mis-classification. To solve this problem, we propose Augmented Bi-path Network (ABNet) for learning to compare both global and local features on multi-scales. Specifically, the salient patches are extracted and embedded as the local features for every image. Then, the model learns to augment the features for better robustness. Finally, the model learns to compare global and local features separately, i.e., in two paths, before merging the similarities. Extensive experiments show that the proposed ABNet outperforms the state-of-the-art methods. Both quantitative and visual ablation studies are provided to verify that the proposed modules lead to more precise comparison results.
Latent user representations are widely adopted in the tech industry for powering personalized recommender systems. Most prior work infers a single high dimensional embedding to represent a user, which is a good starting point but falls short in delivering a full understanding of the user's interests. In this work, we introduce PinnerSage, an end-to-end recommender system that represents each user via multi-modal embeddings and leverages this rich representation of users to provides high quality personalized recommendations. PinnerSage achieves this by clustering users' actions into conceptually coherent clusters with the help of a hierarchical clustering method (Ward) and summarizes the clusters via representative pins (Medoids) for efficiency and interpretability. PinnerSage is deployed in production at Pinterest and we outline the several design decisions that makes it run seamlessly at a very large scale. We conduct several offline and online A/B experiments to show that our method significantly outperforms single embedding methods.
With the explosion of digital data in recent years, continuously learning new tasks from a stream of data without forgetting previously acquired knowledge has become increasingly important. In this paper, we propose a new continual learning (CL) setting, namely ``continual representation learning'', which focuses on learning better representation in a continuous way. We also provide two large-scale multi-step benchmarks for biometric identification, where the visual appearance of different classes are highly relevant. In contrast to requiring the model to recognize more learned classes, we aim to learn feature representation that can be better generalized to not only previously unseen images but also unseen classes/identities. For the new setting, we propose a novel approach that performs the knowledge distillation over a large number of identities by applying the neighbourhood selection and consistency relaxation strategies to improve scalability and flexibility of the continual learning model. We demonstrate that existing CL methods can improve the representation in the new setting, and our method achieves better results than the competitors.
prevention of stroke with its associated risk factors has been one of the public health priorities worldwide. Emerging artificial intelligence technology is being increasingly adopted to predict stroke. Because of privacy concerns, patient data are stored in distributed electronic health record (EHR) databases, voluminous clinical datasets, which prevent patient data from being aggregated and restrains AI technology to boost the accuracy of stroke prediction with centralized training data. In this work, our scientists and engineers propose a privacy-preserving scheme to predict the risk of stroke and deploy our federated prediction model on cloud servers. Our system of federated prediction model asynchronously supports any number of client connections and arbitrary local gradient iterations in each communication round. It adopts federated averaging during the model training process, without patient data being taken out of the hospitals during the whole process of model training and forecasting. With the privacy-preserving mechanism, our federated prediction model trains over all the healthcare data from hospitals in a certain city without actual data sharing among them. Therefore, it is not only secure but also more accurate than any single prediction model that trains over the data only from one single hospital. Especially for small hospitals with few confirmed stroke cases, our federated model boosts model performance by 10%~20% in several machine learning metrics. To help stroke experts comprehend the advantage of our prediction system more intuitively, we developed a mobile app that collects the key information of patients' statistics and demonstrates performance comparisons between the federated prediction model and the single prediction model during the federated training process.
Efficient training of deep neural networks is an increasingly important problem in the era of sophisticated architectures and large-scale datasets. This paper proposes a training set synthesis technique, called Dataset Condensation, that learns to produce a small set of informative samples for training deep neural networks from scratch in a small fraction of the required computational cost on the original data while achieving comparable results. We rigorously evaluate its performance in several computer vision benchmarks and show that it significantly outperforms the state-of-the-art methods. Finally we show promising applications of our method in continual learning and domain adaptation.