We introduce a new representation learning approach for domain adaptation, in which data at training and test time come from similar but different distributions. Our approach is directly inspired by the theory on domain adaptation suggesting that, for effective domain transfer to be achieved, predictions must be made based on features that cannot discriminate between the training (source) and test (target) domains. The approach implements this idea in the context of neural network architectures that are trained on labeled data from the source domain and unlabeled data from the target domain (no labeled target-domain data is necessary). As the training progresses, the approach promotes the emergence of features that are (i) discriminative for the main learning task on the source domain and (ii) indiscriminate with respect to the shift between the domains. We show that this adaptation behaviour can be achieved in almost any feed-forward model by augmenting it with a few standard layers and a new gradient reversal layer. The resulting augmented architecture can be trained using standard backpropagation and stochastic gradient descent, and can thus be implemented with little effort using any of the deep learning packages. We demonstrate the success of our approach for two distinct classification problems (document sentiment analysis and image classification), where state-of-the-art domain adaptation performance on standard benchmarks is achieved. We also validate the approach for a descriptor learning task in the context of person re-identification.
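As an illustration of the mechanism, here is a minimal sketch of a gradient reversal layer, assuming PyTorch as the framework (the names GradReverse and grad_reverse are illustrative, not part of the paper):

```python
from torch.autograd import Function

class GradReverse(Function):
    """Acts as the identity in the forward pass; multiplies the
    gradient by -lambd in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed (and scaled) gradient flows back into the feature
        # extractor; no gradient is needed for the lambd argument.
        return grad_output.neg() * ctx.lambd, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)
```

In such a setup, features would pass through grad_reverse before entering the domain classifier, so that minimizing the domain loss for the classifier simultaneously maximizes it for the feature extractor.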
Gatys et al. recently demonstrated that deep networks can generate beautiful textures and stylized images from a single texture example. However, their method requires a slow and memory-consuming optimization process. We propose here an alternative approach that moves the computational burden to a learning stage. Given a single example of a texture, our approach trains compact feed-forward convolutional networks to generate multiple samples of the same texture of arbitrary size and to transfer artistic style from a given image to any other image. The resulting networks are remarkably light-weight and can generate textures of quality comparable to Gatys et al., but hundreds of times faster. More generally, our approach highlights the power and flexibility of generative feed-forward models trained with complex and expressive loss functions.
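The "complex and expressive loss functions" here are the Gram-matrix descriptor losses of Gatys et al.; below is a minimal sketch of such a texture loss in PyTorch (an assumed framework; the function names are illustrative):

```python
import torch

def gram_matrix(features):
    """Channel-to-channel correlations of a conv activation map of
    shape (batch, channels, height, width)."""
    b, c, h, w = features.size()
    f = features.view(b, c, h * w)
    # Normalize by the number of spatial positions so the loss scale
    # does not depend on image size.
    return torch.bmm(f, f.transpose(1, 2)) / (h * w)

def texture_loss(generated_feats, texture_feats):
    """Squared distance between Gram matrices at one network layer;
    the full loss sums such terms over several layers."""
    return torch.mean((gram_matrix(generated_feats)
                       - gram_matrix(texture_feats)) ** 2)
```

The generator network is then trained by backpropagating such losses, computed on a fixed pre-trained descriptor network, through to the generator weights.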
We revisit the idea of brain damage, i.e. the pruning of the coefficients of a neural network, and suggest how brain damage can be modified and used to speed up convolutional layers. The approach uses the fact that many efficient implementations reduce generalized convolutions to matrix multiplications. The suggested brain damage process prunes the convolutional kernel tensor in a group-wise fashion by adding group-sparsity regularization to the standard training process. After such group-wise pruning, convolutions can be reduced to multiplications of thinned dense matrices, which leads to a speedup. In a comparison on AlexNet, the method achieves very competitive performance.
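A minimal sketch of such a group-sparsity regularizer in PyTorch (an assumed framework; the grouping convention follows the matrix-multiplication view described above, and the function name is illustrative):

```python
def group_lasso_penalty(conv_weight):
    """2,1-norm group-sparsity penalty on a 4D conv kernel tensor of
    shape (out_channels, in_channels, kernel_h, kernel_w).

    Each group gathers the weights of all output channels at a fixed
    (input channel, kernel row, kernel column) triple; driving a whole
    group to zero removes one column of the filter matrix and the
    corresponding row of the unrolled patch matrix, thinning the
    matrix multiplication."""
    # L2 norm over the output-channel axis, summed over all groups.
    return conv_weight.pow(2).sum(dim=0).sqrt().sum()
```

During training, this penalty (times a regularization weight) would simply be added to the task loss.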
Several recent works have shown that image descriptors produced by deep convolutional neural networks provide state-of-the-art performance for image classification and retrieval problems. It has also been shown that the activations from the convolutional layers can be interpreted as local features describing particular image regions. These local features can be aggregated using approaches originally developed for hand-engineered local features (e.g. Fisher vectors), thus providing new powerful global descriptors. In this paper we investigate possible ways to aggregate local deep features to produce compact global descriptors for image retrieval. First, we show that deep features and traditional hand-engineered features have quite different distributions of pairwise similarities, hence existing aggregation methods have to be carefully re-evaluated. Such re-evaluation reveals that, in contrast to shallow features, the simple aggregation method based on sum pooling provides arguably the best performance for deep convolutional features. This method is efficient, has few parameters, and bears little risk of overfitting when, e.g., the PCA matrix is learned. Overall, the new compact global descriptor considerably improves the state of the art on four common benchmarks.
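A minimal sketch of the sum-pooling aggregation (in numpy, under the assumption that activations arrive as a (channels, height, width) array; names are illustrative):

```python
import numpy as np

def sum_pooled_descriptor(conv_activations, pca_matrix=None):
    """Aggregate the local deep features of one image by sum pooling
    over all spatial positions, then L2-normalize; optionally compress
    with a pre-learned PCA projection."""
    d = conv_activations.sum(axis=(1, 2))        # sum over positions
    d /= np.linalg.norm(d) + 1e-12               # L2 normalization
    if pca_matrix is not None:
        d = pca_matrix @ d                       # PCA compression
        d /= np.linalg.norm(d) + 1e-12
    return d
```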
We propose a simple two-step approach for speeding up convolution layers within large convolutional neural networks based on tensor decomposition and discriminative fine-tuning. Given a layer, we use non-linear least squares to compute a low-rank CP-decomposition of the 4D convolution kernel tensor into a sum of a small number of rank-one tensors. In the second step, this decomposition is used to replace the original convolutional layer with a sequence of four convolutional layers with small kernels. After such replacement, the entire network is fine-tuned on the training data using standard backpropagation. We evaluate this approach on two CNNs and show that it is competitive with previous approaches, achieving higher CPU speedups together with smaller accuracy drops for the smaller of the two networks. Thus, for the 36-class character classification CNN, our approach obtains an 8.5x CPU speedup of the whole network with only a minor accuracy drop (1%, from 91% to 90%). For the standard ImageNet architecture (AlexNet), the approach speeds up the second convolution layer by a factor of 4 at the cost of a 1% increase in the overall top-5 classification error.
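A minimal sketch of the resulting layer structure in PyTorch (an assumed framework): given rank-R CP factors of the original (T, S, d, d) kernel tensor, the layer becomes a 1x1 convolution from S to R channels, a (d x 1) and a (1 x d) per-channel convolution, and a final 1x1 convolution from R to T channels. The function below only builds the structure; loading the CP factors into the four weight tensors, and the fine-tuning, are omitted:

```python
import torch.nn as nn

def cp_decomposed_conv(in_ch, out_ch, kernel_size, rank, stride=1, pad=0):
    """Sequence of four small convolutions replacing one
    (out_ch, in_ch, kernel_size, kernel_size) convolutional layer,
    following the CP structure; `rank` is the CP rank R."""
    return nn.Sequential(
        nn.Conv2d(in_ch, rank, kernel_size=1, bias=False),
        nn.Conv2d(rank, rank, kernel_size=(kernel_size, 1),
                  stride=(stride, 1), padding=(pad, 0),
                  groups=rank, bias=False),             # vertical 1D conv
        nn.Conv2d(rank, rank, kernel_size=(1, kernel_size),
                  stride=(1, stride), padding=(0, pad),
                  groups=rank, bias=False),             # horizontal 1D conv
        nn.Conv2d(rank, out_ch, kernel_size=1),         # back to out_ch
    )
```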
Top-performing deep architectures are trained on massive amounts of labeled data. In the absence of labeled data for a certain task, domain adaptation often provides an attractive option given that labeled data of similar nature but from a different domain (e.g. synthetic images) are available. Here, we propose a new approach to domain adaptation in deep architectures that can be trained on a large amount of labeled data from the source domain and a large amount of unlabeled data from the target domain (no labeled target-domain data is necessary). As the training progresses, the approach promotes the emergence of "deep" features that are (i) discriminative for the main learning task on the source domain and (ii) invariant with respect to the shift between the domains. We show that this adaptation behaviour can be achieved in almost any feed-forward model by augmenting it with a few standard layers and a simple new gradient reversal layer. The resulting augmented architecture can be trained using standard backpropagation. Overall, the approach can be implemented with little effort using any of the deep-learning packages. The method performs very well in a series of image classification experiments, achieving an adaptation effect in the presence of large domain shifts and outperforming the previous state of the art on the Office datasets.
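Schematically, training seeks a saddle point of a single objective; in a simplified form (G_f, G_y, G_d denote the feature extractor, label predictor, and domain classifier with parameters $\theta_f, \theta_y, \theta_d$; $L_y$ is the label loss on source samples, $L_d$ the domain-classification loss on samples $x_j$ with domain labels $d_j$ from both domains, and $\lambda$ the trade-off weight):

$$ E(\theta_f, \theta_y, \theta_d) = \sum_i L_y\big(G_y(G_f(x_i; \theta_f); \theta_y),\, y_i\big) \;-\; \lambda \sum_j L_d\big(G_d(G_f(x_j; \theta_f); \theta_d),\, d_j\big), $$

$$ (\hat\theta_f, \hat\theta_y) = \arg\min_{\theta_f, \theta_y} E(\theta_f, \theta_y, \hat\theta_d), \qquad \hat\theta_d = \arg\max_{\theta_d} E(\hat\theta_f, \hat\theta_y, \theta_d). $$

The gradient reversal layer lets plain stochastic gradient descent on the augmented architecture approximate this min-max optimization.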
It has been shown that the activations invoked by an image within the top layers of a large convolutional neural network provide a high-level descriptor of the visual content of the image. In this paper, we investigate the use of such descriptors (neural codes) for image retrieval. In experiments with several standard retrieval benchmarks, we establish that neural codes perform competitively even when the convolutional neural network has been trained for an unrelated classification task (e.g. ImageNet). We also evaluate the improvement in the retrieval performance of neural codes when the network is retrained on a dataset of images that are similar to the images encountered at test time. We further evaluate the performance of compressed neural codes and show that a simple PCA compression provides very good short codes that give state-of-the-art accuracy on a number of datasets. In general, neural codes turn out to be much more resilient to such compression in comparison to other state-of-the-art descriptors. Finally, we show that discriminative dimensionality reduction trained on a dataset of pairs of matched photographs improves the performance of PCA-compressed neural codes even further. Overall, our quantitative experiments demonstrate the promise of neural codes as visual descriptors for image retrieval.
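A minimal numpy sketch of the PCA compression step (illustrative names; the discriminative variant learned on matched photograph pairs is not shown):

```python
import numpy as np

def pca_compress(codes, dim):
    """Learn a PCA projection on a set of neural codes and return the
    projection together with the compressed, re-normalized codes.

    codes: (n_images, code_dim) array of L2-normalized descriptors."""
    mean = codes.mean(axis=0)
    centered = codes - mean
    # Principal directions via SVD of the centered data matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = vt[:dim]                               # top `dim` directions
    short_codes = centered @ proj.T
    short_codes /= np.linalg.norm(short_codes, axis=1, keepdims=True) + 1e-12
    return proj, short_codes
```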
We propose a new architecture for difficult image processing operations, such as natural edge detection or thin object segmentation. The architecture is based on a simple combination of convolutional neural networks with nearest-neighbor search. We focus our attention on situations where the desired image transformation is too hard for a neural network to learn explicitly. We show that in such situations, the use of nearest-neighbor search on top of the network output improves the results considerably and compensates for the underfitting effect during neural network training. The approach is validated on three challenging benchmarks, where the performance of the proposed architecture matches or exceeds the state of the art.
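To illustrate the idea (not the exact architecture), here is a hypothetical sketch in Python: the network maps an input patch to a low-dimensional code, and the prediction is the ground-truth annotation patch of the nearest training code, retrieved from a precomputed index:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical training data: `train_codes` are network outputs for
# training patches, `train_annotations` the matching ground-truth
# annotation patches, flattened to vectors.
train_codes = np.random.randn(10000, 16)
train_annotations = np.random.rand(10000, 16 * 16)

index = NearestNeighbors(n_neighbors=1).fit(train_codes)

def predict_annotation(net_output, patch_size=16):
    """Replace the raw network output with the annotation of its nearest
    training neighbor, correcting for the network's underfitting."""
    _, idx = index.kneighbors(net_output.reshape(1, -1))
    return train_annotations[idx[0, 0]].reshape(patch_size, patch_size)
```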
The top-performing systems for billion-scale high-dimensional approximate nearest neighbor (ANN) search are all based on two-layer architectures that include an indexing structure and a compressed-datapoints layer. The indexing structure is crucial as it makes it possible to avoid exhaustive search, while the lossy data compression is needed to fit the dataset into RAM. Several of the most successful systems use product quantization (PQ) for both the indexing and the dataset compression layers. These systems are, however, limited in the way they exploit the interaction between the product quantization processes that happen at different stages. Here we introduce and evaluate two approximate nearest neighbor search systems that both exploit the synergy of product quantization processes in a more efficient way. The first system, called Fast Bilayer Product Quantization (FBPQ), speeds up the baseline system (Multi-D-ADC) several times while achieving the same accuracy. The second system, Hierarchical Bilayer Product Quantization (HBPQ), provides a significantly better recall for the same runtime at the cost of a small increase in memory footprint. For the BIGANN dataset of one billion SIFT descriptors, we observe a 10% increase in Recall@1 and a 17% increase in Recall@10.
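For context, a minimal sketch of the product quantization primitive that both layers build on (numpy/scikit-learn; an illustration, not the optimized implementation from the paper):

```python
import numpy as np
from sklearn.cluster import KMeans

def train_pq(data, m=8, k=256):
    """Split vectors into m subvectors and learn a k-word codebook per
    subspace (k=256 gives one byte per subvector)."""
    d = data.shape[1] // m
    return [KMeans(n_clusters=k, n_init=1).fit(data[:, i*d:(i+1)*d])
            for i in range(m)]

def pq_encode(codebooks, x):
    """Compress one vector into m one-byte codes."""
    d = len(x) // len(codebooks)
    return np.array([cb.predict(x[i*d:(i+1)*d].reshape(1, -1))[0]
                     for i, cb in enumerate(codebooks)], dtype=np.uint8)

def adc_distance(codebooks, query, code):
    """Asymmetric distance: the exact query against a quantized point."""
    d = len(query) // len(codebooks)
    return sum(np.sum((query[i*d:(i+1)*d]
                       - cb.cluster_centers_[code[i]]) ** 2)
               for i, cb in enumerate(codebooks))
```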