Flowers, ENSTA ParisTech U2IS/RV
Abstract:Transfer learning is a machine learning technique that uses previously acquired knowledge from a source domain to enhance learning in a target domain by reusing learned weights. This technique is ubiquitous because of its great advantages in achieving high performance while saving training time, memory, and effort in network design. In this paper, we investigate how to select the best pre-trained model that meets the target domain requirements for image classification tasks. In our study, we refined the output layers and general network parameters to apply the knowledge of eleven image processing models, pre-trained on ImageNet, to five different target domain datasets. We measured the accuracy, accuracy density, training time, and model size to evaluate the pre-trained models both in training sessions in one episode and with ten episodes.




Abstract:We present a novel method to perform multi-class pattern classification with neural networks and test it on a challenging 3D hand gesture recognition problem. Our method consists of a standard one-against-all (OAA) classification, followed by another network layer classifying the resulting class scores, possibly augmented by the original raw input vector. This allows the network to disambiguate hard-to-separate classes as the distribution of class scores carries considerable information as well, and is in fact often used for assessing the confidence of a decision. We show that by this approach we are able to significantly boost our results, overall as well as for particular difficult cases, on the hard 10-class gesture classification task.




Abstract:We present a novel hierarchical approach to multi-class classification which is generic in that it can be applied to different classification models (e.g., support vector machines, perceptrons), and makes no explicit assumptions about the probabilistic structure of the problem as it is usually done in multi-class classification. By adding a cascade of additional classifiers, each of which receives the previous classifier's output in addition to regular input data, the approach harnesses unused information that manifests itself in the form of, e.g., correlations between predicted classes. Using multilayer perceptrons as a classification model, we demonstrate the validity of this approach by testing it on a complex ten-class 3D gesture recognition task.