Abstract: In this thesis, we consider the problem of learning local feature descriptors for wide baseline stereo, focusing on the HardNet descriptor, which is close to the state of the art. We introduce the AMOS Patches dataset, which improves robustness to illumination and appearance changes. It is based on registered images from selected cameras of the AMOS dataset. We provide recommendations on the patch dataset creation process and evaluate HardNet trained on data of different modalities. We also introduce dataset combination and reduction methods that allow comparable performance on a significantly smaller dataset. HardNet8, which consistently outperforms the original HardNet, benefits from several architectural choices: the connectivity pattern, final pooling, receptive field, and CNN building blocks found by manual search or by an automatic search algorithm (DARTS). We show the impact of overlooked hyperparameters, such as batch size and length of training, on descriptor quality. PCA dimensionality reduction further boosts performance and also reduces the memory footprint. Finally, the insights gained lead to two HardNet8 descriptors: one performs well on a variety of benchmarks (HPatches, AMOS Patches and IMW Phototourism), and the other is optimized for IMW Phototourism.
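As an illustration of the PCA step mentioned above, the following is a minimal sketch of compressing descriptors with PCA. It is not the thesis implementation: the function names `fit_pca` and `compress`, the 128-to-64 dimensions, the random stand-in data, and the L2 re-normalization after projection are all assumptions made for the example.

```python
import numpy as np

def fit_pca(descriptors, out_dim):
    """Estimate the mean and the top `out_dim` principal directions."""
    mean = descriptors.mean(axis=0)
    centered = descriptors - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:out_dim]

def compress(descriptors, mean, components):
    """Project descriptors onto the PCA basis and re-normalize to unit length."""
    projected = (descriptors - mean) @ components.T
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    # Assumption: descriptors are compared by L2 distance, so keep unit norm.
    return projected / np.maximum(norms, 1e-12)

# Hypothetical stand-in for descriptors computed on a training set of patches.
rng = np.random.default_rng(0)
train = rng.normal(size=(1000, 128))
train /= np.linalg.norm(train, axis=1, keepdims=True)

mean, components = fit_pca(train, out_dim=64)
reduced = compress(train, mean, components)
```

In this sketch the projection is fitted once on training descriptors and then reused for new descriptors, so the stored vectors are half the original length.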
Abstract: We present AMOS Patches, a large set of image cut-outs intended primarily for making trainable local feature descriptors robust to illumination and appearance changes. The images contributing to AMOS Patches originate from the AMOS dataset of recordings from a large set of outdoor webcams. We describe the semiautomatic method used to generate AMOS Patches, which includes camera selection, viewpoint clustering and patch selection. For training, we provide both the registered full source images and the patches. We introduce a new descriptor trained on the AMOS Patches and 6Brown datasets. It achieves state-of-the-art results in matching under illumination changes on standard benchmarks.