CMM, PSL, STIM
Abstract: Semantic segmentation and hyperspectral unmixing are two central problems in spectral image analysis. The former assigns each pixel a discrete label corresponding to its material class, whereas the latter estimates pure material spectra, called endmembers, and, for each pixel, a vector representing material abundances in the observed scene. Despite their complementarity, these two problems are usually addressed independently. This paper aims to bridge these two lines of work by formally showing that, under the linear mixing model, pixel classification by dominant materials induces polyhedral-cone regions in the spectral space. We leverage this fundamental property to propose a direct segmentation-to-unmixing pipeline that performs blind hyperspectral unmixing from any semantic segmentation by constructing a polyhedral-cone partition of the space that best fits the labeled pixels. Signed distances from pixels to the estimated regions are then computed, linearly transformed via a change of basis in the distance space, and projected onto the probability simplex, yielding an initial abundance estimate. This estimate is used to extract endmembers and recover final abundances via matrix pseudo-inversion. Because the segmentation method can be freely chosen, the user gains explicit control over the unmixing process, while the rest of the pipeline remains essentially deterministic and lightweight. Beyond improving interpretability, experiments on three real datasets demonstrate the effectiveness of the proposed approach when associated with appropriate clustering algorithms, and show consistent improvements over recent deep and non-deep state-of-the-art methods. The code is available at: https://github.com/antoine-bottenmuller/polyhedral-unmixing
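The simplex-projection step mentioned in the abstract above can be illustrated with a minimal sketch. It uses the standard sort-based Euclidean projection onto the probability simplex, which is an assumption here: the abstract does not say which projection algorithm the authors use, and the function name `project_to_simplex` and the input scores are hypothetical, not taken from the linked repository.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}.

    Generic sort-based algorithm; assumed for illustration only,
    not the authors' implementation.
    """
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        # The condition holds for a prefix of the sorted values;
        # the last k that satisfies it gives the final threshold.
        if u_k - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


# Hypothetical per-pixel scores for three materials, standing in for
# the linearly transformed signed distances described in the abstract.
scores = [1.4, 0.3, -0.2]
abundances = project_to_simplex(scores)
print(abundances, sum(abundances))
```

The result is non-negative and sums to one, which is the constraint an abundance vector must satisfy under the linear mixing model.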
Abstract: Deep neural networks show great potential for automating various visual quality inspection tasks in manufacturing. However, their applicability is limited in more volatile scenarios, such as remanufacturing, where the inspected products and defect patterns often change. In such settings, deployed models require frequent adaptation to novel conditions, effectively posing a continual learning problem. To enable quick adaptation, the necessary training processes must be computationally efficient while still avoiding effects such as catastrophic forgetting. This work presents a multi-level feature fusion (MLFF) approach that aims to improve both aspects simultaneously by utilizing representations from different depths of a pretrained network. We show that our approach matches the performance of end-to-end training on different quality inspection problems while using significantly fewer trainable parameters. Furthermore, it reduces catastrophic forgetting and improves robustness when generalizing to new product types or defects.
Abstract: Remanufacturing is a process in which worn products are restored to like-new condition, and it offers vast ecological and economic potential. A key step is the quality inspection of disassembled components, which is mostly done manually due to the high variety of parts and defect patterns. Deep neural networks show great potential for automating such visual inspection tasks but struggle to generalize to new product variants, components, or defect patterns. To tackle this challenge, we introduce a novel image dataset depicting typical gearbox components in good and defective condition from two automotive transmissions. Depending on the train-test split of the data, different distribution shifts are generated to benchmark the generalization ability of a classification model. We evaluate different models on the dataset and propose a contrastive regularization loss to enhance model robustness. The results show that the loss improves generalization to unseen types of components.




Abstract: The accuracy of deep convolutional neural networks is heavily affected by rotations of the input data. In this paper, we propose a convolutional predictor that is invariant to rotations of the input. This architecture is capable of predicting the angular orientation without angle-annotated data. Furthermore, the predictor continuously maps the random rotation of the input to a circular space of the prediction. For this purpose, we use the roto-translation properties of Scattering Transform Networks together with a series of 3D convolutions. We validate the results by training with upright and randomly rotated samples. This enables further applications of this work in fields such as the automatic re-orientation of randomly oriented datasets.
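To illustrate why a circular prediction space matters for orientation, the sketch below encodes an angle as a point on the unit circle and measures errors with wrap-around. This is a generic illustration, not the paper's architecture: the abstract does not specify the encoding, and the function names `angle_to_circle`, `circle_to_angle`, and `angular_error` are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def angle_to_circle(theta):
    """Encode an angle (radians) as a point on the unit circle."""
    return (math.cos(theta), math.sin(theta))


def circle_to_angle(point):
    """Decode a 2D point back to an angle in [0, 2*pi)."""
    x, y = point
    return math.atan2(y, x) % TWO_PI


def angular_error(a, b):
    """Smallest absolute difference between two angles, with wrap-around."""
    d = (a - b) % TWO_PI
    return min(d, TWO_PI - d)


# Two orientations on either side of the 0 / 2*pi boundary.
a, b = 0.01, TWO_PI - 0.01
print(abs(a - b))           # large when the angles are treated as plain scalars
print(angular_error(a, b))  # small once wrap-around is taken into account
print(math.dist(angle_to_circle(a), angle_to_circle(b)))  # also small
```

On the circle, nearly identical orientations stay close together, whereas a scalar angle output has a discontinuity at the 0 / 2π boundary.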