Oliver Heimann

Physical Annotation for Automated Optical Inspection: A Concept for In-Situ, Pointer-Based Trainingdata Generation

Jun 05, 2025

Oliver Krumpek, Oliver Heimann, Jörg Krüger

Abstract: This paper introduces a novel physical annotation system designed to generate training data for automated optical inspection. The system uses pointer-based in-situ interaction to transfer the valuable expertise of trained inspection personnel directly into a machine learning (ML) training pipeline. Unlike conventional screen-based annotation methods, our system captures physical trajectories and contours directly on the object, providing a more intuitive and efficient way to label data. The core technology uses calibrated, tracked pointers to accurately record user input and transform these spatial interactions into standardised annotation formats that are compatible with open-source annotation software. Additionally, a simple projector-based interface projects visual guidance onto the object to assist users during the annotation process, ensuring greater accuracy and consistency. The proposed concept bridges the gap between human expertise and automated data generation, enabling non-IT experts to contribute to the ML training pipeline and preventing the loss of valuable training samples. Preliminary evaluation results confirm the feasibility of capturing detailed annotation trajectories and demonstrate that integration with CVAT streamlines the workflow for subsequent ML tasks. This paper details the system architecture, calibration procedures and interface design, and discusses its potential contribution to future ML data generation for automated optical inspection.

Vanishing Depth: A Depth Adapter with Positional Depth Encoding for Generalized Image Encoders

Mar 25, 2025

Paul Koch, Jörg Krüger, Ankit Chowdhury, Oliver Heimann

Abstract: Generalized metric depth understanding is critical for precise vision-guided robotics, which current state-of-the-art (SOTA) vision encoders do not support. To address this, we propose Vanishing Depth, a self-supervised training approach that extends pretrained RGB encoders to incorporate and align metric depth into their feature embeddings. Based on our novel positional depth encoding, we enable stable depth density and depth distribution invariant feature extraction. We achieve performance improvements and SOTA results across a spectrum of relevant RGBD downstream tasks - without the necessity of finetuning the encoder. Most notably, we achieve 56.05 mIoU on SUN-RGBD segmentation, 88.3 RMSE on Void's depth completion, and 83.8 Top 1 accuracy on NYUv2 scene classification. In 6D-object pose estimation, we outperform our predecessors DinoV2, EVA-02, and Omnivore and achieve SOTA results for non-finetuned encoders in several related RGBD downstream tasks.

* Preprint

On the Application of Egocentric Computer Vision to Industrial Scenarios

Jun 11, 2024

Vivek Chavan, Oliver Heimann, Jörg Krüger

Abstract: Egocentric vision aims to capture and analyse the world from the first-person perspective. We explore the possibilities for egocentric wearable devices to improve and enhance industrial use cases w.r.t. data collection, annotation, labelling and downstream applications. This would contribute to easier data collection and allow users to provide additional context. We envision that this approach could serve as a supplement to the traditional industrial Machine Vision workflow. Code, Dataset and related resources will be available at: https://github.com/Vivek9Chavan/EgoVis24

* To be presented at the First Joint Egocentric Vision (EgoVis) Workshop, held in conjunction with CVPR 2024
