Object Detection (OD) has proven to be a significant computer vision method in extracting localized class information and has multiple applications in the industry. Although many of the state-of-the-art (SOTA) OD models perform well on medium and large sized objects, they seem to under perform on small objects. In most of the industrial use cases, it is difficult to collect and annotate data for small objects, as it is time-consuming and prone to human errors. Additionally, those datasets are likely to be unbalanced and often result in an inefficient model convergence. To tackle this challenge, this study presents a novel approach that injects additional data points to improve the performance of the OD models. Using synthetic data generation, the difficulties in data collection and annotations for small object data points can be minimized and to create a dataset with balanced distribution. This paper discusses the effects of a simple proportional class-balancing technique, to enable better anchor matching of the OD models. A comparison was carried out on the performances of the SOTA OD models: YOLOv5, YOLOv7 and SSD, for combinations of real and synthetic datasets within an industrial use case.
The importance of proper data normalization for deep neural networks is well known. However, in continuous-time state-space model estimation, it has been observed that improper normalization of either the hidden state or hidden state derivative of the model estimate, or even of the time interval can lead to numerical and optimization challenges with deep learning based methods. This results in a reduced model quality. In this contribution, we show that these three normalization tasks are inherently coupled. Due to the existence of this coupling, we propose a solution to all three normalization challenges by introducing a normalization constant at the state derivative level. We show that the appropriate choice of the normalization constant is related to the dynamics of the to-be-identified system and we derive multiple methods of obtaining an effective normalization constant. We compare and discuss all the normalization strategies on a benchmark problem based on experimental data from a cascaded tanks system and compare our results with other methods of the identification literature.
The use of distributed optimization in machine learning can be motivated either by the resulting preservation of privacy or the increase in computational efficiency. On the one hand, training data might be stored across multiple devices. Training a global model within a network where each node only has access to its confidential data requires the use of distributed algorithms. Even if the data is not confidential, sharing it might be prohibitive due to bandwidth limitations. On the other hand, the ever-increasing amount of available data leads to large-scale machine learning problems. By splitting the training process across multiple nodes its efficiency can be significantly increased. This paper aims to demonstrate how dual decomposition can be applied for distributed training of $ K $-means clustering problems. After an overview of distributed and federated machine learning, the mixed-integer quadratically constrained programming-based formulation of the $ K $-means clustering training problem is presented. The training can be performed in a distributed manner by splitting the data across different nodes and linking these nodes through consensus constraints. Finally, the performance of the subgradient method, the bundle trust method, and the quasi-Newton dual ascent algorithm are evaluated on a set of benchmark problems. While the mixed-integer programming-based formulation of the clustering problems suffers from weak integer relaxations, the presented approach can potentially be used to enable an efficient solution in the future, both in a central and distributed setting.
Federated learning (FL) has gained significant traction as a privacy-preserving algorithm, but the underlying resembles of federated learning algorithm like Federated averaging (FED Avg) or Federated SGD (FED SGD) to ensemble learning algorithms has not been fully explored. The purpose of this paper is to examine the application of FL to object detection as a method to enhance generalizability, and to compare its performance against a centralized training approach for an object detection algorithm. Specifically, we investigate the performance of a YOLOv5 model trained using FL across multiple clients and employ a random sampling strategy without replacement, so each client holds a portion of the same dataset used for centralized training. Our experimental results showcase the superior efficiency of the FL object detector's global model in generating accurate bounding boxes for unseen objects, with the test set being a mixture of objects from two distinct clients not represented in the training dataset. These findings suggest that FL can be viewed from an ensemble algorithm perspective, akin to a synergistic blend of Bagging and Boosting techniques. As a result, FL can be seen not only as a method to enhance privacy, but also as a method to enhance the performance of a machine learning model.
Federated learning (FL) has emerged as a promising approach for training machine learning models on decentralized data without compromising data privacy. In this paper, we propose a FL algorithm for object detection in quality inspection tasks using YOLOv5 as the object detection algorithm and Federated Averaging (FedAvg) as the FL algorithm. We apply this approach to a manufacturing use-case where multiple factories/clients contribute data for training a global object detection model while preserving data privacy on a non-IID dataset. Our experiments demonstrate that our FL approach achieves better generalization performance on the overall clients' test dataset and generates improved bounding boxes around the objects compared to models trained using local clients' datasets. This work showcases the potential of FL for quality inspection tasks in the manufacturing industry and provides valuable insights into the performance and feasibility of utilizing YOLOv5 and FedAvg for federated object detection.
A vast amount of data is created every minute, both in the private sector and industry. Whereas it is often easy to get hold of data in the private entertainment sector, in the industrial production environment it is much more difficult due to laws, preservation of intellectual property, and other factors. However, most machine learning methods require a data source that is sufficient in terms of quantity and quality. A suitable way to bring both requirements together is federated learning where learning progress is aggregated, but everyone remains the owner of their data. Federate learning was first proposed by Google researchers in 2016 and is used for example in the improvement of Google's keyboard Gboard. In contrast to billions of android users, comparable machinery is only used by few companies. This paper examines which other constraints prevail in production and which federated learning approaches can be considered as a result.
Detecting small objects in video streams of head-worn augmented reality devices in near real-time is a huge challenge: training data is typically scarce, the input video stream can be of limited quality, and small objects are notoriously hard to detect. In industrial scenarios, however, it is often possible to leverage contextual knowledge for the detection of small objects. Furthermore, CAD data of objects are typically available and can be used to generate synthetic training data. We describe a near real-time small object detection pipeline for egocentric perception in a manual assembly scenario: We generate a training data set based on CAD data and realistic backgrounds in Unity. We then train a YOLOv4 model for a two-stage detection process: First, the context is recognized, then the small object of interest is detected. We evaluate our pipeline on the augmented reality device Microsoft Hololens 2.
A flexible operation of multiple robotic manipulators in a shared workspace requires an online trajectory planning with static and dynamic collision avoidance. In this work, we propose a real-time capable motion control algorithm, based on non-linear model predictive control, which accounts for static and dynamic collision avoidance. The proposed algorithm is formulated as a non-cooperative game, where each robot is considered as an agent. Each agent optimizes its own motion and accounts for the predicted movement of surrounding agents. We propose a novel approach to formulate the dynamic collision constraints. Additionally, we account for deadlocks that might occur in a setup of multiple robotic manipulators. We validate our algorithm on a pick and place scenario for four collaborative robots operating in a common workspace in the simulation environment Gazebo. The robots are controlled by the Robot Operating System (ROS). We demonstrate, that our approach is real-time capable and, due to the distributed nature of the approach, easily scales to an arbitrary number of robot manipulators in a shared workspace.
Due to their compliant structure, industrial robots without precision-enhancing measures are only to a limited extent suitable for machining applications. Apart from structural, thermal and bearing deformations, the main cause for compliant structure is backlash of transmission drives. This paper proposes a method to improve trajectory tracking accuracy by using secondary encoders and applying a feedback and a flatness based feed forward control strategy. For this purpose, a novel nonlinear, continuously differentiable dynamical model of a flexible robot joint is presented. The robot joint is modeled as a two-mass oscillator with pose-dependent inertia, nonlinear friction and nonlinear stiffness, including backlash. A flatness based feed forward control is designed to improve the guiding behaviour and a feedback controller, based on secondary encoders, is implemented for disturbance compensation. Using Automatic Differentiation, the nonlinear feed forward controller can be computed in a few microseconds online. Finally, the proposed algorithms are evaluated in simulations and experimentally on a real KUKA Quantec KR300 Ultra SE.