Abstract:Estimating the 3D pose of a drone is important for anti-drone systems, but existing methods struggle with the unique challenges of drone keypoint detection. Drone propellers serve as keypoints but are difficult to detect due to their high visual similarity and diversity of poses. To address these challenges, we propose DroneKey, a framework that combines a 2D keypoint detector and a 3D pose estimator specifically designed for drones. In the keypoint detection stage, we extract two key-representations (intermediate and compact) from each transformer encoder layer and optimally combine them using a gated sum. We also introduce a pose-adaptive Mahalanobis distance in the loss function to ensure stable keypoint predictions across extreme poses. We built new datasets of drone 2D keypoints and 3D pose to train and evaluate our method, which have been publicly released. Experiments show that our method achieves an AP of 99.68% (OKS) in keypoint detection, outperforming existing methods. Ablation studies confirm that the pose-adaptive Mahalanobis loss function improves keypoint prediction stability and accuracy. Additionally, improvements in the encoder design enable real-time processing at 44 FPS. For 3D pose estimation, our method achieved an MAE-angle of 10.62{\deg}, an RMSE of 0.221m, and an MAE-absolute of 0.076m, demonstrating high accuracy and reliability. The code and dataset are available at https://github.com/kkanuseobin/DroneKey.
Abstract:The flatfish is a major farmed species consumed globally in large quantities. However, due to the densely populated farming environment, flatfish are susceptible to injuries and diseases, making early disease detection crucial. Traditionally, diseases were detected through visual inspection, but observing large numbers of fish is challenging. Automated approaches based on deep learning technologies have been widely used, to address this problem, but accurate detection remains difficult due to the diversity of the fish and the lack of the fish disease dataset. In this study, augments fish disease images using generative adversarial networks and image harmonization methods. Next, disease detectors are trained separately for three body parts (head, fins, and body) to address individual diseases properly. In addition, a flatfish disease image dataset called \texttt{FlatIMG} is created and verified on the dataset using the proposed methods. A flash salmon disease dataset is also tested to validate the generalizability of the proposed methods. The results achieved 12\% higher performance than the baseline framework. This study is the first attempt to create a large-scale flatfish disease image dataset and propose an effective disease detection framework. Automatic disease monitoring could be achieved in farming environments based on the proposed methods and dataset.