Abstract: Apples growing in natural environments are often heavily occluded by leaves and branches. This occlusion significantly increases the risk of false detections and makes object detection more challenging. To address this issue, we introduce a technique called "Occlusion-Enhanced Distillation" (OED). The approach uses occlusion information to regularize the learning of semantically aligned features on occluded datasets and employs an Exponential Moving Average (EMA) to enhance training stability. Specifically, we first design an occlusion-enhanced dataset that combines Grounding DINO and SAM to extract occluding elements such as leaves and branches from each sample, creating occlusion examples that reflect the natural growth state of fruits. We then propose a multi-scale knowledge distillation strategy in which the student network takes images with increased occlusions as input, while the teacher network takes images without natural occlusions. This setup guides the student network to learn from the teacher by aligning semantic and local features across scales, effectively narrowing the feature distance between occluded and non-occluded targets and enhancing the robustness of object detection. Lastly, to improve the stability of the student network, we introduce an EMA strategy, which helps the student network learn more generalized feature representations that are less affected by the noise of individual image occlusions. Extensive comparative experiments show that our method significantly outperforms current state-of-the-art techniques.
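
The two numerical ingredients named above, an exponential moving average and a feature distance summed over scales, can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the function names `ema_update` and `multi_scale_feature_distance`, the decay value, the use of mean squared error as the distance, and the representation of weights and features as flat lists of floats. The abstract does not specify the actual loss, which weights are averaged, or the network architecture.

```python
from typing import List, Sequence


def ema_update(ema_weights: Sequence[float],
               new_weights: Sequence[float],
               decay: float = 0.99) -> List[float]:
    """Blend new weights into a running average.

    Generic EMA rule: ema = decay * ema + (1 - decay) * new.
    The decay value is a hypothetical default, not taken from the paper.
    """
    return [decay * e + (1.0 - decay) * w
            for e, w in zip(ema_weights, new_weights)]


def multi_scale_feature_distance(student_feats: Sequence[Sequence[float]],
                                 teacher_feats: Sequence[Sequence[float]]) -> float:
    """Sum, over scales, of the mean squared difference between features.

    Each inner sequence stands in for one (flattened) feature map.
    MSE is an assumed choice of distance, used only for illustration.
    """
    total = 0.0
    for s, t in zip(student_feats, teacher_feats):
        total += sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)
    return total
```

In a training loop built on these assumptions, the distance would be added to the detection loss so that features of occluded inputs are pulled toward those of the teacher, and `ema_update` would be called after each optimizer step to smooth the weights.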




Abstract: We propose a novel unsupervised methodology named Disarranged Zone Learning (DZL) to automatically recognize stenosis in coronary angiography. The methodology first disarranges the frames in a video, then generates an effective zone, and finally trains an encoder-decoder GRU model to learn to recover the disarranged frames. The key finding of our study is the discovery and validation that Sequence Intensity (Recover Difficulty) is a measure of coronary artery stenosis status. Hence, the prediction accuracy of DZL is used as an approximator of a coronary stenosis indicator. DZL is an unsupervised methodology that requires no label engineering effort; the GRU sub-model in DZL works as a self-supervised approach. DZL could therefore, in theory, use unlimited amounts of coronary angiographies to learn and improve performance without laborious data labeling. DZL has no data preprocessing precondition because it dynamically uses the whole video, so it is easy to implement and to generalize across the data heterogeneity of coronary angiography. For this pure methodology, the overall average precision score reaches 0.93 and the AUC reaches 0.8. The highest segmented average precision score is 0.98 and the highest segmented AUC is 0.87 for the coronary occlusion indicator. Finally, we developed a software demo that implements the DZL methodology.
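
The first step of the pipeline, disarranging frames so that the original order becomes a free training target, can be sketched as follows. This is a minimal illustration under stated assumptions: the names `disarrange` and `recovery_accuracy` are hypothetical, frames are treated as opaque objects, the effective-zone step and the encoder-decoder GRU are omitted entirely, and accuracy is computed as the fraction of positions whose original index is predicted exactly, which may differ from the measure used in the paper.

```python
import random
from typing import List, Sequence, Tuple


def disarrange(frames: Sequence, rng: random.Random) -> Tuple[list, List[int]]:
    """Shuffle frames and return (shuffled_frames, order).

    order[k] is the original index of the frame now at position k,
    so the label comes from the data itself and needs no annotation.
    """
    order = list(range(len(frames)))
    rng.shuffle(order)
    shuffled = [frames[i] for i in order]
    return shuffled, order


def recovery_accuracy(predicted_order: Sequence[int],
                      true_order: Sequence[int]) -> float:
    """Fraction of positions where the predicted original index is correct."""
    if not true_order:
        return 0.0
    hits = sum(p == t for p, t in zip(predicted_order, true_order))
    return hits / len(true_order)
```

A model trained on `(shuffled, order)` pairs would output a predicted order for each clip; under the abstract's premise, how hard that order is to recover serves as the stenosis signal.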