The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation, particularly on multiclass imbalanced datasets with limited samples. DAU-FI Net integrates multiscale spatial-channel attention mechanisms and feature injection to enhance precision in object localization. The core employs a multiscale depth-separable convolution block, capturing localized patterns across scales. This block is complemented by a spatial-channel squeeze and excitation (scSE) attention unit, modeling inter-dependencies between channels and spatial regions in feature maps. Additionally, additive attention gates refine segmentation by connecting encoder-decoder pathways. To augment the model, engineered features using Gabor filters for textural analysis, Sobel and Canny filters for edge detection are injected guided by semantic masks to expand the feature space strategically. Comprehensive experiments on a challenging sewer pipe and culvert defect dataset and a benchmark dataset validate DAU-FI Net's capabilities. Ablation studies highlight incremental benefits from attention blocks and feature injection. DAU-FI Net achieves state-of-the-art mean Intersection over Union (IoU) of 95.6% and 98.8% on the defect test set and benchmark respectively, surpassing prior methods by 8.9% and 12.6%, respectively. Ablation studies highlight incremental benefits from attention blocks and feature injection. The proposed architecture provides a robust solution, advancing semantic segmentation for multiclass problems with limited training data. Our sewer-culvert defects dataset, featuring pixel-level annotations, opens avenues for further research in this crucial domain. Overall, this work delivers key innovations in architecture, attention, and feature engineering to elevate semantic segmentation efficacy.
Recent advancements have significantly improved the efficiency and effectiveness of deep learning methods for imagebased remote sensing tasks. However, the requirement for large amounts of labeled data can limit the applicability of deep neural networks to existing remote sensing datasets. To overcome this challenge, fewshot learning has emerged as a valuable approach for enabling learning with limited data. While previous research has evaluated the effectiveness of fewshot learning methods on satellite based datasets, little attention has been paid to exploring the applications of these methods to datasets obtained from UAVs, which are increasingly used in remote sensing studies. In this review, we provide an up to date overview of both existing and newly proposed fewshot classification techniques, along with appropriate datasets that are used for both satellite based and UAV based data. Our systematic approach demonstrates that fewshot learning can effectively adapt to the broader and more diverse perspectives that UAVbased platforms can provide. We also evaluate some SOTA fewshot approaches on a UAV disaster scene classification dataset, yielding promising results. We emphasize the importance of integrating XAI techniques like attention maps and prototype analysis to increase the transparency, accountability, and trustworthiness of fewshot models for remote sensing. Key challenges and future research directions are identified, including tailored fewshot methods for UAVs, extending to unseen tasks like segmentation, and developing optimized XAI techniques suited for fewshot remote sensing problems. This review aims to provide researchers and practitioners with an improved understanding of fewshot learnings capabilities and limitations in remote sensing, while highlighting open problems to guide future progress in efficient, reliable, and interpretable fewshot methods.
Incorporating deep learning (DL) classification models into unmanned aerial vehicles (UAVs) can significantly augment search-and-rescue operations and disaster management efforts. In such critical situations, the UAV's ability to promptly comprehend the crisis and optimally utilize its limited power and processing resources to narrow down search areas is crucial. Therefore, developing an efficient and lightweight method for scene classification is of utmost importance. However, current approaches tend to prioritize accuracy on benchmark datasets at the expense of computational efficiency. To address this shortcoming, we introduce the Wider ATTENTION EfficientNet (WATT-EffNet), a novel method that achieves higher accuracy with a more lightweight architecture compared to the baseline EfficientNet. The WATT-EffNet leverages width-wise incremental feature modules and attention mechanisms over width-wise features to ensure the network structure remains lightweight. We evaluate our method on a UAV-based aerial disaster image classification dataset and demonstrate that it outperforms the baseline by up to 15 times in terms of classification accuracy and 38.3% in terms of computing efficiency as measured by Floating Point Operations per second (FLOPs). Additionally, we conduct an ablation study to investigate the effect of varying the width of WATT-EffNet on accuracy and computational efficiency. Our code is available at \url{https://github.com/TanmDL/WATT-EffNet}.
In real world scenarios, out-of-distribution (OOD) datasets may have a large distributional shift from training datasets. This phenomena generally occurs when a trained classifier is deployed on varying dynamic environments, which causes a significant drop in performance. To tackle this issue, we are proposing an end-to-end deep multi-task network in this work. Observing a strong relationship between rotation prediction (self-supervised) accuracy and semantic classification accuracy on OOD tasks, we introduce an additional auxiliary classification head in our multi-task network along with semantic classification and rotation prediction head. To observe the influence of this addition classifier in improving the rotation prediction head, our proposed learning method is framed into bi-level optimisation problem where the upper-level is trained to update the parameters for semantic classification and rotation prediction head. In the lower-level optimisation, only the auxiliary classification head is updated through semantic classification head by fixing the parameters of the semantic classification head. The proposed method has been validated through three unseen OOD datasets where it exhibits a clear improvement in semantic classification accuracy than other two baseline methods. Our code is available on GitHub \url{https://github.com/harshita-555/OSSL}
Many real-world classification problems have imbalanced frequency of class labels; a well-known issue known as the "class imbalance" problem. Classic classification algorithms tend to be biased towards the majority class, leaving the classifier vulnerable to misclassification of the minority class. While the literature is rich with methods to fix this problem, as the dimensionality of the problem increases, many of these methods do not scale-up and the cost of running them become prohibitive. In this paper, we present an end-to-end deep generative classifier. We propose a domain-constraint autoencoder to preserve the latent-space as prior for a generator, which is then used to play an adversarial game with two other deep networks, a discriminator and a classifier. Extensive experiments are carried out on three different multi-class imbalanced problems and a comparison with state-of-the-art methods. Experimental results confirmed the superiority of our method over popular algorithms in handling high-dimensional imbalanced classification problems. Our code is available on https://github.com/TanmDL/SLPPL-GAN.
Traditional oversampling methods are generally employed to handle class imbalance in datasets. This oversampling approach is independent of the classifier; thus, it does not offer an end-to-end solution. To overcome this, we propose a three-player adversarial game-based end-to-end method, where a domain-constraints mixture of generators, a discriminator, and a multi-class classifier are used. Rather than adversarial minority oversampling, we propose an adversarial oversampling (AO) and a data-space oversampling (DO) approach. In AO, the generator updates by fooling both the classifier and discriminator, however, in DO, it updates by favoring the classifier and fooling the discriminator. While updating the classifier, it considers both the real and synthetically generated samples in AO. But, in DO, it favors the real samples and fools the subset class-specific generated samples. To mitigate the biases of a classifier towards the majority class, minority samples are over-sampled at a fractional rate. Such implementation is shown to provide more robust classification boundaries. The effectiveness of our proposed method has been validated with high-dimensional, highly imbalanced and large-scale multi-class tabular datasets. The results as measured by average class specific accuracy (ACSA) clearly indicate that the proposed method provides better classification accuracy (improvement in the range of 0.7% to 49.27%) as compared to the baseline classifier.
There exists an increasing demand of a flexible and computationally efficient controller for micro aerial vehicles (MAVs) due to a high degree of environmental perturbations. In this work, an evolving neuro-fuzzy controller namely Parsimonious Controller (PAC) is proposed and features less network parameters than conventional approaches due to the absence of rule premise parameters. PAC is built upon a recently developed evolving neuro-fuzzy system known as parsimonious learning machine (PALM) and adopts new rule growing and pruning modules derived from the approximation of bias and variance. These methods has no reliance on user-defined thresholds, thereby increasing its autonomy for the real-time deployment. PAC adapts the consequent parameters with the sliding mode control (SMC) theory in the single-pass fashion. The stability of our PAC is proven utilizing the Lyapunov stability analysis. Lastly, the controller's efficacy is evaluated by observing various trajectory tracking performance from a bio-inspired flapping wing micro aerial vehicle (BI-FWMAV) and a rotary wing micro aerial vehicle called hexacopter. Furthermore, it is compared against three distinctive controller. Our PAC outperforms the linear PID controller and generalized regression neural network (GRNN) based nonlinear adaptive controller. Compared to its predecessor, G-controller, the tracking accuracy is comparable but the PAC incurs significantly less parameters to attain similar or better performance than the G-controller.
Data stream has been the underlying challenge in the age of big data because it calls for real-time data processing with the absence of a retraining process and/or an iterative learning approach. In realm of fuzzy system community, data stream is handled by algorithmic development of self-adaptive neurofuzzy systems (SANFS) characterized by the single-pass learning mode and the open structure property which enables effective handling of fast and rapidly changing natures of data streams. The underlying bottleneck of SANFSs lies in its design principle which involves a high number of free parameters (rule premise and rule consequent) to be adapted in the training process. This figure can even double in the case of type-2 fuzzy system. In this work, a novel SANFS, namely parsimonious learning machine (PALM), is proposed. PALM features utilization of a new type of fuzzy rule based on the concept of hyperplane clustering which significantly reduces the number of network parameters because it has no rule premise parameters. PALM is proposed in both type-1 and type-2 fuzzy systems where all of which characterize a fully dynamic rule-based system. That is, it is capable of automatically generating, merging and tuning the hyperplane-based fuzzy rule in the single pass manner. Moreover, an extension of PALM, namely recurrent PALM (rPALM), is proposed and adopts the concept of teacher-forcing mechanism in the deep learning literature. The efficacy of PALM has been evaluated through numerical study with six real-world and synthetic data streams from public database and our own real-world project of autonomous vehicles. The proposed model showcases significant improvements in terms of computational complexity and number of required parameters against several renowned SANFSs, while attaining comparable and often better predictive accuracy.
Controlling of a flapping flight is one of the recent research topics related to the field of Flapping Wing Micro Air Vehicle (FW MAV). In this work, an adaptive control system for a four-wing FW MAV is proposed, inspired by its advanced features like quick flight, vertical take-off and landing, hovering, and fast turn, and enhanced manoeuvrability. Sliding Mode Control (SMC) theory has been used to develop the adaptation laws for the proposed adaptive fuzzy controller. The SMC theory confirms the closed-loop stability of the controller. The controller is utilized to control the altitude of the FW MAV, that can adapt to environmental disturbances by tuning the antecedent and consequent parameters of the fuzzy system.