Abstract:Ensuring food safety is critical due to its profound impact on public health, economic stability, and global supply chains. Cultivation of Mango, a major agricultural product in several South Asian countries, faces high financial losses due to different diseases, affecting various aspects of the entire supply chain. While deep learning-based methods have been explored for mango leaf disease classification, there remains a gap in designing solutions that are computationally efficient and compatible with low-end devices. In this work, we propose a lightweight Vision Transformer-based pipeline with a self-attention mechanism to classify mango leaf diseases, achieving state-of-the-art performance with minimal computational overhead. Our approach leverages global attention to capture intricate patterns among disease types and incorporates runtime augmentation for enhanced performance. Evaluation on the MangoLeafBD dataset demonstrates a 99.43% accuracy, outperforming existing methods in terms of model size, parameter count, and FLOPs count.