Abstract:Person re-identification (ReID) in surveillance is challenged by occlusion, viewpoint distortion, and poor image quality. Most existing methods rely on complex modules or perform well only on clear frontal images. We propose Sh-ViT (Shuffling Vision Transformer), a lightweight and robust model for occluded person ReID. Built on ViT-Base, Sh-ViT introduces three components: First, a Shuffle module in the final Transformer layer to break spatial correlations and enhance robustness to occlusion and blur; Second, scenario-adapted augmentation (geometric transforms, erasing, blur, and color adjustment) to simulate surveillance conditions; Third, DeiT-based knowledge distillation to improve learning with limited labels.To support real-world evaluation, we construct the MyTT dataset, containing over 10,000 pedestrians and 30,000+ images from base station inspections, with frequent equipment occlusion and camera variations. Experiments show that Sh-ViT achieves 83.2% Rank-1 and 80.1% mAP on MyTT, outperforming CNN and ViT baselines, and 94.6% Rank-1 and 87.5% mAP on Market1501, surpassing state-of-the-art methods.In summary, Sh-ViT improves robustness to occlusion and blur without external modules, offering a practical solution for surveillance-based personnel monitoring.
Abstract:The proliferation of the Internet of Things (IoT) and widespread use of devices with sensing, computing, and communication capabilities have motivated intelligent applications empowered by artificial intelligence. The classical artificial intelligence algorithms require centralized data collection and processing which are challenging in realistic intelligent IoT applications due to growing data privacy concerns and distributed datasets. Federated Learning (FL) has emerged as a distributed privacy-preserving learning framework that enables IoT devices to train global model through sharing model parameters. However, inefficiency due to frequent parameters transmissions significantly reduce FL performance. Existing acceleration algorithms consist of two main type including local update considering trade-offs between communication and computation and parameter compression considering trade-offs between communication and precision. Jointly considering these two trade-offs and adaptively balancing their impacts on convergence have remained unresolved. To solve the problem, this paper proposes a novel efficient adaptive federated optimization (EAFO) algorithm to improve efficiency of FL, which minimizes the learning error via jointly considering two variables including local update and parameter compression and enables FL to adaptively adjust the two variables and balance trade-offs among computation, communication and precision. The experiment results illustrate that comparing with state-of-the-art algorithms, the proposed EAFO can achieve higher accuracies faster.