Abstract: Tactile-based reinforcement learning (RL) is currently hindered by fragmented research and a focus on over-saturated orientation tasks. We introduce v2 of the Robot Tactile Olympiad (\texttt{roto 2.0}), a GPU-parallelised benchmark designed to standardise tactile-based RL across four distinct robotic morphologies (16-DOF to 24-DOF). Unlike prior benchmarks, roto focuses on end-to-end "blind" manipulation, utilising only proprioception and tactile sensing without state information or distillation. We demonstrate a significant performance leap, with our blind agents achieving 13 Baoding ball rotations in 10 seconds, an order of magnitude faster than current state-of-the-art speeds. By open-sourcing our environments and robustly tuned baselines, we reduce the barrier to entry and enable researchers to prioritise fundamental algorithmic challenges over tedious RL tuning. Website: https://elle-miller.github.io/roto/
Abstract: The global outbreak of Mpox virus, classified as a Public Health Emergency of International Concern by the WHO, presents significant diagnostic challenges because Mpox lesions visually resemble those of other skin diseases. Current clinical detection techniques are limited in accuracy and efficiency, necessitating improved automated diagnostic solutions. This study introduces a novel Cascaded Atrous Group Attention (CAGA) module, specifically designed to enhance multi-scale feature representation while optimizing computational efficiency. By integrating CAGA with EfficientViT-L1 as the backbone architecture, our approach achieves state-of-the-art performance with a score of 0.98 on the MCSI dataset, while reducing model parameters by 37.5% compared to the original EfficientViT-L1. This reduction in computational complexity maintains diagnostic accuracy while enabling broader deployment across resource-constrained healthcare settings. Extensive validation on two other benchmark datasets, MSID and MSLD, demonstrates the model's robustness, with the model consistently outperforming existing approaches. Our findings suggest that CAGA's efficient feature extraction mechanism could be adapted for other medical imaging tasks requiring fine-grained visual discrimination.