Abstract:Medical image segmentation is constrained by sparse pathological annotations. Existing augmentation strategies, from conventional transforms to random masking for self-supervision, are feature-agnostic: they often corrupt critical diagnostic semantics or fail to prioritize essential features. We introduce "Keep the Core," a novel data-centric paradigm that uses adversarial priors to guide both augmentation and masking in a significance-preserving manner. Our approach uses SAGE (Sparse Adversarial Gated Estimator), an offline module identifying minimal tokens whose micro-perturbation flips segmentation boundaries. SAGE forges the Token Importance Map $W$ by solving an adversarial optimization problem to maximally degrade performance, while an $\ell_1$ sparsity penalty encourages a compact set of sensitive tokens. The online KEEP (Key-region Enhancement \& Preservation) module uses $W$ for a two-pronged augmentation strategy: (1) Semantic-Preserving Augmentation: High-importance tokens are augmented, but their original pixel values are strictly restored. (2) Guided-Masking Augmentation: Low-importance tokens are selectively masked for an $\text{MAE}$-style reconstruction, forcing the model to learn robust representations from preserved critical features. "Keep the Core" is backbone-agnostic with no inference overhead. Extensive experiments show SAGE's structured priors and KEEP's region-selective mechanism are highly complementary, achieving state-of-the-art segmentation robustness and generalization on 2D medical datasets.
Abstract:State Space models (SSMs) such as PointMamba enable efficient feature extraction for point cloud self-supervised learning with linear complexity, outperforming Transformers in computational efficiency. However, existing PointMamba-based methods depend on complex token ordering and random masking, which disrupt spatial continuity and local semantic correlations. We propose ZigzagPointMamba to tackle these challenges. The core of our approach is a simple zigzag scan path that globally sequences point cloud tokens, enhancing spatial continuity by preserving the proximity of spatially adjacent point tokens. Nevertheless, random masking undermines local semantic modeling in self-supervised learning. To address this, we introduce a Semantic-Siamese Masking Strategy (SMS), which masks semantically similar tokens to facilitate reconstruction by integrating local features of original and similar tokens. This overcomes the dependence on isolated local features and enables robust global semantic modeling. Our pre-trained ZigzagPointMamba weights significantly improve downstream tasks, achieving a 1.59% mIoU gain on ShapeNetPart for part segmentation, a 0.4% higher accuracy on ModelNet40 for classification, and 0.19%, 1.22%, and 0.72% higher accuracies respectively for the classification tasks on the OBJ-BG, OBJ-ONLY, and PB-T50-RS subsets of ScanObjectNN. The code is available at: https://anonymous.4open.science/r/ZigzagPointMamba-1800/