Abstract: Lightweight semantic segmentation is essential for many downstream vision tasks. Unfortunately, existing methods often struggle to balance efficiency and performance due to the complexity of feature modeling. Many of these approaches are constrained by rigid architectures and implicit representation learning, and are often characterized by parameter-heavy designs and a reliance on computationally intensive Vision Transformer-based frameworks. In this work, we introduce LeMoRe, an efficient paradigm that synergizes explicit and implicit modeling to balance computational efficiency with representational fidelity. Our method combines explicitly modeled views along well-defined Cartesian directions with implicitly inferred intermediate representations, efficiently capturing global dependencies through a nested attention mechanism. Extensive experiments on challenging datasets, including ADE20K, CityScapes, Pascal Context, and COCO-Stuff, demonstrate that LeMoRe strikes an effective balance between performance and efficiency.
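The abstract does not disclose LeMoRe's implementation, so the following is only a minimal sketch of how explicit Cartesian views and implicit tokens could be combined in a nested attention step; every module name, shape, and pooling choice below is an illustrative assumption rather than the authors' released code.

# Illustrative sketch only: the abstract does not specify LeMoRe's implementation,
# so module names, shapes, and the nesting scheme below are assumptions.
import torch
import torch.nn as nn


class NestedExplicitImplicitAttention(nn.Module):
    """Toy nested attention: pool features along Cartesian axes (explicit views),
    append learned tokens (implicit views), then let pixels attend to this
    compact set of views instead of the full spatial grid."""

    def __init__(self, dim: int, num_implicit: int = 4, heads: int = 4):
        super().__init__()
        self.implicit = nn.Parameter(torch.randn(1, num_implicit, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W)
        b, c, h, w = x.shape
        row_view = x.mean(dim=3).transpose(1, 2)   # (B, H, C): pooled along width
        col_view = x.mean(dim=2).transpose(1, 2)   # (B, W, C): pooled along height
        views = torch.cat(
            [row_view, col_view, self.implicit.expand(b, -1, -1)], dim=1
        )                                          # (B, H + W + num_implicit, C)
        queries = x.flatten(2).transpose(1, 2)     # (B, H*W, C)
        out, _ = self.attn(self.norm(queries), views, views)
        return (queries + out).transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 32)
    block = NestedExplicitImplicitAttention(dim=64)
    print(block(feats).shape)  # torch.Size([2, 64, 32, 32])

The key point of the sketch is the cost model: pixels attend to roughly H + W + a few learned tokens instead of H x W positions, which is how an explicit/implicit view decomposition can keep global modeling lightweight.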
Abstract: Semantic segmentation assigns labels to pixels in images, a critical yet challenging task in computer vision. Convolutional methods capture local dependencies well but struggle with long-range relationships. Vision Transformers (ViTs) excel at capturing global context but are hindered by high computational demands, especially for high-resolution inputs. Most research optimizes the encoder architecture, leaving the bottleneck underexplored, even though it is a key area for enhancing both performance and efficiency. We propose ContextFormer, a hybrid framework that leverages the strengths of CNNs and ViTs in the bottleneck to balance efficiency, accuracy, and robustness for real-time semantic segmentation. The framework's efficiency is driven by three synergistic modules: the Token Pyramid Extraction Module (TPEM) for hierarchical multi-scale representation, the Transformer and Modulating DepthwiseConv (Trans-MDC) block for dynamic, scale-aware feature modeling, and the Feature Merging Module (FMM) for robust integration with enhanced spatial and contextual consistency. Extensive experiments on the ADE20K, Pascal Context, CityScapes, and COCO-Stuff datasets show that ContextFormer significantly outperforms existing models, achieving state-of-the-art mIoU scores and setting a new benchmark for efficiency and performance. The code will be made publicly available.
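The abstract names TPEM, Trans-MDC, and FMM but does not detail their internals, so the sketch below is only one assumed way such a hybrid CNN/Transformer bottleneck could be wired together; every layer choice is a placeholder, not ContextFormer's actual design.

# Illustrative sketch only: all internals below are assumptions about how the
# three named bottleneck modules might be composed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TokenPyramidExtraction(nn.Module):
    """Assumed TPEM stand-in: strided depthwise-separable convs build a small pyramid."""

    def __init__(self, dim: int):
        super().__init__()
        self.down = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(dim, dim, 3, stride=2, padding=1, groups=dim),
                nn.Conv2d(dim, dim, 1),
                nn.BatchNorm2d(dim),
                nn.GELU(),
            )
            for _ in range(2)
        )

    def forward(self, x):
        pyramid = [x]
        for stage in self.down:
            pyramid.append(stage(pyramid[-1]))
        return pyramid  # [1x, 1/2x, 1/4x] feature maps


class TransMDC(nn.Module):
    """Assumed Trans-MDC stand-in: self-attention on the coarsest tokens, modulated
    by a depthwise conv branch that re-injects local scale cues."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.dwconv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)                  # (B, H*W, C)
        attn_out, _ = self.attn(self.norm(tokens), tokens, tokens)
        global_feat = attn_out.transpose(1, 2).reshape(b, c, h, w)
        return x + global_feat * torch.sigmoid(self.dwconv(x))  # conv-modulated attention


class FeatureMerging(nn.Module):
    """Assumed FMM stand-in: upsample the refined coarse features and fuse them
    back into the full-resolution map with a 1x1 projection."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Conv2d(2 * dim, dim, 1)

    def forward(self, fine, coarse):
        coarse = F.interpolate(coarse, size=fine.shape[-2:],
                               mode="bilinear", align_corners=False)
        return self.proj(torch.cat([fine, coarse], dim=1))


if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)
    pyramid = TokenPyramidExtraction(64)(x)     # multi-scale tokens
    refined = TransMDC(64)(pyramid[-1])         # global modeling on the coarsest level
    fused = FeatureMerging(64)(pyramid[0], refined)
    print(fused.shape)  # torch.Size([2, 64, 32, 32])

The intent of the sketch is to show why placing the hybrid stage in the bottleneck is cheap: attention runs only on the downsampled pyramid level, while the convolutional merge restores spatial detail at full resolution.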