Panoptic Segmentation


Panoptic segmentation is a computer vision task that unifies semantic segmentation and instance segmentation to give a complete, per-pixel description of a scene. Every pixel in the image is assigned a semantic label, and pixels belonging to "thing" classes (countable objects such as cars and people) are additionally assigned a unique instance ID so that individual objects can be told apart. Pixels belonging to "stuff" classes (amorphous regions such as sky or road) carry only the semantic label.
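
The labeling scheme described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the class table (THING_CLASSES), the LABEL_DIVISOR value, and both function names are assumptions made for the example. Encoding each pixel as class ID times a divisor plus instance ID is one common convention; datasets and toolkits differ in the details.

```python
import numpy as np

# Hypothetical class table for illustration only.
# 0 = road (stuff), 1 = car (thing), 2 = person (thing)
THING_CLASSES = [1, 2]
# Assumed upper bound on the number of instances per class.
LABEL_DIVISOR = 1000


def merge_to_panoptic(semantic, instance):
    """Combine a semantic map and an instance map into one panoptic map.

    semantic: (H, W) integer array of class IDs.
    instance: (H, W) integer array of instance IDs (ignored on stuff pixels).
    Returns an (H, W) array where each pixel holds class * divisor + instance.
    """
    semantic = np.asarray(semantic, dtype=np.int64)
    instance = np.asarray(instance, dtype=np.int64)
    is_thing = np.isin(semantic, THING_CLASSES)
    # Stuff pixels get instance ID 0, so each stuff class forms one segment.
    instance = np.where(is_thing, instance, 0)
    return semantic * LABEL_DIVISOR + instance


def split_panoptic(panoptic):
    """Recover the (semantic, instance) pair from a panoptic map."""
    panoptic = np.asarray(panoptic, dtype=np.int64)
    return panoptic // LABEL_DIVISOR, panoptic % LABEL_DIVISOR


if __name__ == "__main__":
    semantic = [[0, 1, 1, 1],
                [0, 0, 2, 2]]
    # The stray instance ID 5 sits on a road pixel and is discarded.
    instance = [[0, 1, 2, 2],
                [5, 0, 1, 1]]
    print(merge_to_panoptic(semantic, instance))
```

In this toy input the two cars share the semantic label 1 but receive different panoptic IDs, while all road pixels collapse into a single segment regardless of what the instance map contains there.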

Leverage Cross-Attention for End-to-End Open-Vocabulary Panoptic Reconstruction

Jan 02, 2025

Layer-wise Model Merging for Unsupervised Domain Adaptation in Segmentation Tasks

Sep 24, 2024

A Simple and Generalist Approach for Panoptic Segmentation

Aug 29, 2024

PLGS: Robust Panoptic Lifting with 3D Gaussian Splatting

Oct 23, 2024

Condition-Aware Multimodal Fusion for Robust Semantic Perception of Driving Scenes

Oct 14, 2024

Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation

Oct 16, 2024

PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM

Dec 31, 2024

Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding

Sep 12, 2024

3rd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation

Jun 07, 2024

@Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology

Sep 21, 2024