Shuting He

RSGround-R1: Rethinking Remote Sensing Visual Grounding through Spatial Reasoning

Jan 29, 2026

DiffStyle3D: Consistent 3D Gaussian Stylization via Attention Optimization

Jan 27, 2026

MeViS: A Multi-Modal Dataset for Referring Motion Expression Video Segmentation

Dec 11, 2025

SplitFlux: Learning to Decouple Content and Style from a Single Image

Nov 19, 2025

ReferSplat: Referring Segmentation in 3D Gaussian Splatting

Aug 11, 2025

FantasyStyle: Controllable Stylized Distillation for 3D Gaussian Splatting

Aug 11, 2025

MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes

Aug 07, 2025

Multimodal Referring Segmentation: A Survey

Aug 01, 2025

PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild

Apr 15, 2025

Taylor Series-Inspired Local Structure Fitting Network for Few-shot Point Cloud Semantic Segmentation

Apr 03, 2025