Heng Tao Shen

GeoPurify: A Data-Efficient Geometric Distillation Framework for Open-Vocabulary 3D Segmentation

Oct 02, 2025
Via arXiv

Unified modality separation: A vision-language framework for unsupervised domain adaptation

Aug 07, 2025
Via arXiv

Implicit Counterfactual Learning for Audio-Visual Segmentation

Jul 28, 2025
Via arXiv

Multimodal Mathematical Reasoning with Diverse Solving Perspective

Jul 03, 2025
Via arXiv

SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism

Jul 02, 2025
Via arXiv

Rethinking Range-View LiDAR Segmentation in Adverse Weather

Jun 10, 2025
Via arXiv

Truth in the Few: High-Value Data Selection for Efficient Multi-Modal Reasoning

Jun 05, 2025
Via arXiv

InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning

May 20, 2025
Via arXiv

Policy Contrastive Decoding for Robotic Foundation Models

May 19, 2025
Via arXiv

Towards Generalized and Training-Free Text-Guided Semantic Manipulation

Apr 24, 2025
Via arXiv