Zhenyu He

Fusing in 3D: Free-Viewpoint Fusion Rendering with a 3D Infrared-Visible Scene Representation

Jan 19, 2026
Via arXiv

Modality-Decoupled RGB-Thermal Object Detector via Query Fusion

Jan 13, 2026
Via arXiv

FusionFM: All-in-One Multi-Modal Image Fusion with Flow Matching

Nov 17, 2025
Via arXiv

Learning A Robust RGB-Thermal Detector for Extreme Modality Imbalance

May 28, 2025
Via arXiv

Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning

Feb 12, 2025
Via arXiv

ZeroBP: Learning Position-Aware Correspondence for Zero-shot 6D Pose Estimation in Bin-Picking

Feb 03, 2025
Via arXiv

PRSI: Privacy-Preserving Recommendation Model Based on Vector Splitting and Interactive Protocols

Nov 27, 2024
Via arXiv

MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking

Nov 23, 2024
Via arXiv

LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation

Sep 09, 2024
Via arXiv

Discriminative Spatial-Semantic VOS Solution: 1st Place Solution for 6th LSVOS

Aug 29, 2024
Via arXiv