Xingjian He

The Instance-centric Transformer for the RVOS Track of LSVOS Challenge: 3rd Place Solution

Aug 20, 2024
Via arXiv

PVUW 2024 Challenge on Complex Video Understanding: Methods and Results

Jun 24, 2024
Via arXiv

2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation

Jun 20, 2024
Via arXiv

Fuse & Calibrate: A bi-directional Vision-Language Guided Framework for Referring Image Segmentation

May 18, 2024
Via arXiv

Calibration & Reconstruction: Deep Integrated Language for Referring Image Segmentation

Apr 12, 2024
Via arXiv

SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models

Mar 20, 2024
Via arXiv

Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions

Feb 17, 2024
Via arXiv

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Dec 13, 2023
Via arXiv

EAVL: Explicitly Align Vision and Language for Referring Image Segmentation

Aug 22, 2023
Via arXiv

COSA: Concatenated Sample Pretrained Vision-Language Foundation Model

Jun 15, 2023
Via arXiv