Kaining Ying

Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation

Jul 30, 2025

MOVE: Motion-Guided Few-Shot Video Object Segmentation

Jul 29, 2025

MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Apr 24, 2024

CTVIS: Consistent Training for Online Video Instance Segmentation

Jul 24, 2023

Human-to-Human Interaction Detection

Jul 02, 2023

ISDA: Position-Aware Instance Segmentation with Deformable Attention

Feb 23, 2022