Zhaoyu Chen

PG-Attack: A Precision-Guided Adversarial Attack Framework Against Vision Foundation Models for Autonomous Driving

Jul 18, 2024

Large Vision-Language Models as Emotion Recognizers in Context Awareness

Jul 16, 2024

Towards Context-Aware Emotion Recognition Debiasing from a Causal Demystification Perspective via De-confounded Training

Jul 06, 2024

Self-Cooperation Knowledge Distillation for Novel Class Discovery

Jul 02, 2024

LVOS: A Benchmark for Large-scale Long-term Video Object Segmentation

May 01, 2024

De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts

Mar 28, 2024

Improving Adversarial Transferability of Visual-Language Pre-training Models through Collaborative Multimodal Interaction

Mar 16, 2024

OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

Mar 14, 2024

ClickVOS: Click Video Object Segmentation

Mar 10, 2024

Towards Multimodal Human Intention Understanding Debiasing via Subject-Deconfounding

Mar 08, 2024