Lei Zhang

Sid

Responsible Visual Editing

Apr 08, 2024
Via arXiv

ToolEENet: Tool Affordance 6D Pose Estimation

Apr 05, 2024
Via arXiv

Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks

Apr 04, 2024
Via arXiv

Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models

Mar 26, 2024
Via arXiv

Implicit Discriminative Knowledge Learning for Visible-Infrared Person Re-Identification

Mar 26, 2024
Via arXiv

An Open-World, Diverse, Cross-Spatial-Temporal Benchmark for Dynamic Wild Person Re-Identification

Mar 22, 2024
Via arXiv

T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Mar 21, 2024
Via arXiv

MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation

Mar 21, 2024
Via arXiv

Compress3D: a Compressed Latent Space for 3D Generation from a Single Image

Mar 20, 2024
Via arXiv

IVAC-P2L: Leveraging Irregular Repetition Priors for Improving Video Action Counting

Mar 20, 2024
Via arXiv