Zhiding Yu

VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

Feb 23, 2023

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning

Feb 09, 2023

Vision Transformers Are Good Mask Auto-Labelers

Jan 10, 2023

1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track

Nov 07, 2022

Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models

Sep 15, 2022

PointDP: Diffusion-driven Purification against Adversarial Attacks on 3D Point Cloud Recognition

Aug 21, 2022

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training

Aug 03, 2022

How Much More Data Do I Need? Estimating Requirements for Downstream Tasks

Jul 13, 2022

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

May 27, 2022

Understanding The Robustness in Vision Transformers

Apr 27, 2022