Zhiding Yu

SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving

Jun 15, 2023

Real-Time Radiance Fields for Single-Image Portrait View Synthesis

May 03, 2023

Prismer: A Vision-Language Model with An Ensemble of Experts

Mar 12, 2023

VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

Feb 23, 2023

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning

Feb 09, 2023

Vision Transformers Are Good Mask Auto-Labelers

Jan 10, 2023

1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track

Nov 07, 2022

Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models

Sep 15, 2022

PointDP: Diffusion-driven Purification against Adversarial Attacks on 3D Point Cloud Recognition

Aug 21, 2022

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training

Aug 03, 2022