Errui Ding

LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction

Jul 16, 2024

OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection

Jul 15, 2024

OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer

Jul 15, 2024

XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis

Jun 27, 2024

VDG: Vision-Only Dynamic Gaussian for Driving Simulation

Jun 26, 2024

Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting

Jun 13, 2024

LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection

Jun 05, 2024

StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond

Jun 04, 2024

OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding

Jun 04, 2024

Towards Unified Multi-granularity Text Detection with Interactive Attention

May 30, 2024