Hengshuang Zhao

LogoSticker: Inserting Logos into Diffusion Models for Customized Generation

Jul 18, 2024
Via arXiv

ViLLa: Video Reasoning Segmentation with Large Language Model

Jul 18, 2024
Via arXiv

OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces

Jul 16, 2024
Via arXiv

HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

Jul 11, 2024
Via arXiv

Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images

Jul 08, 2024
Via arXiv

Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models

Jul 07, 2024
Via arXiv

Depth Anything V2

Jun 13, 2024
Via arXiv

Zero-shot Image Editing with Reference Imitation

Jun 11, 2024
Via arXiv

LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence

May 27, 2024
Via arXiv

OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation

Mar 28, 2024
Via arXiv