Howard Zhou

TIPS: Text-Image Pretraining with Spatial Awareness

Oct 21, 2024
Via arXiv

ConDense: Consistent 2D/3D Pre-training for Dense and Sparse Features from Multi-View Images

Aug 30, 2024
Via arXiv

HAMMR: HierArchical MultiModal React agents for generic VQA

Apr 08, 2024
Via arXiv

Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use

Mar 05, 2024
Via arXiv

NAVI: Category-Agnostic Image Collections with High-Quality 3D Shape and Pose Annotations

Jun 15, 2023
Via arXiv

Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories

Jun 15, 2023
Via arXiv

LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D Signals

Mar 22, 2023
Via arXiv

IBRNet: Learning Multi-View Image-Based Rendering

Feb 25, 2021
Via arXiv

Unifying Specialist Image Embedding into Universal Image Embedding

Mar 08, 2020
Via arXiv

The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition

Oct 18, 2016
Via arXiv