Derek Hoiem

FRAME: Pre-Training Video Feature Representations via Anticipation and Memory

Jun 05, 2025
Via arXiv

TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models

May 29, 2025
Via arXiv

PARTONOMY: Large Multimodal Models with Part-Level Visual Understanding

May 27, 2025
Via arXiv

REN: Fast and Efficient Region Encodings from Patch-Based Image Encoders

May 23, 2025
Via arXiv

Can We Generate Visual Programs Without Prompting LLMs?

Dec 11, 2024
Via arXiv

RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations

Dec 02, 2024
Via arXiv

Anytime Continual Learning for Open Vocabulary Classification

Sep 13, 2024
Via arXiv

MonoPatchNeRF: Improving Neural Radiance Fields with Patch-based Monocular Guidance

Apr 12, 2024
Via arXiv

Region-Based Representations Revisited

Feb 04, 2024
Via arXiv

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

Dec 28, 2023
Via arXiv