Trevor Darrell

Latent Implicit Visual Reasoning

Dec 24, 2025

Visually Prompted Benchmarks Are Surprisingly Fragile

Dec 19, 2025

DAVE: A VLM Vision Encoder for Document Understanding and Web Agents

Dec 19, 2025

FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos

Dec 11, 2025

UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity

Nov 17, 2025

Discovering Divergent Representations between Text-to-Image Models

Sep 10, 2025

Reconstruction Alignment Improves Unified Multimodal Models

Sep 08, 2025

MultiGen: Using Multimodal Generation in Simulation to Learn Multimodal Policies in Real

Jul 03, 2025

Activation Reward Models for Few-Shot Model Alignment

Jul 02, 2025

Whole-Body Conditioned Egocentric Video Prediction

Jun 26, 2025