Roei Herzig

Latent Implicit Visual Reasoning

Dec 24, 2025, via arXiv

DAVE: A VLM Vision Encoder for Document Understanding and Web Agents

Dec 19, 2025, via arXiv

Activation Reward Models for Few-Shot Model Alignment

Jul 02, 2025, via arXiv

TULIP: Towards Unified Language-Image Pretraining

Mar 19, 2025, via arXiv

Visualizing Thought: Conceptual Diagrams Enable Robust Planning in LMMs

Mar 14, 2025, via arXiv

Pre-training Auto-regressive Robotic Models with 4D Representations

Feb 18, 2025, via arXiv

Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers

Nov 28, 2024, via arXiv

In-Context Learning Enables Robot Action Prediction in LLMs

Oct 16, 2024, via arXiv

Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning

Jun 21, 2024, via arXiv

Navigating the Labyrinth: Evaluating and Enhancing LLMs' Ability to Reason About Search Problems

Jun 18, 2024, via arXiv