Roei Herzig

TraveLER: A Multi-LMM Agent Framework for Video Question-Answering

Apr 01, 2024
Chuyi Shang, Amos You, Sanjay Subramanian, Trevor Darrell, Roei Herzig

Via arXiv

Unsupervised Universal Image Segmentation

Dec 28, 2023
Dantong Niu, Xudong Wang, Xinyang Han, Long Lian, Roei Herzig, Trevor Darrell

Via arXiv

Recursive Visual Programming

Dec 04, 2023
Jiaxin Ge, Sanjay Subramanian, Baifeng Shi, Roei Herzig, Trevor Darrell

Via arXiv

Object-based (yet Class-agnostic) Video Domain Adaptation

Nov 29, 2023
Dantong Niu, Amir Bar, Roei Herzig, Trevor Darrell, Anna Rohrbach

Via arXiv

Compositional Chain-of-Thought Prompting for Large Multimodal Models

Nov 27, 2023
Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig

Via arXiv

Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models

Jun 01, 2023
Sivan Doveh, Assaf Arbelle, Sivan Harary, Roei Herzig, Donghyun Kim, Paola Cascante-Bonilla, Amit Alfassy, Rameswar Panda, Raja Giryes, Rogerio Feris, Shimon Ullman, Leonid Karlinsky

Via arXiv

Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs

May 10, 2023
Roei Herzig, Alon Mendelson, Leonid Karlinsky, Assaf Arbelle, Rogerio Feris, Trevor Darrell, Amir Globerson

Via arXiv

PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data

Dec 08, 2022
Roei Herzig, Ofir Abramovich, Elad Ben-Avraham, Assaf Arbelle, Leonid Karlinsky, Ariel Shamir, Trevor Darrell, Amir Globerson

Via arXiv

Teaching Structured Vision&Language Concepts to Vision&Language Models

Nov 21, 2022
Sivan Doveh, Assaf Arbelle, Sivan Harary, Rameswar Panda, Roei Herzig, Eli Schwartz, Donghyun Kim, Raja Giryes, Rogerio Feris, Shimon Ullman, Leonid Karlinsky

Via arXiv

FETA: Towards Specializing Foundation Models for Expert Task Applications

Sep 08, 2022
Amit Alfassy, Assaf Arbelle, Oshri Halimi, Sivan Harary, Roei Herzig, Eli Schwartz, Rameswar Panda, Michele Dolfi, Christoph Auer, Kate Saenko, Peter W. J. Staar, Rogerio Feris, Leonid Karlinsky

Via arXiv