
Roy Ganz

DREAM: Deep Research Evaluation with Agentic Metrics

Feb 21, 2026

DODO: Discrete OCR Diffusion Models

Feb 18, 2026

DocVLM: Make Your VLM an Efficient Reader

Dec 11, 2024

TAP-VL: Text Layout-Aware Pre-training for Enriched Vision-Language Models

Nov 07, 2024

Text-to-Image Generation Via Energy-Based CLIP

Aug 30, 2024

Adversaries With Incentives: A Strategic Alternative to Adversarial Robustness

Jun 17, 2024

Enhancing Consistency-Based Image Generation via Adversarialy-Trained Classification and Energy-Based Discrimination

May 25, 2024

Paint by Inpaint: Learning to Add Image Objects by Removing Them First

Apr 28, 2024

Question Aware Vision Transformer for Multimodal Reasoning

Feb 08, 2024

GRAM: Global Reasoning for Multi-Page VQA

Jan 07, 2024