Muskaan Kumar

V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM

May 24, 2024
Via arXiv