Geewook Kim

MMRefine: Unveiling the Obstacles to Robust Refinement in Multimodal Large Language Models

Jun 05, 2025
Via arXiv

Evaluating Multimodal Generative AI with Korean Educational Standards

Feb 21, 2025
Via arXiv

How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?

Oct 10, 2024
Via arXiv

On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning

Jun 17, 2024
Via arXiv

CREPE: Coordinate-Aware End-to-End Document Parser

May 01, 2024
Via arXiv

HyperCLOVA X Technical Report

Apr 13, 2024
Via arXiv

Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation

Jan 12, 2024
Via arXiv

SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap

Sep 21, 2023
Via arXiv

Cream: Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models

May 24, 2023
Via arXiv

Technical Report on Web-based Visual Corpus Construction for Visual Document Understanding

Nov 07, 2022
Via arXiv