Seunghyun Park

Watermarking for Factuality: Guiding Vision-Language Models Toward Truth via Tri-layer Contrastive Decoding

Oct 16, 2025

CREPE: Coordinate-Aware End-to-End Document Parser

May 01, 2024

HyperCLOVA X Technical Report

Apr 13, 2024

EGTR: Extracting Graph from Transformer for Scene Graph Generation

Apr 05, 2024

Cream: Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models

May 24, 2023

Grounding Visual Representations with Texts for Domain Generalization

Jul 21, 2022

An Embedding-Dynamic Approach to Self-supervised Learning

Jul 07, 2022

DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting

Mar 10, 2022

Semi-Structured Query Grounding for Document-Oriented Databases with Deep Retrieval and Its Application to Receipt and POI Matching

Feb 23, 2022

Donut: Document Understanding Transformer without OCR

Nov 30, 2021