Text Extraction From Documents


Text extraction from documents is the process of recovering machine-readable text from scanned documents or images.

Application of deep learning approaches for medieval historical documents transcription

Dec 21, 2025
Via arXiv

A Large-Language-Model Framework for Automated Humanitarian Situation Reporting

Dec 22, 2025
Via arXiv

Event Extraction in Large Language Model

Dec 22, 2025
Via arXiv

Seeing Justice Clearly: Handwritten Legal Document Translation with OCR and Vision-Language Models

Dec 19, 2025
Via arXiv

MFE-GAN: Efficient GAN-based Framework for Document Image Enhancement and Binarization with Multi-scale Feature Extraction

Dec 16, 2025
Via arXiv

Uni-Parser Technical Report

Dec 17, 2025
Via arXiv

Building from Scratch: A Multi-Agent Framework with Human-in-the-Loop for Multilingual Legal Terminology Mapping

Dec 15, 2025
Via arXiv

Benchmarking Document Parsers on Mathematical Formula Extraction from PDFs

Dec 10, 2025
Via arXiv

VisKnow: Constructing Visual Knowledge Base for Object Understanding

Dec 09, 2025
Via arXiv

MedDCR: Learning to Design Agentic Workflows for Medical Coding

Nov 17, 2025
Via arXiv