Scene Text Recognition


Scene text recognition is the process of identifying and transcribing text in natural scenes using computer vision techniques.

Hyper-Local Deformable Transformers for Text Spotting on Historical Maps

Jun 17, 2025
Via arXiv

Text-Aware Image Restoration with Diffusion Models

Jun 11, 2025
Via arXiv

EmoNet-Voice: A Fine-Grained, Expert-Verified Benchmark for Speech Emotion Detection

Jun 11, 2025
Via arXiv

Better Reasoning with Less Data: Enhancing VLMs Through Unified Modality Scoring

Jun 10, 2025
Via arXiv

Aligning Text, Images, and 3D Structure Token-by-Token

Jun 09, 2025
Via arXiv

Reading in the Dark with Foveated Event Vision

Jun 07, 2025
Via arXiv

TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance

May 29, 2025
Via arXiv

From Data to Modeling: Fully Open-vocabulary Scene Graph Generation

May 26, 2025
Via arXiv

Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)

May 26, 2025
Via arXiv

Place Recognition: A Comprehensive Review, Current Challenges and Future Directions

May 20, 2025
Via arXiv