Scene Text Recognition


Scene text recognition is the process of identifying and transcribing text in natural scenes using computer vision techniques.

Large OCR Model: An Empirical Study of Scaling Law for OCR

Jan 02, 2024

Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following

Jun 06, 2024

HierCode: A Lightweight Hierarchical Codebook for Zero-shot Chinese Text Recognition

Mar 20, 2024

SceneTextGen: Layout-Agnostic Scene Text Image Synthesis with Diffusion Models

Jun 03, 2024

Recognition-Guided Diffusion Model for Scene Text Image Super-Resolution

Nov 22, 2023

Scene Text Recognition Models Explainability Using Local Features

Oct 14, 2023

Learning 3D Robotics Perception using Inductive Priors

May 30, 2024

Scene Text Image Super-resolution based on Text-conditional Diffusion Models

Nov 16, 2023

Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition

Oct 10, 2023

Adversarial Training with OCR Modality Perturbation for Scene-Text Visual Question Answering

Mar 14, 2024