Scene Text Recognition


Scene text recognition is the process of identifying and transcribing text in natural scenes using computer vision techniques.

LMAD: Integrated End-to-End Vision-Language Model for Explainable Autonomous Driving

Aug 17, 2025

MiDashengLM: Efficient Audio Understanding with General Audio Captions

Aug 06, 2025

Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis

Jul 15, 2025

TransLPRNet: Lite Vision-Language Network for Single/Dual-line Chinese License Plate Recognition

Jul 23, 2025

Detecting Visual Information Manipulation Attacks in Augmented Reality: A Multimodal Semantic Reasoning Approach

Jul 27, 2025

Efficient and Accurate Scene Text Recognition with Cascaded-Transformers

Mar 24, 2025

Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation

Mar 20, 2025

Team RAS in 9th ABAW Competition: Multimodal Compound Expression Recognition Approach

Jul 02, 2025

Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition

Mar 24, 2025

A Context-Driven Training-Free Network for Lightweight Scene Text Segmentation and Recognition

Mar 19, 2025