Scene Text Recognition


Scene text recognition is the process of identifying and transcribing text in natural scenes using computer vision techniques.

TiCLS: Tightly Coupled Language Text Spotter

Feb 03, 2026
Via arXiv

Text is All You Need for Vision-Language Model Jailbreaking

Jan 31, 2026
Via arXiv

TIGaussian: Disentangle Gaussians for Spatial-Awared Text-Image-3D Alignment

Jan 27, 2026
Via arXiv

Text-Pass Filter: An Efficient Scene Text Detector

Jan 26, 2026
Via arXiv

Evaluating the encoding competence of visual language models using uncommon actions

Jan 12, 2026
Via arXiv

MTMCS-Bench: Evaluating Contextual Safety of Multimodal Large Language Models in Multi-Turn Dialogues

Jan 11, 2026
Via arXiv

Leveraging 2D-VLM for Label-Free 3D Segmentation in Large-Scale Outdoor Scene Understanding

Jan 05, 2026
Via arXiv

EarthVL: A Progressive Earth Vision-Language Understanding and Generation Framework

Jan 06, 2026
Via arXiv

Bridging Modalities and Transferring Knowledge: Enhanced Multimodal Understanding and Recognition

Dec 23, 2025
Via arXiv

Text2Graph VPR: A Text-to-Graph Expert System for Explainable Place Recognition in Changing Environments

Dec 21, 2025
Via arXiv