Ashish Choithani

Do Thought Streams Matter? Evaluating Reasoning in Gemini Vision-Language Models for Video Scene Understanding

Apr 13, 2026
Via arXiv

Benchmarking Vision-Language Models on Optical Character Recognition in Dynamic Video Environments

Feb 10, 2025
Via arXiv