Florian Brand

The ATOM Report: Measuring the Open Language Model Ecosystem

Apr 08, 2026

Nathan Lambert, Florian Brand

Abstract: We present a comprehensive adoption snapshot of the leading open language models and who is building them, focusing on the ~1.5K mainline open models from the likes of Alibaba's Qwen, DeepSeek, and Meta's Llama that are the foundation of an ecosystem crucial to researchers, entrepreneurs, and policy advisors. We document a clear trend in which Chinese models overtook their counterparts built in the U.S. in the summer of 2025 and subsequently widened the gap over their Western counterparts. We study a mix of Hugging Face downloads and model derivatives, inference market share, performance metrics, and more to build a comprehensive picture of the ecosystem.

* 23 pages, 17 figures


ReadBench: Measuring the Dense Text Visual Reading Ability of Vision-Language Models

May 25, 2025

Benjamin Clavié, Florian Brand

Abstract: Recent advancements in Large Vision-Language Models (VLMs) have greatly enhanced their capability to jointly process text and images. However, despite extensive benchmarks evaluating visual comprehension (e.g., diagrams, color schemes, OCR tasks), there is limited assessment of VLMs' ability to read and reason about text-rich images effectively. To fill this gap, we introduce ReadBench, a multimodal benchmark specifically designed to evaluate the reading comprehension capabilities of VLMs. ReadBench transposes contexts from established text-only benchmarks into images of text while keeping textual prompts and questions intact. Evaluating leading VLMs with ReadBench, we find minimal-but-present performance degradation on short text-image inputs, while performance sharply declines for longer, multi-page contexts. Our experiments further reveal that text resolution has negligible effects on multimodal performance. These findings highlight needed improvements in VLMs, particularly in their reasoning over extensive, visually presented textual content, a capability critical for practical applications. ReadBench is available at https://github.com/answerdotai/ReadBench.
