Xiangyun Chen

Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models

Mar 22, 2026

Jinghan Cao, Yu Ma, Xinjin Li, Qingyang Ren, Xiangyun Chen

Abstract: Large Language Models achieve remarkable performance but incur substantial computational costs unsuitable for resource-constrained deployments. This paper presents the first comprehensive task-specific efficiency analysis comparing 16 language models across five diverse NLP tasks. We introduce the Performance-Efficiency Ratio (PER), a novel metric integrating accuracy, throughput, memory, and latency through geometric mean normalization. Our systematic evaluation reveals that small models (0.5–3B parameters) achieve superior PER scores across all given tasks. These findings establish quantitative foundations for deploying small models in production environments prioritizing inference efficiency over marginal accuracy gains.
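
The abstract does not give the exact PER formula, so the following is only a minimal sketch of one way a metric of this kind could be computed. It assumes that each of the four quantities is normalized against the best value among the compared models (higher is better for accuracy and throughput, lower is better for memory and latency) and that the four normalized scores are combined with an unweighted geometric mean. The function name `per_scores`, the dictionary keys, and all numbers are hypothetical illustrations, not the paper's definition or results.

```python
from math import prod


def per_scores(models):
    """Return a PER-style score in (0, 1] for each model.

    `models` maps a model name to a dict with the keys
    "accuracy", "throughput", "memory", and "latency".
    ASSUMPTION: best-in-set normalization and an unweighted
    geometric mean; the paper's actual formula may differ.
    """
    best_acc = max(m["accuracy"] for m in models.values())
    best_tput = max(m["throughput"] for m in models.values())
    min_mem = min(m["memory"] for m in models.values())
    min_lat = min(m["latency"] for m in models.values())

    scores = {}
    for name, m in models.items():
        parts = [
            m["accuracy"] / best_acc,      # higher is better
            m["throughput"] / best_tput,   # higher is better
            min_mem / m["memory"],         # lower is better
            min_lat / m["latency"],        # lower is better
        ]
        # Geometric mean of the four normalized components.
        scores[name] = prod(parts) ** (1.0 / len(parts))
    return scores


# Hypothetical measurements for illustration only (not from the paper).
example = {
    "small": {"accuracy": 0.80, "throughput": 100.0, "memory": 2.0, "latency": 10.0},
    "large": {"accuracy": 0.90, "throughput": 20.0, "memory": 16.0, "latency": 50.0},
}
scores = per_scores(example)
```

Under these assumptions, a model that is best on every axis scores exactly 1, and because the geometric mean multiplies the components, a large deficit in any single one (for example memory) pulls the overall score down sharply. In the made-up example, the small model's slightly lower accuracy is outweighed by its efficiency advantages.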

* Accepted for publication at ESANN 2025. This is a task-specific efficiency analysis comparing small language models
