Title: Does It Run and Is That Enough? Revisiting Text-to-Chart Generation with a Multi-Agent Approach

Jun 06, 2025

James Ford, Anthony Rios

Abstract: Large language models can translate natural-language chart descriptions into runnable code, yet approximately 15% of the generated scripts still fail to execute, even after supervised fine-tuning and reinforcement learning. We investigate whether this persistent error rate stems from model limitations or from reliance on a single-prompt design. To explore this, we propose a lightweight multi-agent pipeline that separates drafting, execution, repair, and judgment, using only an off-the-shelf GPT-4o-mini model. On the Text2Chart31 benchmark, our system reduces execution errors to 4.5% within three repair iterations, outperforming the strongest fine-tuned baseline by nearly 5 percentage points while requiring significantly less compute. Similar performance is observed on the ChartX benchmark, with an error rate of 4.6%, demonstrating strong generalization. Under current benchmarks, execution success appears largely solved. However, manual review reveals that 6 out of 100 sampled charts contain hallucinations, and an LLM-based accessibility audit shows that only 33.3% (Text2Chart31) and 7.2% (ChartX) of generated charts satisfy basic colorblindness guidelines. These findings suggest that future work should shift focus from execution reliability toward improving chart aesthetics, semantic fidelity, and accessibility.

* 8 pages
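
The draft, execute, repair, judge loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the prompts, agent interfaces, or judge criteria, so `draft`, `repair`, and `judge` are hypothetical callables that would wrap LLM calls in practice, and the choice to judge only scripts that run is an assumption. The cap of three repair iterations follows the abstract.

```python
import os
import subprocess
import sys
import tempfile
from typing import Callable, Optional, Tuple


def execute(script: str, timeout: float = 30.0) -> Tuple[bool, str]:
    """Execution step: run the script in a fresh interpreter, return (ok, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(script)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=timeout
        )
        return proc.returncode == 0, proc.stderr
    except subprocess.TimeoutExpired:
        return False, "timeout"
    finally:
        os.unlink(path)


def generate_chart(
    description: str,
    draft: Callable[[str], str],             # hypothetical drafting agent
    repair: Callable[[str, str, str], str],  # hypothetical repair agent
    judge: Callable[[str, str], str],        # hypothetical judging agent
    max_repairs: int = 3,                    # "three repair iterations" per the abstract
) -> Tuple[str, bool, int, Optional[str]]:
    """Return (final script, ran successfully, repairs used, judge verdict)."""
    script = draft(description)
    ok, stderr = execute(script)
    repairs = 0
    while not ok and repairs < max_repairs:
        # Feed the error message back to the repair agent and try again.
        script = repair(description, script, stderr)
        ok, stderr = execute(script)
        repairs += 1
    # Assumption: only scripts that execute are passed to the judge.
    verdict = judge(description, script) if ok else None
    return script, ok, repairs, verdict


# Stand-in agents for demonstration only; real agents would call an LLM.
def stub_draft(description: str) -> str:
    return "print(undefined_name)"  # deliberately broken first draft


def stub_repair(description: str, script: str, stderr: str) -> str:
    return "print('chart rendered')"


def stub_judge(description: str, script: str) -> str:
    return "accept"


final_script, ran, repairs_used, verdict = generate_chart(
    "bar chart of sales by region", stub_draft, stub_repair, stub_judge
)
```

Running each candidate script in a separate interpreter process keeps a crashing or hanging chart script from taking down the pipeline itself, which is one plausible reason to treat execution as its own step.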
