PromptSculptor: Multi-Agent Based Text-to-Image Prompt Optimization

Sep 15, 2025

Dawei Xiang, Wenyan Xu, Kexin Chu, Zixu Shen, Tianqi Ding, Wei Zhang

Abstract: The rapid advancement of generative AI has democratized access to powerful tools such as Text-to-Image (T2I) models. However, to generate high-quality images, users must still craft detailed prompts specifying scene, style, and context, often through multiple rounds of refinement. We propose PromptSculptor, a novel multi-agent framework that automates this iterative prompt optimization process. Our system decomposes the task into four specialized agents that work collaboratively to transform a short, vague user prompt into a comprehensive, refined prompt. By leveraging Chain-of-Thought reasoning, our framework effectively infers hidden context and enriches scene and background details. To iteratively refine the prompt, a self-evaluation agent aligns the modified prompt with the original input, while a feedback-tuning agent incorporates user feedback for further refinement. Experimental results demonstrate that PromptSculptor significantly enhances output quality and reduces the number of iterations needed for user satisfaction. Moreover, its model-agnostic design allows seamless integration with various T2I models, paving the way for industrial applications.

* Accepted to EMNLP 2025 System Demonstration Track
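
The four-agent flow described in the abstract can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract names only the self-evaluation and feedback-tuning agents, so `infer_intent` and `enrich_scene` are hypothetical labels for the steps that infer hidden context and enrich scene details. The `LLM` callable, the prompt wording, and the single-pass control flow are likewise illustrative.

```python
from typing import Callable, Optional

# Assumption: any text-in, text-out model call can play the role of an agent.
LLM = Callable[[str], str]


def infer_intent(llm: LLM, user_prompt: str) -> str:
    # Hypothetical agent 1: reason step by step about the hidden context.
    return llm(f"Think step by step about what the user wants to depict: {user_prompt}")


def enrich_scene(llm: LLM, user_prompt: str, intent: str) -> str:
    # Hypothetical agent 2: expand the short prompt with scene and background details.
    return llm(f"Write a detailed image prompt for '{user_prompt}' given this intent: {intent}")


def self_evaluate(llm: LLM, user_prompt: str, draft: str) -> str:
    # Self-evaluation agent: keep the draft aligned with the original input.
    return llm(f"Revise the draft so it stays aligned with '{user_prompt}': {draft}")


def feedback_tune(llm: LLM, draft: str, feedback: str) -> str:
    # Feedback-tuning agent: fold user feedback into the refined prompt.
    return llm(f"Apply the user feedback '{feedback}' to this prompt: {draft}")


def sculpt(llm: LLM, user_prompt: str, feedback: Optional[str] = None) -> str:
    """Run the agents in sequence; feedback tuning is applied only when feedback is given."""
    intent = infer_intent(llm, user_prompt)
    draft = enrich_scene(llm, user_prompt, intent)
    draft = self_evaluate(llm, user_prompt, draft)
    if feedback:
        draft = feedback_tune(llm, draft, feedback)
    return draft
```

Because `sculpt` only returns text, the result can be passed to any T2I model, which is one way to read the model-agnostic claim in the abstract.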
