Abstract:This study addresses the issues of semantic entanglement, unclear label structure, and insufficient feature representation in few-shot text classification, and proposes an optimization framework based on structured prompts to enhance semantic understanding and task adaptation under low-resource conditions. The framework first uses a pretrained language model to encode the input text and obtain basic semantic representations. It then introduces structured prompts composed of multi-dimensional semantic factors and integrates them with text features through a learnable combination mechanism, which forms task-related representations with clear boundaries in the latent space. To further strengthen the consistency between text representations and label semantics, the method constructs a structured label embedding matrix and employs a cross-space alignment mechanism to ensure stable matching between textual features and label attributes. In addition, the model applies prompt orthogonality constraints and a joint optimization objective to maintain independence across different semantic factors in the prompts, allowing the structured prompts to provide transparent and controllable guidance for classification decisions. Three types of sensitivity experiments, including learning rate sensitivity, prompt length sensitivity, and data scale sensitivity, are designed to evaluate the stability and robustness of the framework under different conditions. Experimental results show that the proposed structured prompt optimization framework effectively alleviates semantic conflicts and label ambiguity in few-shot text classification. It significantly improves performance on accuracy, precision, recall, and AUC, and demonstrates strong cross-task applicability.
Abstract:Many real-world applications, such as interactive photo retouching, artistic content creation, and product design, require flexible and iterative image editing. However, existing image editing methods primarily focus on achieving the desired modifications in a single step, which often struggles with ambiguous user intent, complex transformations, or the need for progressive refinements. As a result, these methods frequently produce inconsistent outcomes or fail to meet user expectations. To address these challenges, we propose a multi-turn image editing framework that enables users to iteratively refine their edits, progressively achieving more satisfactory results. Our approach leverages flow matching for accurate image inversion and a dual-objective Linear Quadratic Regulators (LQR) for stable sampling, effectively mitigating error accumulation. Additionally, by analyzing the layer-wise roles of transformers, we introduce a adaptive attention highlighting method that enhances editability while preserving multi-turn coherence. Extensive experiments demonstrate that our framework significantly improves edit success rates and visual fidelity compared to existing methods.