Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jupeng Ding

TMP: Tree-structured Mixed-policy Pruning for Large-scale Image Generation and Editing

Jun 25, 2026

Peizhen Zhang, Yang Li, Xunsong Li, Songtao Liu, Zewen Liu, Qiangqiang Hu, Guotong Guo, Jupeng Ding, Yifu Sun, coopersli(+3 more)

Abstract:Modern image generation model rapidly grows their sizes to meet high-fidelity image synthesis. However, they gradually become unaffordable for their enormous parameter consumption and computation budget that lead to massive resources requirement and gpu memory footprint. In this paper, we propose TMP, the first Tree-structured Mixed-policy Pruning framework that generalizes prevalent image tasks (T2I and TI2I) and architectures (Mixture-of-Experts (MoE) and Diffusion transformer (DiT)). It could be applied to the step-distilled models and contribute as the last stage. We perform experiments upon current open-sourced SOTA HunyuanImage-3.0 instruct and a popular efficient model Z-Image turbo. The proposed pruning framework manages to compress HunyuanImage 3.0 from 80B to 20B parameters at 75% reduction ratio, sacrificing limited generation quality. We also optimize to enable the inference of the pruned 20B version of HunyuanImage 3.0 on a single 24GB 4090 GPU by engineering skills. The inference script and model weight have been integrated into the existing HunyuanImage3.0 open-source github and huggingface repository. Besides, we prove the efficacy of TMP by compressing Z-Image turbo from 6B to 4B (33% reduction) with negligible degradation.

* 10 pages, 3 figures, 3 tables, tech report

Via

Access Paper or Ask Questions

To Tune or Not To Tune? How About the Best of Both Worlds?

Jul 09, 2019

Ran Wang, Haibo Su, Chunye Wang, Kailin Ji, Jupeng Ding

Figure 1 for To Tune or Not To Tune? How About the Best of Both Worlds?

Figure 2 for To Tune or Not To Tune? How About the Best of Both Worlds?

Figure 3 for To Tune or Not To Tune? How About the Best of Both Worlds?

Figure 4 for To Tune or Not To Tune? How About the Best of Both Worlds?

Abstract:The introduction of pre-trained language models has revolutionized natural language research communities. However, researchers still know relatively little regarding their theoretical and empirical properties. In this regard, Peters et al. perform several experiments which demonstrate that it is better to adapt BERT with a light-weight task-specific head, rather than building a complex one on top of the pre-trained language model, and freeze the parameters in the said language model. However, there is another option to adopt. In this paper, we propose a new adaptation method which we first train the task model with the BERT parameters frozen and then fine-tune the entire model together. Our experimental results show that our model adaptation method can achieve 4.7% accuracy improvement in semantic similarity task, 0.99% accuracy improvement in sequence labeling task and 0.72% accuracy improvement in the text classification task.

Via

Access Paper or Ask Questions