Title: EAM: Enhancing Anything with Diffusion Transformers for Blind Super-Resolution

May 08, 2025

Haizhen Xie, Kunpeng Du, Qiangyu Yan, Sen Lu, Jianhong Han, Hanting Chen, Hailin Hu, Jie Hu

Abstract: Utilizing pre-trained Text-to-Image (T2I) diffusion models to guide Blind Super-Resolution (BSR) has become a predominant approach in the field. While T2I models have traditionally relied on U-Net architectures, recent advancements have demonstrated that Diffusion Transformers (DiT) achieve significantly higher performance in this domain. In this work, we introduce Enhancing Anything Model (EAM), a novel BSR method that leverages DiT and outperforms previous U-Net-based approaches. We introduce a novel block, $\Psi$-DiT, which effectively guides the DiT to enhance image restoration. This block employs a low-resolution latent as a separable flow injection control, forming a triple-flow architecture that effectively leverages the prior knowledge embedded in the pre-trained DiT. To fully exploit the prior guidance capabilities of T2I models and enhance their generalization in BSR, we introduce a progressive Masked Image Modeling strategy, which also reduces training costs. Additionally, we propose a subject-aware prompt generation strategy that employs a robust multi-modal model in an in-context learning framework. This strategy automatically identifies key image areas, provides detailed descriptions, and optimizes the utilization of T2I diffusion priors. Our experiments demonstrate that EAM achieves state-of-the-art results across multiple datasets, outperforming existing methods in both quantitative metrics and visual quality.