Google DeepMind
Abstract: Existing methods for preference tuning of text-to-image (T2I) diffusion models often rely on computationally expensive generation steps to create positive and negative image pairs. These approaches frequently yield training pairs that lack meaningful differences, are expensive to sample and filter, or exhibit significant variance in irrelevant pixel regions, thereby degrading training efficiency. To address these limitations, we introduce "Di3PO", a novel method for constructing positive and negative pairs that isolates the specific regions targeted for improvement during preference tuning while keeping the surrounding image context stable. We demonstrate the efficacy of our approach by applying it to the challenging task of text rendering in diffusion models, showcasing improvements over SFT and DPO baselines.
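
To make the pair-construction idea concrete, here is a minimal, hypothetical sketch (not the paper's implementation): it composes a preferred and a dispreferred image that share the same context and differ only inside a masked target region, which is the property the abstract describes. The function name `make_region_pair` and the simple mask-compositing approach are assumptions for illustration.

```python
# Illustrative sketch (assumed, not Di3PO's actual pipeline): build a preference
# pair that differs only inside a target region, keeping the rest identical.
import numpy as np

def make_region_pair(base_image: np.ndarray,
                     region_mask: np.ndarray,
                     good_region: np.ndarray,
                     bad_region: np.ndarray):
    """Compose a (preferred, dispreferred) image pair.

    base_image:  H x W x 3 context image shared by both samples.
    region_mask: H x W boolean mask of the region targeted for improvement
                 (e.g. the area where text should be rendered).
    good_region / bad_region: H x W x 3 candidate renderings of that region
                 (e.g. correctly vs. incorrectly rendered text).
    """
    m = region_mask[..., None].astype(base_image.dtype)    # broadcast mask to channels
    preferred = base_image * (1 - m) + good_region * m     # shared context + good region
    dispreferred = base_image * (1 - m) + bad_region * m   # same context + bad region
    return preferred, dispreferred

# Toy usage with random data: the two samples agree everywhere outside the mask,
# so the preference signal is concentrated in the targeted region.
rng = np.random.default_rng(0)
base = rng.random((64, 64, 3))
mask = np.zeros((64, 64), dtype=bool)
mask[20:40, 10:54] = True
good, bad = rng.random((64, 64, 3)), rng.random((64, 64, 3))
pref, dispref = make_region_pair(base, mask, good, bad)
assert np.allclose(pref[~mask], dispref[~mask])  # identical outside the target region
```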




Abstract: We present a framework for high-fidelity product image recontextualization using text-to-image diffusion models and a novel data augmentation pipeline. This pipeline leverages image-to-video diffusion, inpainting and outpainting, and negative examples to create synthetic training data, addressing the limitations of real-world data collection for this task. Our method improves the quality and diversity of generated images by disentangling product representations and enhancing the model's understanding of product characteristics. Evaluation on the ABO dataset and a private product dataset, using automated metrics and human assessment, demonstrates the effectiveness of our framework in generating realistic and compelling product visualizations, with implications for applications such as e-commerce and virtual product showcasing.
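
The following is a hypothetical skeleton of such a synthetic-data pipeline, under stated assumptions: the generator functions are placeholders for the image-to-video and in/outpainting diffusion models, and the construction of negatives (a product paired with a scene generated from a different product) is an illustrative guess at what "negatives" could mean here, not the paper's definition.

```python
# Hypothetical pipeline skeleton; model calls are stand-in placeholders.
from dataclasses import dataclass
from typing import Iterator, List
import numpy as np

@dataclass
class TrainingExample:
    product: np.ndarray   # clean product image
    scene: np.ndarray     # recontextualized target image
    is_negative: bool     # True for deliberately mismatched pairs

def image_to_video_frames(product: np.ndarray, num_frames: int = 4) -> List[np.ndarray]:
    """Placeholder for an image-to-video diffusion model returning novel views/poses."""
    return [product.copy() for _ in range(num_frames)]

def outpaint_background(frame: np.ndarray) -> np.ndarray:
    """Placeholder for in/outpainting that embeds the product in a new context."""
    return frame.copy()

def build_examples(products: List[np.ndarray]) -> Iterator[TrainingExample]:
    rng = np.random.default_rng(0)
    for product in products:
        for frame in image_to_video_frames(product):
            scene = outpaint_background(frame)
            yield TrainingExample(product, scene, is_negative=False)
        # Assumed negative: pair the product with a scene from a different product,
        # encouraging the model to keep product identity and context disentangled.
        other = products[rng.integers(len(products))]
        yield TrainingExample(product, outpaint_background(other), is_negative=True)

examples = list(build_examples([np.zeros((8, 8, 3)), np.ones((8, 8, 3))]))
```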




Abstract: We systematically evaluate the performance of deep learning models in the presence of diseases that were not labeled or present during training. First, we evaluate whether deep learning models trained on a subset of diseases (seen diseases) can detect the presence of any one of a larger set of diseases. We find that models tend to falsely classify diseases outside this subset (unseen diseases) as "no disease". Second, we evaluate whether models trained on seen diseases can still detect them when they co-occur with unseen diseases. We find that models are still able to detect seen diseases even when unseen diseases are present. Third, we evaluate whether the feature representations learned by these models can be used to detect unseen diseases given a small labeled set of examples. We find that the penultimate layer of the deep neural network provides useful features for unseen disease detection. Our results can inform the safe clinical deployment of deep learning models trained on a non-exhaustive set of disease classes.
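
A minimal sketch of the third finding, under the assumption that it amounts to fitting a lightweight probe on frozen penultimate-layer features: the embeddings below are simulated with random data for illustration, whereas in practice they would be extracted from the trained model on the small labeled set of unseen-disease cases and controls.

```python
# Illustrative sketch: detect an unseen disease with a linear probe on
# (simulated) penultimate-layer features, given a small labeled set.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
feature_dim = 256
n_labeled = 200  # the "small labeled set" of unseen-disease cases and controls

# Simulated penultimate-layer embeddings; positives are shifted along one direction.
direction = rng.normal(size=feature_dim)
X_neg = rng.normal(size=(n_labeled // 2, feature_dim))
X_pos = rng.normal(size=(n_labeled // 2, feature_dim)) + 0.5 * direction
X = np.vstack([X_neg, X_pos])
y = np.concatenate([np.zeros(n_labeled // 2), np.ones(n_labeled // 2)])

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# A linear probe on frozen features is enough if the representation already
# carries signal about the unseen disease.
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```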