Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jong Chul Ye

KAIST Graduate School of AI

Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance

Jun 07, 2023

Gihyun Kwon, Jong Chul Ye

Abstract:Diffusion models have shown significant progress in image translation tasks recently. However, due to their stochastic nature, there's often a trade-off between style transformation and content preservation. Current strategies aim to disentangle style and content, preserving the source image's structure while successfully transitioning from a source to a target domain under text or one-shot image conditions. Yet, these methods often require computationally intense fine-tuning of diffusion models or additional neural networks. To address these challenges, here we present an approach that guides the reverse process of diffusion sampling by applying asymmetric gradient guidance. This results in quicker and more stable image manipulation for both text-guided and image-guided image translation. Our model's adaptability allows it to be implemented with both image- and latent-diffusion models. Experiments show that our method outperforms various state-of-the-art models in image translation tasks.

Via

Access Paper or Ask Questions

Unpaired Deep Learning for Pharmacokinetic Parameter Estimation from Dynamic Contrast-Enhanced MRI

Jun 07, 2023

Gyutaek Oh, Won-Jin Moon, Jong Chul Ye

Figure 1 for Unpaired Deep Learning for Pharmacokinetic Parameter Estimation from Dynamic Contrast-Enhanced MRI

Figure 2 for Unpaired Deep Learning for Pharmacokinetic Parameter Estimation from Dynamic Contrast-Enhanced MRI

Figure 3 for Unpaired Deep Learning for Pharmacokinetic Parameter Estimation from Dynamic Contrast-Enhanced MRI

Figure 4 for Unpaired Deep Learning for Pharmacokinetic Parameter Estimation from Dynamic Contrast-Enhanced MRI

Abstract:DCE-MRI provides information about vascular permeability and tissue perfusion through the acquisition of pharmacokinetic parameters. However, traditional methods for estimating these pharmacokinetic parameters involve fitting tracer kinetic models, which often suffer from computational complexity and low accuracy due to noisy arterial input function (AIF) measurements. Although some deep learning approaches have been proposed to tackle these challenges, most existing methods rely on supervised learning that requires paired input DCE-MRI and labeled pharmacokinetic parameter maps. This dependency on labeled data introduces significant time and resource constraints, as well as potential noise in the labels, making supervised learning methods often impractical. To address these limitations, here we present a novel unpaired deep learning method for estimating both pharmacokinetic parameters and the AIF using a physics-driven CycleGAN approach. Our proposed CycleGAN framework is designed based on the underlying physics model, resulting in a simpler architecture with a single generator and discriminator pair. Crucially, our experimental results indicate that our method, which does not necessitate separate AIF measurements, produces more reliable pharmacokinetic parameters than other techniques.

Via

Access Paper or Ask Questions

Direct Diffusion Bridge using Data Consistency for Inverse Problems

May 31, 2023

Hyungjin Chung, Jeongsol Kim, Jong Chul Ye

Figure 1 for Direct Diffusion Bridge using Data Consistency for Inverse Problems

Figure 2 for Direct Diffusion Bridge using Data Consistency for Inverse Problems

Figure 3 for Direct Diffusion Bridge using Data Consistency for Inverse Problems

Figure 4 for Direct Diffusion Bridge using Data Consistency for Inverse Problems

Abstract:Diffusion model-based inverse problem solvers have shown impressive performance, but are limited in speed, mostly as they require reverse diffusion sampling starting from noise. Several recent works have tried to alleviate this problem by building a diffusion process, directly bridging the clean and the corrupted for specific inverse problems. In this paper, we first unify these existing works under the name Direct Diffusion Bridges (DDB), showing that while motivated by different theories, the resulting algorithms only differ in the choice of parameters. Then, we highlight a critical limitation of the current DDB framework, namely that it does not ensure data consistency. To address this problem, we propose a modified inference procedure that imposes data consistency without the need for fine-tuning. We term the resulting method data Consistent DDB (CDDB), which outperforms its inconsistent counterpart in terms of both perception and distortion metrics, thereby effectively pushing the Pareto-frontier toward the optimum. Our proposed method achieves state-of-the-art results on both evaluation criteria, showcasing its superiority over existing methods.

* 16 pages, 6 figures

Via

Access Paper or Ask Questions

Score-based Diffusion Models for Bayesian Image Reconstruction

May 25, 2023

Michael T. McCann, Hyungjin Chung, Jong Chul Ye, Marc L. Klasky

Abstract:This paper explores the use of score-based diffusion models for Bayesian image reconstruction. Diffusion models are an efficient tool for generative modeling. Diffusion models can also be used for solving image reconstruction problems. We present a simple and flexible algorithm for training a diffusion model and using it for maximum a posteriori reconstruction, minimum mean square error reconstruction, and posterior sampling. We present experiments on both a linear and a nonlinear reconstruction problem that highlight the strengths and limitations of the approach.

* 5 pages, 3 figures

Via

Access Paper or Ask Questions

Data Topology-Dependent Upper Bounds of Neural Network Widths

May 25, 2023

Sangmin Lee, Jong Chul Ye

Figure 1 for Data Topology-Dependent Upper Bounds of Neural Network Widths

Figure 2 for Data Topology-Dependent Upper Bounds of Neural Network Widths

Figure 3 for Data Topology-Dependent Upper Bounds of Neural Network Widths

Figure 4 for Data Topology-Dependent Upper Bounds of Neural Network Widths

Abstract:This paper investigates the relationship between the universal approximation property of deep neural networks and topological characteristics of datasets. Our primary contribution is to introduce data topology-dependent upper bounds on the network width. Specifically, we first show that a three-layer neural network, applying a ReLU activation function and max pooling, can be designed to approximate an indicator function over a compact set, one that is encompassed by a tight convex polytope. This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure. Further, we calculate upper bounds in relation to the Betti numbers of select topological spaces. Finally, we prove the universal approximation property of three-layer ReLU networks using our topological approach. We also verify that gradient descent converges to the network structure proposed in our study.

Via

Access Paper or Ask Questions

Unpaired Image-to-Image Translation via Neural Schrödinger Bridge

May 24, 2023

Beomsu Kim, Gihyun Kwon, Kwanyoung Kim, Jong Chul Ye

Abstract:Diffusion models are a powerful class of generative models which simulate stochastic differential equations (SDEs) to generate data from noise. Although diffusion models have achieved remarkable progress in recent years, they have limitations in the unpaired image-to-image translation tasks due to the Gaussian prior assumption. Schr\"odinger Bridge (SB), which learns an SDE to translate between two arbitrary distributions, have risen as an attractive solution to this problem. However, none of SB models so far have been successful at unpaired translation between high-resolution images. In this work, we propose the Unpaired Neural Schr\"odinger Bridge (UNSB), which combines SB with adversarial training and regularization to learn a SB between unpaired data. We demonstrate that UNSB is scalable, and that it successfully solves various unpaired image-to-image translation tasks. Code: \url{https://github.com/cyclomon/UNSB}

Via

Access Paper or Ask Questions

LLM Itself Can Read and Generate CXR Images

May 24, 2023

Suhyeon Lee, Won Jun Kim, Jong Chul Ye

Abstract:Building on the recent remarkable development of large language models (LLMs), active attempts are being made to extend the utility of LLMs to multimodal tasks. There have been previous efforts to link language and visual information, and attempts to add visual capabilities to LLMs are ongoing as well. However, existing attempts use LLMs only as image decoders and no attempt has been made to generate images in the same line as the natural language. By adopting a VQ-GAN framework in which latent representations of images are treated as a kind of text tokens, we present a novel method to fine-tune a pre-trained LLM to read and generate images like text without any structural changes, extra training objectives, or the need for training an ad-hoc network while still preserving the of the instruction-following capability of the LLM. We apply this framework to chest X-ray (CXR) image and report generation tasks as it is a domain in which translation of complex information between visual and language domains is important. The code is available at https://github.com/hyn2028/llm-cxr.

* 17 pages, 7 figures

Via

Access Paper or Ask Questions

Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion

Mar 15, 2023

Inhwa Han, Serin Yang, Taesung Kwon, Jong Chul Ye

Figure 1 for Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion

Figure 2 for Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion

Figure 3 for Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion

Figure 4 for Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion

Abstract:Diffusion models have shown superior performance in image generation and manipulation, but the inherent stochasticity presents challenges in preserving and manipulating image content and identity. While previous approaches like DreamBooth and Textual Inversion have proposed model or latent representation personalization to maintain the content, their reliance on multiple reference images and complex training limits their practicality. In this paper, we present a simple yet highly effective approach to personalization using highly personalized (HiPer) text embedding by decomposing the CLIP embedding space for personalization and content manipulation. Our method does not require model fine-tuning or identifiers, yet still enables manipulation of background, texture, and motion with just a single image and target text. Through experiments on diverse target texts, we demonstrate that our approach produces highly personalized and complex semantic image edits across a wide range of tasks. We believe that the novel understanding of the text embedding space presented in this work has the potential to inspire further research across various tasks.

Via

Access Paper or Ask Questions

Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer

Mar 15, 2023

Serin Yang, Hyunmin Hwang, Jong Chul Ye

Figure 1 for Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer

Figure 2 for Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer

Figure 3 for Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer

Figure 4 for Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer

Abstract:Diffusion models have shown great promise in text-guided image style transfer, but there is a trade-off between style transformation and content preservation due to their stochastic nature. Existing methods require computationally expensive fine-tuning of diffusion models or additional neural network. To address this, here we propose a zero-shot contrastive loss for diffusion models that doesn't require additional fine-tuning or auxiliary networks. By leveraging patch-wise contrastive loss between generated samples and original image embeddings in the pre-trained diffusion model, our method can generate images with the same semantic content as the source image in a zero-shot manner. Our approach outperforms existing methods while preserving content and requiring no additional training, not only for image style transfer but also for image-to-image translation and manipulation. Our experimental results validate the effectiveness of our proposed method.

Via

Access Paper or Ask Questions

Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models

Mar 15, 2023

Suhyeon Lee, Hyungjin Chung, Minyoung Park, Jonghyuk Park, Wi-Sun Ryu, Jong Chul Ye

Figure 1 for Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models

Figure 2 for Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models

Figure 3 for Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models

Figure 4 for Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models

Abstract:Diffusion models have become a popular approach for image generation and reconstruction due to their numerous advantages. However, most diffusion-based inverse problem-solving methods only deal with 2D images, and even recently published 3D methods do not fully exploit the 3D distribution prior. To address this, we propose a novel approach using two perpendicular pre-trained 2D diffusion models to solve the 3D inverse problem. By modeling the 3D data distribution as a product of 2D distributions sliced in different directions, our method effectively addresses the curse of dimensionality. Our experimental results demonstrate that our method is highly effective for 3D medical image reconstruction tasks, including MRI Z-axis super-resolution, compressed sensing MRI, and sparse-view CT. Our method can generate high-quality voxel volumes suitable for medical applications.

* 14 pages

Via

Access Paper or Ask Questions