Kfir Goldberg

Piece it Together: Part-Based Concepting with IP-Priors

Mar 13, 2025

Elad Richardson, Kfir Goldberg, Yuval Alaluf, Daniel Cohen-Or

Abstract: Advanced generative models excel at synthesizing images but often rely on text-based conditioning. Visual designers, however, often work beyond language, directly drawing inspiration from existing visual elements. In many cases, these elements represent only fragments of a potential concept, such as a uniquely structured wing or a specific hairstyle, serving as inspiration for the artist to explore how they can come together creatively into a coherent whole. Recognizing this need, we introduce a generative framework that seamlessly integrates a partial set of user-provided visual components into a coherent composition while simultaneously sampling the missing parts needed to generate a plausible and complete concept. Our approach builds on a strong and underexplored representation space, extracted from IP-Adapter+, on which we train IP-Prior, a lightweight flow-matching model that synthesizes coherent compositions based on domain-specific priors, enabling diverse and context-aware generations. Additionally, we present a LoRA-based fine-tuning strategy that significantly improves prompt adherence in IP-Adapter+ for a given task, addressing its common trade-off between reconstruction quality and prompt adherence.

* Project page available at https://eladrich.github.io/PiT/
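
The abstract above mentions a flow-matching model but does not spell out its formulation. As a rough illustration only, the following sketch shows a generic flow-matching objective with a straight-line path between a noise sample and a data sample. The linear path, the function names, and the `predict` callable are all assumptions made for this example; it omits the conditioning on user-provided parts and is not the authors' IP-Prior implementation.

```python
def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight-line path; constant in t (assumed path)."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predict, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t.

    `predict(x_t, t)` is a hypothetical stand-in for a learned network.
    """
    x_t = interpolate(x0, x1, t)
    v_pred = predict(x_t, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)


def euler_sample(predict, x0, steps=10):
    """Integrate dx/dt = predict(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = predict(x, i * dt)
        x = [a + dt * b for a, b in zip(x, v)]
    return x


# Toy usage: an "oracle" velocity field that already knows the target.
noise = [0.0, 0.0]
data = [1.0, 2.0]
oracle = lambda x, t: target_velocity(noise, data)
loss = flow_matching_loss(oracle, noise, data, t=0.3)
sample = euler_sample(oracle, noise)
```

In practice `predict` would be a trained network and the loss would be averaged over random draws of noise, data, and `t`; the oracle here only demonstrates that following the target velocity carries the noise sample to the data sample.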

ConceptLab: Creative Generation using Diffusion Prior Constraints

Aug 03, 2023

Elad Richardson, Kfir Goldberg, Yuval Alaluf, Daniel Cohen-Or

Abstract: Recent text-to-image generative models have enabled us to transform our words into vibrant, captivating imagery. The surge of personalization techniques that has followed has also allowed us to imagine unique concepts in new scenes. However, an intriguing question remains: How can we generate a new, imaginary concept that has never been seen before? In this paper, we present the task of creative text-to-image generation, where we seek to generate new members of a broad category (e.g., generating a pet that differs from all existing pets). We leverage the under-studied Diffusion Prior models and show that the creative generation problem can be formulated as an optimization process over the output space of the diffusion prior, resulting in a set of "prior constraints". To keep our generated concept from converging into existing members, we incorporate a question-answering model that adaptively adds new constraints to the optimization problem, encouraging the model to discover increasingly unique creations. Finally, we show that our prior constraints can also serve as a strong mixing mechanism, allowing us to create hybrids between generated concepts and introducing even more flexibility into the creative process.

* Project page: https://kfirgoldberg.github.io/ConceptLab/

Rethinking FUN: Frequency-Domain Utilization Networks

Dec 06, 2020

Kfir Goldberg, Stav Shapiro, Elad Richardson, Shai Avidan

Abstract: The search for efficient neural network architectures has attracted much attention in recent years, with modern architectures targeting not only accuracy but also inference time and model size. Here, we present FUN, a family of novel Frequency-domain Utilization Networks. These networks utilize the inherent efficiency of the frequency domain by working directly in that domain, represented with the Discrete Cosine Transform. Using modern techniques and building blocks such as compound scaling and inverted-residual layers, we generate a set of such networks, allowing one to balance between size, latency and accuracy while outperforming competing RGB-based models. Extensive evaluations verify that our networks present strong alternatives to previous approaches. Moreover, we show that working in the frequency domain allows for dynamic compression of the input at inference time without any explicit change to the architecture.

* 9 pages, 7 figures
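
To make the frequency-domain representation mentioned in the abstract concrete, here is a minimal sketch of a 2D Discrete Cosine Transform on a single image block, followed by a simple form of compression that zeroes high-frequency coefficients. The 8x8 block size, the orthonormal scaling, and the `keep_low_frequencies` helper are assumptions chosen for illustration; the abstract does not specify how the networks consume DCT coefficients or how the dynamic compression is performed.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so that coeffs = C * x."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """2D DCT of a square block: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2D DCT: C^T * coeffs * C (C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def keep_low_frequencies(coeffs, keep):
    """Zero every coefficient outside the top-left keep x keep corner."""
    return [[v if (u < keep and w < keep) else 0.0
             for w, v in enumerate(row)] for u, row in enumerate(coeffs)]


# Toy usage: a smooth 8x8 gradient block (block size is an assumption).
N = 8
block = [[float(i + j) for j in range(N)] for i in range(N)]
coeffs = dct2(block)
compressed = keep_low_frequencies(coeffs, keep=4)
approx = idct2(compressed)
max_error = max(abs(approx[i][j] - block[i][j])
                for i in range(N) for j in range(N))
```

Because smooth image content concentrates its energy in the low-frequency corner, discarding the remaining coefficients shrinks the input while changing the reconstructed block only modestly, which is one plausible reading of why the input can be compressed without altering the architecture.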
