Abstract:Painting embodies a unique form of visual storytelling, where the creation process is as significant as the final artwork. Although recent advances in generative models have enabled visually compelling painting synthesis, most existing methods focus solely on final image generation or patch-based process simulation, lacking explicit stroke structure and failing to produce smooth, realistic shading. In this work, we present a differentiable stroke reconstruction framework that unifies painting, stylized texturing, and smudging to faithfully reproduce the human painting-smudging loop. Given an input image, our framework first optimizes single- and dual-color Bezier strokes through a parallel differentiable paint renderer, followed by a style generation module that synthesizes geometry-conditioned textures across diverse painting styles. We further introduce a differentiable smudge operator to enable natural color blending and shading. Coupled with a coarse-to-fine optimization strategy, our method jointly optimizes stroke geometry, color, and texture under geometric and semantic guidance. Extensive experiments on oil, watercolor, ink, and digital paintings demonstrate that our approach produces realistic and expressive stroke reconstructions, smooth tonal transitions, and richly stylized appearances, offering a unified model for expressive digital painting creation. See our project page for more demos: https://yingjiang96.github.io/DiffPaintWebsite/.




Abstract:Decoupling the illumination in 3D scenes is crucial for novel view synthesis and relighting. In this paper, we propose a novel method for representing a scene illuminated by a point light using a set of relightable 3D Gaussian points. Inspired by the Blinn-Phong model, our approach decomposes the scene into ambient, diffuse, and specular components, enabling the synthesis of realistic lighting effects. To facilitate the decomposition of geometric information independent of lighting conditions, we introduce a novel bilevel optimization-based meta-learning framework. The fundamental idea is to view the rendering tasks under various lighting positions as a multi-task learning problem, which our meta-learning approach effectively addresses by generalizing the learned Gaussian geometries not only across different viewpoints but also across diverse light positions. Experimental results demonstrate the effectiveness of our approach in terms of training efficiency and rendering quality compared to existing methods for free-viewpoint relighting.