Mikhail Tamm

Hessian Geometry of Latent Space in Generative Models

Jun 12, 2025

Alexander Lobashev, Dmitry Guskov, Maria Larchenko, Mikhail Tamm

Abstract: This paper presents a novel method for analyzing the latent space geometry of generative models, including statistical physics models and diffusion models, by reconstructing the Fisher information metric. The method approximates the posterior distribution of latent variables given generated samples and uses this to learn the log-partition function, which defines the Fisher metric for exponential families. Theoretical convergence guarantees are provided, and the method is validated on the Ising and TASEP models, outperforming existing baselines in reconstructing thermodynamic quantities. Applied to diffusion models, the method reveals a fractal structure of phase transitions in the latent space, characterized by abrupt changes in the Fisher metric. We demonstrate that while geodesic interpolations are approximately linear within individual phases, this linearity breaks down at phase boundaries, where the diffusion model exhibits a divergent Lipschitz constant with respect to the latent space. These findings provide new insights into the complex structure of diffusion model latent spaces and their connection to phenomena like phase transitions. Our source code is available at https://github.com/alobashev/hessian-geometry-of-diffusion-models.

* ICML 2025
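
The abstract above relies on the fact that, for an exponential family, the Fisher information metric is the Hessian of the log-partition function. The sketch below checks that identity numerically on a toy three-state categorical family. The family, the function names, and the finite-difference step are assumptions chosen for illustration; this is not the paper's method or its released code.

```python
import math

def log_partition(theta):
    # Toy family (assumed): three states, state 0 is the reference,
    # so A(theta) = log(1 + exp(theta_1) + exp(theta_2)).
    return math.log(1.0 + sum(math.exp(t) for t in theta))

def hessian_fd(f, theta, h=1e-4):
    # Central finite-difference estimate of the Hessian of f at theta.
    n = len(theta)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                t = list(theta)
                t[i] += di
                t[j] += dj
                return f(t)
            H[i][j] = (shifted(h, h) - shifted(h, -h)
                       - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)
    return H

def sufficient_stat_covariance(theta):
    # Covariance of the one-hot sufficient statistics for states 1 and 2:
    # Cov[i][j] = p_i * delta_ij - p_i * p_j, which is the Fisher metric
    # in natural coordinates.
    Z = 1.0 + sum(math.exp(t) for t in theta)
    p = [math.exp(t) / Z for t in theta]
    n = len(theta)
    return [[(p[i] if i == j else 0.0) - p[i] * p[j] for j in range(n)]
            for i in range(n)]

theta = [0.3, -1.2]  # arbitrary point in natural-parameter space
H = hessian_fd(log_partition, theta)
C = sufficient_stat_covariance(theta)
max_gap = max(abs(H[i][j] - C[i][j]) for i in range(2) for j in range(2))
print("largest entrywise difference:", max_gap)
```

The two matrices should agree up to finite-difference error, which is why learning the log-partition function is enough to recover the metric in this setting.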

fruit-SALAD: A Style Aligned Artwork Dataset to reveal similarity perception in image embeddings

Jun 03, 2024

Tillmann Ohm, Andres Karjus, Mikhail Tamm, Maximilian Schich

Abstract: The notion of visual similarity is essential for computer vision and for applications and studies that revolve around vector embeddings of images. However, the scarcity of benchmark datasets poses a significant hurdle in exploring how these models perceive similarity. Here we introduce Style Aligned Artwork Datasets (SALADs), and an example, fruit-SALAD, with 10,000 images of fruit depictions. This combined semantic category and style benchmark comprises 100 instances each of 10 easy-to-recognize fruit categories, across 10 easily distinguishable styles. Leveraging a systematic pipeline of generative image synthesis, this visually diverse yet balanced benchmark demonstrates salient differences in semantic category and style similarity weights across various computational models, including machine learning models, feature extraction algorithms, and complexity measures, as well as conceptual models for reference. This meticulously designed dataset offers a controlled and balanced platform for the comparative analysis of similarity perception. The SALAD framework allows comparison of how these models perform semantic category and style recognition tasks, moving beyond anecdotal knowledge and making the comparison robustly quantifiable and qualitatively interpretable.
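
To make the idea of category versus style similarity weights concrete, here is a minimal sketch of one way such a comparison could be set up: average the cosine similarity over pairs that share only a category and over pairs that share only a style. The labels, the four-dimensional toy embeddings, and the function names are all made up for illustration; the abstract does not specify the authors' actual procedure, and this should not be read as it.

```python
import math
from itertools import combinations

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_similarity_by_pair_type(items):
    # items: list of (category, style, embedding) tuples.
    # Pairs sharing both labels are skipped; the rest are grouped by
    # which single label (if any) they share.
    totals = {"same_category": [0.0, 0], "same_style": [0.0, 0], "neither": [0.0, 0]}
    for (c1, s1, e1), (c2, s2, e2) in combinations(items, 2):
        if c1 == c2 and s1 != s2:
            key = "same_category"
        elif s1 == s2 and c1 != c2:
            key = "same_style"
        elif c1 != c2 and s1 != s2:
            key = "neither"
        else:
            continue
        totals[key][0] += cosine(e1, e2)
        totals[key][1] += 1
    return {k: total / count for k, (total, count) in totals.items() if count}

# Hypothetical toy embeddings: dimensions are [cat A, cat B, style X, style Y],
# with style weighted three times as strongly as category.
toy_items = [
    ("A", "X", [1.0, 0.0, 3.0, 0.0]),
    ("A", "Y", [1.0, 0.0, 0.0, 3.0]),
    ("B", "X", [0.0, 1.0, 3.0, 0.0]),
    ("B", "Y", [0.0, 1.0, 0.0, 3.0]),
]
result = mean_similarity_by_pair_type(toy_items)
print(result)
```

In this toy setup the same-style mean exceeds the same-category mean, which is what a style-dominated embedding would look like. A balanced grid of categories and styles, as in the benchmark described above, keeps the two pair groups the same size so the means are directly comparable.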
