Shreyansh Modi

How Many Counterfactuals Does It Take? Probing VLM Hallucinations Through Circuits and Causal Effects

Jun 07, 2026

Abhivansh Gupta, Simardeep Singh, Advika Sinha, Shreyansh Modi, Akshat Tomar

Abstract: Visual Language Models (VLMs) are known to produce hallucinated predictions that are not grounded in visual evidence, yet existing approaches lack a principled understanding of how robust such predictions are under counterfactual perturbations. In this work, we study the sample complexity of counterfactual robustness for hallucinated outputs in VLMs. We define a causal influence metric based on log-probability differences between factual, counterfactual, and activation-patched runs, and use it to characterize the stability of hallucinated predictions. By leveraging circuit discovery techniques (CD-T), we identify model components responsible for these predictions and track their activation differences across counterfactual samples. We then derive empirical bounds on the minimum number of counterfactual samples m required to reliably detect instability in hallucinated outputs, using concentration inequalities and variance estimates of the causal influence distribution.
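The abstract does not say which concentration inequality is used, so the following is only an illustrative sketch of how a bound on m can arise. It assumes the causal influence values lie in an interval of known width and applies Hoeffding's inequality, under which the sample mean is within epsilon of the true mean with probability at least 1 - delta once m >= width^2 * ln(2/delta) / (2 * epsilon^2). The function name and parameters are hypothetical and not taken from the paper.

```python
import math


def hoeffding_min_samples(value_range: float, epsilon: float, delta: float) -> int:
    """Smallest m satisfying the two-sided Hoeffding bound.

    Assumption (not from the paper): each causal influence value lies in an
    interval of width `value_range`, and samples are independent.
    """
    if value_range <= 0 or epsilon <= 0 or not (0.0 < delta < 1.0):
        raise ValueError("need value_range > 0, epsilon > 0, and 0 < delta < 1")
    # m >= R^2 * ln(2 / delta) / (2 * epsilon^2)
    bound = (value_range ** 2) * math.log(2.0 / delta) / (2.0 * epsilon ** 2)
    return math.ceil(bound)


if __name__ == "__main__":
    # Influence values in a unit-width range, tolerance 0.1, failure rate 5%.
    print(hoeffding_min_samples(1.0, 0.1, 0.05))
```

A variance-aware inequality such as Bernstein's would typically give a smaller m when the influence distribution has low variance, which is consistent with the abstract's mention of variance estimates.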

Guidance for Low-Level Perceptual Editing in Unconditional Diffusion Models

May 29, 2026

Shreyansh Modi, Akshat Tomar, Aarush Aggarwal

Abstract: Unconditional diffusion models offer powerful generative priors, yet steering them toward aesthetically enhanced outputs remains largely unexplored. We show that h-space patching, the dominant paradigm for training-free diffusion editing, systematically fails for the global, low-level transformations required for aesthetic and perceptual refinement. We introduce a novel, generalized framework for image editing in unconditional diffusion models without explicit training. This inference-time mechanism operates on low-level features by extracting degradation concept vectors and combining bottleneck patching with classifier-free guidance to guide sampling away from the degraded manifold, producing consistently improved images without any model retraining.

* 11 pages, 12 figures, Generative Models for Computer Vision Workshop CVPR 2026
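The abstract above does not give the guidance formula, so the sketch below only illustrates the general classifier-free guidance pattern of extrapolating between two noise predictions. It assumes, hypothetically, that one prediction comes from the unmodified model and the other from a run patched with a degradation concept vector, and that the degraded prediction plays the role of the branch being steered away from. The function name `guide_away` and its arguments are illustrative, not the authors' API.

```python
from typing import List


def guide_away(eps_base: List[float], eps_degraded: List[float], scale: float) -> List[float]:
    """Extrapolate from the degraded prediction through the base prediction.

    Classifier-free-guidance-style combination:
        eps = eps_degraded + scale * (eps_base - eps_degraded)
    scale = 1 returns the base prediction; scale > 1 pushes further away
    from the degraded one. Both inputs are flat lists of equal length.
    """
    if len(eps_base) != len(eps_degraded):
        raise ValueError("predictions must have the same length")
    return [d + scale * (b - d) for b, d in zip(eps_base, eps_degraded)]


if __name__ == "__main__":
    base = [1.0, 2.0]      # hypothetical noise prediction, unmodified run
    degraded = [0.5, 2.5]  # hypothetical noise prediction, degradation-patched run
    print(guide_away(base, degraded, 2.0))
```

In a real sampler the two predictions would be tensors produced at each denoising step, and the guidance scale would be tuned empirically.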
