



Abstract: In high-stakes domains, small task-specific vision models are crucial due to their low computational requirements and the availability of numerous methods to explain their results. However, these explanations often reveal that the models do not align well with human domain knowledge and instead rely on spurious correlations. This can result in brittle behavior once they are deployed in the real world. To address this issue, we introduce a novel and efficient method for aligning small task-specific vision models with human domain knowledge by leveraging the generalization capabilities of a Large Vision Language Model (LVLM). Our LVLM-Aided Visual Alignment (LVLM-VA) method provides a bidirectional interface that translates model behavior into natural language and maps human class-level specifications to image-level critiques, enabling effective interaction between domain experts and the model. Our method demonstrates substantial improvement in aligning model behavior with human specifications, as validated on both synthetic and real-world datasets. We show that it effectively reduces the model's dependence on spurious features and group-specific biases, without requiring fine-grained feedback.
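
The bidirectional interface described in this abstract lends itself to a simple control loop. The sketch below is purely illustrative and is not the paper's implementation: `query_lvlm`, the prompt wording, and the `Critique` structure are hypothetical placeholders, and the LVLM call is stubbed with a canned answer so the flow runs end to end.

```python
from dataclasses import dataclass

@dataclass
class Critique:
    image_id: str
    violates_spec: bool
    reason: str

def query_lvlm(prompt: str) -> str:
    # Hypothetical stand-in for an LVLM call (an API or a local checkpoint);
    # stubbed here so the example is runnable without any model.
    return ("Yes - the behavior focuses on the background rather than the "
            "structure named in the specification.")

def behavior_to_language(image_id: str, explanation_summary: str) -> str:
    # Direction 1: translate the small model's behavior (e.g. a summary of its
    # saliency map) into natural language that an expert or LVLM can reason about.
    return f"For image {image_id}, the classifier mostly attends to: {explanation_summary}."

def spec_to_critique(image_id: str, behavior_text: str, class_spec: str) -> Critique:
    # Direction 2: map the expert's class-level specification to an image-level critique.
    answer = query_lvlm(
        f"Class-level specification: {class_spec}\n"
        f"Observed model behavior: {behavior_text}\n"
        "Does this behavior violate the specification? Answer yes or no, then explain."
    )
    return Critique(image_id, violates_spec=answer.lower().startswith("yes"), reason=answer)

# Image-level critiques like this one could then be used to reweight or penalize
# flagged samples when the small task-specific model is retrained.
spec = "Decisions should rely on the lesion itself, not on rulers or skin markings."
behavior = behavior_to_language("img_042", "a ruler at the image border")
print(spec_to_critique("img_042", behavior, spec))
```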
Abstract: Reliable uncertainty calibration is essential for safely deploying deep neural networks in high-stakes applications. Deep neural networks are known to exhibit systematic overconfidence, especially under distribution shifts. Although foundation models such as ConvNeXt, EVA, and BEiT have demonstrated significant improvements in predictive performance, their calibration properties remain underexplored. This paper presents a comprehensive investigation into the calibration behavior of foundation models, revealing insights that challenge established paradigms. Our empirical analysis shows that these models tend to be underconfident on in-distribution data, resulting in higher calibration errors, while demonstrating improved calibration under distribution shifts. Furthermore, we demonstrate that foundation models are highly responsive to post-hoc calibration techniques in the in-distribution setting, enabling practitioners to effectively mitigate the underconfidence bias. However, these methods become progressively less reliable under severe distribution shifts and can occasionally produce counterproductive results. Our findings highlight the complex, non-monotonic effects of architectural and training innovations on calibration, challenging the narrative of continuous improvement.
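
Since this abstract leans on calibration error and post-hoc calibration, a concrete reference point may help. The sketch below shows the standard expected calibration error (ECE) and a grid-search variant of temperature scaling on synthetic logits; it is a minimal illustration of these generic techniques, not the paper's evaluation protocol, and the bin count and temperature grid are arbitrary choices.

```python
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def expected_calibration_error(probs, labels, n_bins=15):
    # Bin predictions by confidence and average the |accuracy - confidence| gap,
    # weighted by the fraction of samples falling into each bin.
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece

def fit_temperature(logits, labels, grid=np.linspace(0.5, 3.0, 251)):
    # Post-hoc temperature scaling: pick the single scalar T that minimizes the
    # negative log-likelihood on held-out logits. Underconfident models get T < 1.
    nll = [-np.log(softmax(logits, T)[np.arange(len(labels)), labels] + 1e-12).mean()
           for T in grid]
    return grid[int(np.argmin(nll))]

# Usage with synthetic logits standing in for a model's validation outputs.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=2000)
logits = rng.normal(size=(2000, 10))
logits[np.arange(2000), labels] += 2.0
T = fit_temperature(logits, labels)
print("ECE before:", expected_calibration_error(softmax(logits), labels))
print("ECE after :", expected_calibration_error(softmax(logits, T), labels))
```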




Abstract: Shortcut learning, i.e., a model's reliance on undesired features not directly relevant to the task, is a major challenge that severely limits the applications of machine learning algorithms, particularly when they are deployed to assist in sensitive decisions, such as in medical diagnostics. In this work, we leverage recent advancements in machine learning to create an unsupervised framework that is capable of both detecting and mitigating shortcut learning in transformers. We validate our method on multiple datasets. Results demonstrate that our framework significantly improves both worst-group accuracy (accuracy on the group of samples most harmed by shortcuts) and average accuracy, while minimizing human annotation effort. Moreover, we demonstrate that the detected shortcuts are meaningful and informative to human experts, and that our framework is computationally efficient, allowing it to run on consumer hardware.
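
Worst-group accuracy, the metric reported above, is simply the accuracy of the worst-performing group. The snippet below is a generic, illustrative computation; it assumes group labels (e.g. class and shortcut-attribute combinations) are available at evaluation time, which is not a claim about how the framework itself obtains them.

```python
import numpy as np

def worst_group_accuracy(preds, labels, groups):
    # Accuracy per group, then take the minimum: a model that leans on a shortcut
    # can have high average accuracy yet fail badly on the shortcut-affected group.
    per_group = {int(g): float((preds[groups == g] == labels[groups == g]).mean())
                 for g in np.unique(groups)}
    return min(per_group.values()), per_group

# Toy example: group 1 (shortcut-affected samples) drags worst-group accuracy to 0.5
# even though average accuracy is 0.75.
preds  = np.array([0, 1, 1, 0, 1, 0, 0, 1])
labels = np.array([0, 1, 1, 0, 0, 1, 0, 1])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(worst_group_accuracy(preds, labels, groups))
```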