Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Nick Oh

Monitor-Generate-Verify (MGV): Formalising Metacognitive Theory for Language Model Reasoning

Nov 07, 2025

Nick Oh, Fernand Gobet

Abstract:Test-time reasoning architectures such as those following the Generate-Verify paradigm -- where a model iteratively refines or verifies its own generated outputs -- prioritise generation and verification but exclude the monitoring processes that determine when and how reasoning should begin. This omission may contribute to the prefix dominance trap, in which models commit early to suboptimal reasoning paths and seldom recover, yielding roughly 20% accuracy loss. We address this architectural gap by formalising Flavell's and Nelson and Narens' metacognitive theories into computational specifications, proposing the Monitor-Generate-Verify (MGV) framework. MGV extends the Generate-Verify paradigm by adding explicit monitoring that captures metacognitive experiences (from difficulty assessments to confidence judgements) before generation begins and refines future monitoring through verification feedback. Though we present no empirical validation, this work provides the first systematic computational translation of foundational metacognitive theories, offering a principled vocabulary for understanding reasoning system failures and suggesting specific architectural interventions for future test-time reasoning designs.

* To-be presented at the Workshop on the Foundations of Reasoning in Language Models at NeurIPS 2025 (non-archival)

Via

Access Paper or Ask Questions

In Defence of Post-hoc Explainability

Dec 23, 2024

Nick Oh

Abstract:The widespread adoption of machine learning in scientific research has created a fundamental tension between model opacity and scientific understanding. Whilst some advocate for intrinsically interpretable models, we introduce Computational Interpretabilism (CI) as a philosophical framework for post-hoc interpretability in scientific AI. Drawing parallels with human expertise, where post-hoc rationalisation coexists with reliable performance, CI establishes that scientific knowledge emerges through structured model interpretation when properly bounded by empirical validation. Through mediated understanding and bounded factivity, we demonstrate how post-hoc methods achieve epistemically justified insights without requiring complete mechanical transparency, resolving tensions between model complexity and scientific comprehension.

* Presented at the Interpretable AI: Past, Present, and Future Workshop at NeurIPS 2024 (non-archival)

Via

Access Paper or Ask Questions