Abstract: A key issue in cognitive science concerns the fundamental psychological processes that underlie the formation and retrieval of multiple types of concepts in short-term and long-term memory (STM and LTM, respectively). We propose that chunking mechanisms play an essential role and show how the CogAct computational model grounds concept learning in fundamental cognitive processes and structures (such as chunking, attention, STM and LTM). First, we present in-principle demonstrations in which CogAct automatically adapts to learn a range of categories, from simple logical functions to artificial categories to natural raw (as opposed to natural pre-processed) concepts in the dissimilar domains of literature, chess and music. This kind of adaptive learning is difficult for most other psychological models: cognitive models stop at modelling artificial categories, and (non-GPT) models based on deep learning require task-specific changes to the architecture. Second, we offer novel ways of designing human benchmarks for concept-learning experiments and simulations that account for subjectivity and control for individual human experiences, all while keeping to real-life complex categories. We ground CogAct in simulations of the subjective conceptual spaces of individual human participants, capturing humans' subjective judgements in music, with the models learning from raw music-score data without bootstrapping to pre-built knowledge structures. The CogAct simulations are compared to those obtained by a deep-learning model. These findings integrate concept learning and adaptation to complexity into broader theories of cognitive psychology. Our approach may also be used in psychological applications that move away from modelling the average participant and towards capturing subjective concept spaces.
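The abstract refers to chunking, attention, STM and LTM only at a conceptual level. The following is a deliberately generic, hypothetical sketch of chunk-based category learning with a capacity-limited STM and a chunk store in LTM, intended only to illustrate the kind of mechanism meant; it is not the CogAct model, and all names (ChunkingLearner, STM_CAPACITY) are illustrative assumptions.

```python
# Hypothetical sketch of chunk-based concept learning with a capacity-limited STM
# and a growing LTM of chunks. Not the CogAct implementation; names are illustrative.
from __future__ import annotations

from collections import deque

STM_CAPACITY = 4  # assumed small short-term memory capacity


class ChunkingLearner:
    def __init__(self) -> None:
        self.stm: deque = deque(maxlen=STM_CAPACITY)  # capacity-limited STM
        self.ltm: dict[tuple, str] = {}               # chunk -> category label in LTM

    def learn(self, features: list, label: str) -> None:
        """Attend to features one at a time, then chunk the STM contents into LTM."""
        for f in features:
            self.stm.append(f)          # attention admits one feature; old items drop out
        chunk = tuple(self.stm)         # chunk the current STM contents
        self.ltm[chunk] = label         # store the chunk-label association in LTM

    def categorise(self, features: list) -> str | None:
        """Retrieve the label of the largest stored chunk contained in the input."""
        best_label, best_size = None, 0
        for chunk, label in self.ltm.items():
            if set(chunk) <= set(features) and len(chunk) > best_size:
                best_label, best_size = label, len(chunk)
        return best_label
```

Under these assumptions, categorisation falls back to whatever partial chunks have been acquired, which is one simple way a chunk-based learner can degrade gracefully on raw, unpre-processed input.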
Abstract: Test-time reasoning architectures such as those following the Generate-Verify paradigm -- where a model iteratively refines or verifies its own generated outputs -- prioritise generation and verification but exclude the monitoring processes that determine when and how reasoning should begin. This omission may contribute to the prefix dominance trap, in which models commit early to suboptimal reasoning paths and seldom recover, yielding an accuracy loss of roughly 20%. We address this architectural gap by formalising Flavell's and Nelson and Narens' metacognitive theories into computational specifications, proposing the Monitor-Generate-Verify (MGV) framework. MGV extends the Generate-Verify paradigm by adding explicit monitoring that captures metacognitive experiences (from difficulty assessments to confidence judgements) before generation begins and refines future monitoring through verification feedback. Though we present no empirical validation, this work provides the first systematic computational translation of foundational metacognitive theories, offering a principled vocabulary for understanding reasoning-system failures and suggesting specific architectural interventions for future test-time reasoning designs.
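The abstract describes MGV only at the architectural level. Below is a minimal, hypothetical sketch of how a Monitor-Generate-Verify loop might be wired together, assuming placeholder Monitor, generate and verify components; it is a sketch of the stated control flow (monitor before generation, verification feedback refining the monitor), not the authors' implementation.

```python
# Hypothetical sketch of a Monitor-Generate-Verify (MGV) loop.
# All names (Assessment, Monitor, mgv_step) are illustrative placeholders.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Assessment:
    """Metacognitive experience produced before generation begins."""
    difficulty: float  # estimated task difficulty in [0, 1]
    confidence: float  # prospective confidence in [0, 1]


class Monitor:
    """Produces pre-generation assessments and refines itself from verification feedback."""

    def __init__(self) -> None:
        self.bias = 0.0  # crude running correction learned from feedback

    def assess(self, task: str) -> Assessment:
        # Toy heuristic: longer tasks are judged harder. A real monitor would use
        # learned difficulty and confidence estimators.
        difficulty = min(1.0, len(task) / 500.0)
        confidence = min(1.0, max(0.0, 1.0 - difficulty + self.bias))
        return Assessment(difficulty, confidence)

    def update(self, assessment: Assessment, verified: bool) -> None:
        # Verification feedback refines future monitoring: verified successes nudge
        # confidence up, overconfident failures push it down.
        self.bias += 0.05 if verified else -0.05 * assessment.confidence


def mgv_step(task: str,
             monitor: Monitor,
             generate: Callable[[str, Assessment], str],
             verify: Callable[[str, str], bool],
             max_attempts: int = 3) -> str:
    """One MGV episode: monitor before generating, verify after, feed back."""
    answer = ""
    for _ in range(max_attempts):
        assessment = monitor.assess(task)      # Monitor: assess before generation begins
        answer = generate(task, assessment)    # Generate: conditioned on the assessment
        verified = verify(task, answer)        # Verify: check the generated output
        monitor.update(assessment, verified)   # feedback refines future monitoring
        if verified:
            break
    return answer


if __name__ == "__main__":
    # Stub generator and verifier, purely to exercise the loop.
    gen = lambda task, a: f"answer to {task!r} (conf={a.confidence:.2f})"
    ver = lambda task, ans: "conf=" in ans
    print(mgv_step("2 + 2 = ?", Monitor(), gen, ver))
```

The design point the sketch tries to make concrete is that the assessment is produced before generation and is itself updated by the verifier's verdict, which is the addition MGV makes over a plain generate-then-verify loop.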