Abstract: What is it about human brains that allows us to reason symbolically whereas most other animals cannot? There is evidence that dynamic binding, the ability to combine neurons into groups on the fly, is necessary for symbolic thought, but there is also evidence that it is not sufficient. We propose that two kinds of hierarchical integration (integration of multiple role-bindings into multi-place predicates, and integration of multiple correspondences into structure mappings) are minimal requirements, on top of basic dynamic binding, to realize symbolic thought. We tested this hypothesis in a systematic collection of 17 simulations that explored the ability of cognitive architectures with and without the capacity for multi-place predicates and structure mapping to perform various kinds of tasks. The simulations were as generic as possible, in that no task could be performed on the basis of diagnostic features; performance depended instead on the capacity for multi-place predicates and structure mapping. The results are consistent with the hypothesis that, along with dynamic binding, multi-place predicates and structure mapping are minimal requirements for basic symbolic thought. These results inform our understanding of how human brains give rise to symbolic thought and speak to the differences between biological intelligence, which tends to generalize broadly from very few training examples, and modern approaches to machine learning, which typically require millions or billions of training examples. The results we report also have important implications for bio-inspired artificial intelligence.
Abstract: Multiple benchmarks have been developed to assess the alignment between deep neural networks (DNNs) and human vision. In almost all cases these benchmarks are observational, in the sense that they are composed of behavioural and brain responses to naturalistic images that have not been manipulated to test hypotheses regarding how DNNs or humans perceive and identify objects. Here we introduce the toolbox MindSet: Vision, consisting of a collection of image datasets and related scripts designed to test DNNs on 30 psychological findings. In all experimental conditions, the stimuli are systematically manipulated to test specific hypotheses regarding human visual perception and object recognition. In addition to providing pre-generated datasets of images, we provide code to regenerate these datasets, offering many configurable parameters that greatly extend the datasets' versatility for different research contexts, and code to facilitate the testing of DNNs on these image datasets using three different methods (similarity judgments, out-of-distribution classification, and a decoder method), accessible at https://github.com/MindSetVision/mindset-vision. We test ResNet-152 with each of these methods as an example of how the toolbox can be used.
Abstract: This dissertation aims to understand what the various phenomena in visual search teach us about the nature of human visual representations and processes. I first review some of the major empirical findings in the study of visual search. I next present a theory of visual search in terms of what I believe these findings suggest about the representations and processes underlying ventral visual processing. These principles are instantiated in a computational model called CASPER (Concurrent Attention: Serial and Parallel Evaluation with Relations), originally developed by Hummel, that I have adapted to account for a range of phenomena in visual search. I then describe an extension of the CASPER model to account for our ability to search for visual items defined not simply by the features composing those items but by the spatial relations among those features. Seven experiments (four main experiments and three replications) are described that test CASPER's predictions about relational search. Finally, I evaluate the fit between CASPER's predictions and the empirical findings and show with three additional simulations that CASPER can account for negative acceleration in search functions for relational stimuli if one postulates that the visual system is leveraging an emergent feature that bypasses relational processing.