Abstract:The fundamental goal of Information Retrieval (IR) systems lies in their capacity to effectively satisfy human information needs - a challenge that encompasses not just the technical delivery of information, but the nuanced understanding of human cognition during information seeking. Contemporary IR platforms rely primarily on observable interaction signals, creating a fundamental gap between system capabilities and users' cognitive processes. Brain-Machine Interface (BMI) technologies now offer unprecedented potential to bridge this gap through direct measurement of previously inaccessible aspects of information-seeking behaviour. This perspective paper offers a broad examination of the IR landscape, providing a comprehensive analysis of how BMI technology could transform IR systems, drawing from advances at the intersection of both neuroscience and IR research. We present our analysis through three identified fundamental vertices: (1) understanding the neural correlates of core IR concepts to advance theoretical models of search behaviour, (2) enhancing existing IR systems through contextual integration of neurophysiological signals, and (3) developing proactive IR capabilities through direct neurophysiological measurement. For each vertex, we identify specific research opportunities and propose concrete directions for developing BMI-enhanced IR systems. We conclude by examining critical technical and ethical challenges in implementing these advances, providing a structured roadmap for future research at the intersection of neuroscience and IR.
Abstract:This research examines the congruence between neural activity and advanced transformer models, emphasizing the semantic significance of punctuation in text understanding. Utilizing an innovative approach originally proposed by Toneva and Wehbe, we evaluate four advanced transformer models RoBERTa, DistiliBERT, ALBERT, and ELECTRA against neural activity data. Our findings indicate that RoBERTa exhibits the closest alignment with neural activity, surpassing BERT in accuracy. Furthermore, we investigate the impact of punctuation removal on model performance and neural alignment, revealing that BERT's accuracy enhances in the absence of punctuation. This study contributes to the comprehension of how neural networks represent language and the influence of punctuation on semantic processing within the human brain.
Abstract:The rapid advancement of Artificial Intelligence has resulted in the advent of Large Language Models (LLMs) with the capacity to produce text that closely resembles human communication. These models have been seamlessly integrated into diverse applications, enabling interactive and responsive communication across multiple platforms. The potential utility of chatbots transcends these traditional applications, particularly in research contexts, wherein they can offer valuable insights and facilitate the design of innovative experiments. In this study, we present a Customizable LLM-Powered Chatbot (CLPC), a web-based chatbot system designed to assist in behavioral science research. The system is meticulously designed to function as an experimental instrument rather than a conventional chatbot, necessitating users to input a username and experiment code upon access. This setup facilitates precise data cross-referencing, thereby augmenting the integrity and applicability of the data collected for research purposes. It can be easily expanded to accommodate new basic events as needed; and it allows researchers to integrate their own logging events without the necessity of implementing a separate logging mechanism. It is worth noting that our system was built to assist primarily behavioral science research but is not limited to it, it can easily be adapted to assist information retrieval research or interacting with chat bot agents in general.
Abstract:Brain decoding has emerged as a rapidly advancing and extensively utilized technique within neuroscience. This paper centers on the application of raw electroencephalogram (EEG) signals for decoding human brain activity, offering a more expedited and efficient methodology for enhancing our understanding of the human brain. The investigation specifically scrutinizes the efficacy of brain-computer interfaces (BCI) in deciphering neural signals associated with speech production, with particular emphasis on the impact of vocabulary size, electrode density, and training data on the framework's performance. The study reveals the competitive word error rates (WERs) achievable on the Librispeech benchmark through pre-training on unlabelled data for speech processing. Furthermore, the study evaluates the efficacy of voice recognition under configurations with limited labeled data, surpassing previous state-of-the-art techniques while utilizing significantly fewer labels. Additionally, the research provides a comprehensive analysis of error patterns in voice recognition and the influence of model size and unlabelled training data. It underscores the significance of factors such as vocabulary size and electrode density in enhancing BCI performance, advocating for an increase in microelectrodes and refinement of language models.
Abstract:The rapid advancement of artificial intelligence has resulted in the advent of large language models (LLMs) with the capacity to produce text that closely resembles human communication. These models have been seamlessly integrated into diverse applications, enabling interactive and responsive communication across multiple platforms. The potential utility of chatbots transcends these traditional applications, particularly in research contexts, wherein they can offer valuable insights and facilitate the design of innovative experiments. In this study, we present NSChat, a web-based chatbot system designed to assist in neuroscience research. The system is meticulously designed to function as an experimental instrument rather than a conventional chatbot, necessitating users to input a username and experiment code upon access. This setup facilitates precise data cross-referencing, thereby augmenting the integrity and applicability of the data collected for research purposes. It can be easily expanded to accommodate new basic events as needed; and it allows researchers to integrate their own logging events without the necessity of implementing a separate logging mechanism. It is worth noting that our system was built to assist primarily neuroscience research but is not limited to it, it can easily be adapted to assist information retrieval research or interacting with chat bot agents in general.
Abstract:Information retrieval systems have historically relied on explicit query formulation, requiring users to translate their information needs into text. This process is particularly disruptive during reading tasks, where users must interrupt their natural flow to formulate queries. We present DEEPER (Dense Electroencephalography Passage Retrieval), a novel framework that enables direct retrieval of relevant passages from users' neural signals during naturalistic reading without intermediate text translation. Building on dense retrieval architectures, DEEPER employs a dual-encoder approach with specialised components for processing neural data, mapping EEG signals and text passages into a shared semantic space. Through careful architecture design and cross-modal negative sampling strategies, our model learns to align neural patterns with their corresponding textual content. Experimental results on the ZuCo dataset demonstrate that direct brain-to-passage retrieval significantly outperforms current EEG-to-text baselines, achieving a 571% improvement in Precision@1. Our ablation studies reveal that the model successfully learns aligned representations between EEG and text modalities (0.29 cosine similarity), while our hard negative sampling strategy contributes to overall performance increases.
Abstract:Auditing Large Language Models (LLMs) to discover their biases and preferences is an emerging challenge in creating Responsible Artificial Intelligence (AI). While various methods have been proposed to elicit the preferences of such models, countermeasures have been taken by LLM trainers, such that LLMs hide, obfuscate or point blank refuse to disclosure their positions on certain subjects. This paper presents PRISM, a flexible, inquiry-based methodology for auditing LLMs - that seeks to illicit such positions indirectly through task-based inquiry prompting rather than direct inquiry of said preferences. To demonstrate the utility of the methodology, we applied PRISM on the Political Compass Test, where we assessed the political leanings of twenty-one LLMs from seven providers. We show LLMs, by default, espouse positions that are economically left and socially liberal (consistent with prior work). We also show the space of positions that these models are willing to espouse - where some models are more constrained and less compliant than others - while others are more neutral and objective. In sum, PRISM can more reliably probe and audit LLMs to understand their preferences, biases and constraints.
Abstract:In recent years, much interdisciplinary research has been conducted exploring potential use cases of neuroscience to advance the field of information retrieval. Initial research concentrated on the use of fMRI data, but fMRI was deemed to be not suitable for real-world applications, and soon, research shifted towards using EEG data. In this paper, we try to improve the original performance of a first attempt at generating text using EEG by focusing on the less explored area of optimising neural network performance. We test a set of different activation functions and compare their performance. Our results show that introducing a higher degree polynomial activation function can enhance model performance without changing the model architecture. We also show that the learnable 3rd-degree activation function performs better on the 1-gram evaluation compared to a 3rd-degree non-learnable function. However, when evaluating the model on 2-grams and above, the polynomial function lacks in performance, whilst the leaky ReLU activation function outperforms the baseline.
Abstract:Addressing the challenge of automated geometry math problem-solving in artificial intelligence (AI) involves understanding multi-modal information and mathematics. Current methods struggle with accurately interpreting geometry diagrams, which hinders effective problem-solving. To tackle this issue, we present the Geometry problem sOlver with natural Language Description (GOLD) model. GOLD enhances the extraction of geometric relations by separately processing symbols and geometric primitives within the diagram. Subsequently, it converts the extracted relations into natural language descriptions, efficiently utilizing large language models to solve geometry math problems. Experiments show that the GOLD model outperforms the Geoformer model, the previous best method on the UniGeo dataset, by achieving accuracy improvements of 12.7% and 42.1% in calculation and proving subsets. Additionally, it surpasses the former best model on the PGPS9K and Geometry3K datasets, PGPSNet, by obtaining accuracy enhancements of 1.8% and 3.2%, respectively.
Abstract:Measuring transient functional connectivity is an important challenge in Electroencephalogram (EEG) research. Here, the rich potential for insightful, discriminative information of brain activity offered by high temporal resolution is confounded by the inherent noise of the medium and the spurious nature of correlations computed over short temporal windows. We propose a novel methodology to overcome these problems called Filter Average Short-Term (FAST) functional connectivity. First, long-term, stable, functional connectivity is averaged across an entire study cohort for a given pair of Visual Short Term Memory (VSTM) tasks. The resulting average connectivity matrix, containing information on the strongest general connections for the tasks, is used as a filter to analyse the transient high temporal resolution functional connectivity of individual subjects. In simulations, we show that this method accurately discriminates differences in noisy Event-Related Potentials (ERPs) between two conditions where standard connectivity and other comparable methods fail. We then apply this to analyse activity related to visual short-term memory binding deficits in two cohorts of familial and sporadic Alzheimer's disease. Reproducible significant differences were found in the binding task with no significant difference in the shape task in the P300 ERP range. This allows new sensitive measurements of transient functional connectivity, which can be implemented to obtain results of clinical significance.