Abstract:This study explores the explainability capabilities of large language models (LLMs) when employed to autonomously generate machine learning (ML) solutions. We examine two classification tasks: (i) a binary classification problem focused on predicting driver alertness states, and (ii) a multilabel classification problem based on the yeast dataset. Three state-of-the-art LLMs (OpenAI GPT, Anthropic Claude, and DeepSeek) are prompted to design training pipelines for four common classifiers: Random Forest, XGBoost, Multilayer Perceptron, and Long Short-Term Memory networks. The generated models are evaluated in terms of predictive performance (recall, precision, and F1-score) and explainability using SHAP (SHapley Additive exPlanations). Specifically, we measure Average SHAP Fidelity (the mean squared error between SHAP approximations and model outputs) and Average SHAP Sparsity (the number of features deemed influential). The results show that LLMs can produce effective, interpretable pipelines with high fidelity and consistent sparsity, closely matching manually engineered baselines, which highlights their potential as automated tools for interpretable ML pipeline generation.
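The two explainability metrics named in this abstract can be illustrated with a minimal sketch. It assumes that the "SHAP approximation" of a prediction is the usual additive reconstruction (base value plus the sum of per-feature attributions) and that a feature counts as "influential" when its absolute attribution exceeds a cutoff. The cutoff of 0.01, the function names, and the attribution values are illustrative assumptions, not taken from the paper.

```python
from statistics import mean


def shap_fidelity(shap_values, base_value, model_outputs):
    """Mean squared error between the additive SHAP reconstruction
    (base value + sum of attributions) and the model's actual outputs."""
    squared_errors = []
    for attributions, output in zip(shap_values, model_outputs):
        approximation = base_value + sum(attributions)
        squared_errors.append((approximation - output) ** 2)
    return mean(squared_errors)


def shap_sparsity(shap_values, threshold=0.01):
    """Average number of features per sample whose absolute attribution
    exceeds `threshold` (an assumed cutoff for 'influential')."""
    counts = [
        sum(1 for value in attributions if abs(value) > threshold)
        for attributions in shap_values
    ]
    return mean(counts)


# Made-up attributions for two samples and three features.
shap_values = [
    [0.20, -0.05, 0.00],
    [0.10, 0.30, -0.02],
]
base_value = 0.5              # expected model output over the background data
model_outputs = [0.65, 0.90]  # what the model actually predicted

fidelity = shap_fidelity(shap_values, base_value, model_outputs)
sparsity = shap_sparsity(shap_values)
print(f"fidelity (MSE): {fidelity:.6f}, sparsity: {sparsity}")
```

Under this reading, a lower fidelity value means the attributions reconstruct the model output more faithfully, and a lower sparsity value means fewer features carry the explanation. In practice the attributions would come from a SHAP explainer applied to the trained model rather than from hand-written lists.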
Abstract:As large language models (LLMs) shape AI development, ensuring ethical prompt recommendations is crucial. LLMs enable innovation but carry risks of bias, unfairness, and unclear accountability. Traditional oversight methods do not scale well, which calls for more dynamic solutions. This paper proposes using collaborative filtering, a technique from recommendation systems, to enhance ethical prompt selection. By leveraging user interactions, the approach promotes prompts that follow ethical guidelines while reducing bias. Contributions include a synthetic dataset for prompt recommendations and the application of collaborative filtering to ethical prompt selection. The work also tackles challenges in ethical AI, such as bias mitigation, transparency, and the prevention of unethical prompt engineering.
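As a rough illustration of how collaborative filtering could rank prompts from user interactions, the sketch below implements plain user-based filtering with cosine similarity over co-rated prompts. The abstract does not specify which variant the paper uses, so this is one common choice rather than the authors' method. The users, prompt identifiers, and 1-5 ratings (higher meaning "judged more ethically appropriate") are hypothetical.

```python
from math import sqrt

# Hypothetical user-by-prompt ratings; a missing key means "not rated".
ratings = {
    "alice": {"p1": 5, "p2": 4, "p3": 1},
    "bob":   {"p1": 4, "p2": 5, "p4": 5},
    "carol": {"p1": 1, "p3": 5, "p4": 2},
}


def cosine(u, v):
    """Cosine similarity over the prompts both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[p] * v[p] for p in shared)
    norm_u = sqrt(sum(u[p] ** 2 for p in shared))
    norm_v = sqrt(sum(v[p] ** 2 for p in shared))
    return dot / (norm_u * norm_v)


def predict(user, prompt):
    """Similarity-weighted average of other users' ratings for `prompt`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or prompt not in their_ratings:
            continue
        similarity = cosine(ratings[user], their_ratings)
        weighted_sum += similarity * their_ratings[prompt]
        weight_total += abs(similarity)
    return weighted_sum / weight_total if weight_total else None


def recommend(user, k=1):
    """Return up to k unseen prompts with the highest predicted rating."""
    all_prompts = {p for r in ratings.values() for p in r}
    unseen = all_prompts - set(ratings[user])
    scored = [(predict(user, p), p) for p in unseen]
    scored = [(score, p) for score, p in scored if score is not None]
    return [p for _, p in sorted(scored, reverse=True)[:k]]


print(recommend("alice"))
```

Here the prediction for a prompt the user has not seen leans toward the ratings of users whose past judgments agree with theirs. A production system would likely add mean-centering or a model-based factorization, which this sketch omits for brevity.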
Abstract:In an era defined by rapid data evolution, traditional machine learning (ML) models often fall short in adapting to dynamic environments. Evolving Machine Learning (EML) has emerged as a critical paradigm, enabling continuous learning and adaptation in real-time data streams. This survey presents a comprehensive analysis of EML, focusing on five core challenges: data drift, concept drift, catastrophic forgetting, skewed learning, and network adaptation. We systematically review over 120 studies, categorizing state-of-the-art methods across supervised, unsupervised, and semi-supervised approaches. The survey explores diverse evaluation metrics, benchmark datasets, and real-world applications, offering a comparative lens on the effectiveness and limitations of current techniques. Additionally, we highlight the growing role of adaptive neural architectures, meta-learning, and ensemble strategies in addressing evolving data complexities. By synthesizing insights from recent literature, this work not only maps the current landscape of EML but also identifies critical gaps and opportunities for future research. Our findings aim to guide researchers and practitioners in developing robust, ethical, and scalable EML systems for real-world deployment.
Abstract:Communication between potential students and a university department is performed manually and is a very time-consuming procedure. The opportunity to communicate on a one-to-one basis is highly valued. However, with many hundreds of applications each year, one-to-one conversations are not feasible in most cases. The communication requires a member of academic staff to spend several hours finding suitable answers and contacting each student. It would be useful to reduce this cost and time. The project aims to reduce the burden on the head of admissions, and potentially other users, by developing a convincing chatbot. A suitable algorithm must be devised to search through the set of data and find a potential answer. The program then replies to the user and provides a relevant web link if the user is not satisfied by the answer. Furthermore, a web interface is provided for both users and an administrator. The achievements of the project can be summarised as follows. To prepare the background of the project, a literature review was undertaken, together with an investigation of existing tools and consultation with the head of admissions. The requirements of the system were established and a range of algorithms and tools was investigated, including keyword and template matching. An algorithm that combines keyword matching with string similarity has been developed. A usable system using the proposed algorithm has been implemented. The system was evaluated by keeping logs of questions and answers and by feedback received from potential students who used it.
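The idea of combining keyword matching with string similarity can be sketched as follows. The abstract does not describe how the two signals are combined, so the weighted sum, the 0.6 weight, the 0.3 acceptance threshold, and the function names are assumptions made for illustration. The FAQ entries, answers, and `example.org` links are hypothetical placeholders. String similarity is computed with `difflib.SequenceMatcher` from the Python standard library.

```python
import re
from difflib import SequenceMatcher

# Hypothetical knowledge base; entries, answers, and links are placeholders.
FAQ = [
    {
        "keywords": {"fees", "tuition", "cost"},
        "question": "How much are the tuition fees?",
        "answer": "Tuition fees are listed on the fees page.",
        "link": "https://example.org/fees",
    },
    {
        "keywords": {"deadline", "apply", "application"},
        "question": "When is the application deadline?",
        "answer": "Deadlines are listed on the admissions page.",
        "link": "https://example.org/admissions",
    },
]


def tokenize(text):
    """Lowercase the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def score(query, entry, keyword_weight=0.6):
    """Weighted sum of keyword overlap and string similarity (assumed scheme)."""
    keyword_score = len(tokenize(query) & entry["keywords"]) / len(entry["keywords"])
    similarity = SequenceMatcher(
        None, query.lower(), entry["question"].lower()
    ).ratio()
    return keyword_weight * keyword_score + (1 - keyword_weight) * similarity


def reply(query, min_score=0.3):
    """Return (answer, link) for the best entry, or a fallback with no link."""
    best = max(FAQ, key=lambda entry: score(query, entry))
    if score(query, best) < min_score:
        return "Sorry, I could not find an answer to that.", None
    return best["answer"], best["link"]


answer, link = reply("What is the cost of tuition?")
print(answer, link)
```

Returning the link alongside the answer lets the interface offer it when the user indicates the answer was not satisfactory. The keyword component handles rephrased questions that share key terms, while the similarity component helps when a query closely resembles a stored question.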