Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Masoud Makrehchi

Beyond Morphology: Quantifying the Diagnostic Power of Color Features in Cancer Classification

May 18, 2026

Farnaz Kheiri, Shahryar Rahnamayan, Masoud Makrehchi

Abstract:In histopathology, human experts primarily rely on color as a means of enhancing contrast to interpret tissue morphology, whereas machine vision models process color as raw statistical information. This distinction raises a fundamental question: to what extent can pixel intensity alone, independent of structural and morphological cues, support cancer classification? To address this question, we systematically evaluated the standalone discriminative power of global color features while deliberately excluding all morphological information. Specifically, we extracted statistical color moments and discretized RGB and HSV color histograms, and assessed their performance across ten diverse experimental settings using classical machine learning classifiers. Our results demonstrate that color features alone can achieve strong performance in binary diagnostic tasks (e.g., benign versus malignant), with classification accuracies reaching up to 89%. This performance is likely attributable to global chromatic shifts associated with malignancy. Importantly, these simple color-based representations consistently outperformed random baselines by a substantial margin, indicating that raw color distributions encode a non-random and diagnostically relevant signal for cancer detection. Consequently, this study suggests that simple, computationally efficient color features can serve as an effective pre-screening tool. By identifying samples with strong chromatic indicators of malignancy, these lightweight models could function as a first-pass triage system, reducing the computational burden on complex deep learning architectures.

Via

Access Paper or Ask Questions

Enhancing Diversity in Multi-objective Feature Selection

Jul 25, 2024

Sevil Zanjani Miyandoab, Shahryar Rahnamayan, Azam Asilian Bidgoli, Sevda Ebrahimi, Masoud Makrehchi

Abstract:Feature selection plays a pivotal role in the data preprocessing and model-building pipeline, significantly enhancing model performance, interpretability, and resource efficiency across diverse domains. In population-based optimization methods, the generation of diverse individuals holds utmost importance for adequately exploring the problem landscape, particularly in highly multi-modal multi-objective optimization problems. Our study reveals that, in line with findings from several prior research papers, commonly employed crossover and mutation operations lack the capability to generate high-quality diverse individuals and tend to become confined to limited areas around various local optima. This paper introduces an augmentation to the diversity of the population in the well-established multi-objective scheme of the genetic algorithm, NSGA-II. This enhancement is achieved through two key components: the genuine initialization method and the substitution of the worst individuals with new randomly generated individuals as a re-initialization approach in each generation. The proposed multi-objective feature selection method undergoes testing on twelve real-world classification problems, with the number of features ranging from 2,400 to nearly 50,000. The results demonstrate that replacing the last front of the population with an equivalent number of new random individuals generated using the genuine initialization method and featuring a limited number of features substantially improves the population's quality and, consequently, enhances the performance of the multi-objective algorithm.

* 8 pages, 3 figures, accepted to be published in IEEE WCCI 2024 conference

Via

Access Paper or Ask Questions

Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions

Jun 20, 2024

Hamdireza Rouzegar, Masoud Makrehchi

Figure 1 for Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions

Figure 2 for Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions

Figure 3 for Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions

Abstract:This study investigates how LLMs, specifically GPT-3.5 and GPT-4, can develop tailored questions for Grade 9 math, aligning with active learning principles. By utilizing an iterative method, these models adjust questions based on difficulty and content, responding to feedback from a simulated 'student' model. A novel aspect of the research involved using GPT-4 as a 'teacher' to create complex questions, with GPT-3.5 as the 'student' responding to these challenges. This setup mirrors active learning, promoting deeper engagement. The findings demonstrate GPT-4's superior ability to generate precise, challenging questions and notable improvements in GPT-3.5's ability to handle more complex problems after receiving instruction from GPT-4. These results underscore the potential of LLMs to mimic and enhance active learning scenarios, offering a promising path for AI in customized education. This research contributes to understanding how AI can support personalized learning experiences, highlighting the need for further exploration in various educational contexts

* Proceedings of The 37th Canadian Conference on Artificial Intelligence, 2024
* Publisher: Canadian Artificial Intelligence Association. URL: https://caiac.pubpub.org/pub/kmn55wd2#nssvokovikx

Via

Access Paper or Ask Questions

Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation

Jun 17, 2024

Hamidreza Rouzegar, Masoud Makrehchi

Figure 1 for Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation

Figure 2 for Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation

Figure 3 for Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation

Figure 4 for Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation

Abstract:In the context of text classification, the financial burden of annotation exercises for creating training data is a critical issue. Active learning techniques, particularly those rooted in uncertainty sampling, offer a cost-effective solution by pinpointing the most instructive samples for manual annotation. Similarly, Large Language Models (LLMs) such as GPT-3.5 provide an alternative for automated annotation but come with concerns regarding their reliability. This study introduces a novel methodology that integrates human annotators and LLMs within an Active Learning framework. We conducted evaluations on three public datasets. IMDB for sentiment analysis, a Fake News dataset for authenticity discernment, and a Movie Genres dataset for multi-label classification.The proposed framework integrates human annotation with the output of LLMs, depending on the model uncertainty levels. This strategy achieves an optimal balance between cost efficiency and classification performance. The empirical results show a substantial decrease in the costs associated with data annotation while either maintaining or improving model accuracy.

* Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII), 2024, pp. 98-111
* Publisher: Association for Computational Linguistics URL: https://aclanthology.org/2024.law-1.10

Via

Access Paper or Ask Questions

Making a Computational Attorney

Mar 07, 2023

Dell Zhang, Frank Schilder, Jack G. Conrad, Masoud Makrehchi, David von Rickenbach, Isabelle Moulinier

Abstract:This "blue sky idea" paper outlines the opportunities and challenges in data mining and machine learning involving making a computational attorney -- an intelligent software agent capable of helping human lawyers with a wide range of complex high-level legal tasks such as drafting legal briefs for the prosecution or defense in court. In particular, we discuss what a ChatGPT-like Large Legal Language Model (L$^3$M) can and cannot do today, which will inspire researchers with promising short-term and long-term research objectives.

* To be published in the Proceedings of the 2023 SIAM International Conference on Data Mining (SDM'23)

Via

Access Paper or Ask Questions