
Christian Geißler


Recommendations on test datasets for evaluating AI solutions in pathology

Apr 21, 2022
André Homeyer, Christian Geißler, Lars Ole Schwen, Falk Zakrzewski, Theodore Evans, Klaus Strohmenger, Max Westphal, Roman David Bülow, Michaela Kargl, Aray Karjauv, Isidre Munné-Bertran, Carl Orge Retzlaff, Adrià Romero-López, Tomasz Sołtysiński, Markus Plass, Rita Carvalho, Peter Steinbach, Yu-Chia Lan, Nassim Bouteldja, David Haber, Mateo Rojas-Carulla, Alireza Vafaei Sadr, Matthias Kraft, Daniel Krüger, Rutger Fick, Tobias Lang, Peter Boor, Heimo Müller, Peter Hufnagl, Norman Zerbe


Artificial intelligence (AI) solutions that automatically extract information from digital histology images have shown great promise for improving pathological diagnosis. Prior to routine use, it is important to evaluate their predictive performance and obtain regulatory approval. This assessment requires appropriate test datasets. However, compiling such datasets is challenging, and specific recommendations are missing. A committee of various stakeholders, including commercial AI developers, pathologists, and researchers, discussed key aspects and conducted extensive literature reviews on test datasets in pathology. Here, we summarize the results and derive general recommendations for the collection of test datasets. We address several questions: Which and how many images are needed? How should low-prevalence subsets be handled? How can potential bias be detected? How should datasets be reported? What are the regulatory requirements in different countries? The recommendations are intended to help AI developers demonstrate the utility of their products and to help regulatory agencies and end users verify reported performance measures. Further research is needed to formulate criteria for sufficiently representative test datasets so that AI solutions can operate with less user intervention and better support diagnostic workflows in the future.
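The question of how many images are needed can be approached with standard confidence-interval arithmetic. The sketch below is illustrative only and not taken from the paper: it sizes a test set so that an expected sensitivity can be estimated to a target precision, using the Wilson score interval and a normal-approximation sample-size bound.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score confidence interval for a binomial proportion
    (e.g. sensitivity = correctly detected positives / all positives)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

def images_needed(expected_sensitivity: float, max_half_width: float,
                  z: float = 1.96) -> int:
    """Smallest n whose normal-approximation CI half-width stays below the
    target -- a rough planning number, not a regulatory criterion."""
    p = expected_sensitivity
    return math.ceil(z**2 * p * (1 - p) / max_half_width**2)

# Estimating a ~90% sensitivity to within +/- 3 percentage points:
n = images_needed(0.9, 0.03)   # roughly 385 positive cases
lo, hi = wilson_interval(90, 100)
```

This kind of back-of-the-envelope calculation also shows why low-prevalence subsets are difficult: the bound applies to the number of *positive* cases, and collecting hundreds of positives for a rare finding may be infeasible.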


Evaluating Generic Auto-ML Tools for Computational Pathology

Dec 07, 2021
Lars Ole Schwen, Daniela Schacherer, Christian Geißler, André Homeyer


Image analysis tasks in computational pathology are commonly solved using convolutional neural networks (CNNs). The selection of a suitable CNN architecture and hyperparameters is usually done through exploratory iterative optimization, which is computationally expensive and requires substantial manual work. The goal of this article is to evaluate how generic tools for neural network architecture search and hyperparameter optimization perform for common use cases in computational pathology. For this purpose, we evaluated one on-premises and one cloud-based tool for three different classification tasks on histological images: tissue classification, mutation prediction, and grading. We found that the default CNN architectures and parameterizations of the evaluated AutoML tools already yielded classification performance on par with the original publications. Hyperparameter optimization for these tasks did not substantially improve performance, despite the additional computational effort. However, performance varied substantially between classifiers obtained from individual AutoML runs due to non-deterministic effects. Generic CNN architectures and AutoML tools could thus be a viable alternative to manually optimizing CNN architectures and parameterizations. This would allow developers of software solutions for computational pathology to focus their efforts on harder-to-automate tasks such as data curation.
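The workflow the article evaluates can be illustrated with a minimal random hyperparameter search. Everything below is a hypothetical sketch: `train_and_evaluate` is a deterministic-noise stand-in for a real CNN training run, and averaging several seeded runs per configuration is one simple way to temper the run-to-run variance the article observed.

```python
import random

def train_and_evaluate(config: dict, seed: int) -> float:
    """Hypothetical stand-in for one training run; a real AutoML tool would
    train a CNN here. Simulates an accuracy that degrades away from a good
    learning rate, plus non-deterministic run-to-run noise."""
    rng = random.Random(seed)
    base = 0.85                                        # assumed baseline accuracy
    penalty = abs(config["log10_lr"] - (-3)) * 0.01    # assumed sensitivity to lr
    return base - penalty + rng.gauss(0, 0.01)         # run-to-run noise

def random_search(n_trials: int, runs_per_config: int = 3) -> tuple[dict, float]:
    """Random hyperparameter search; scoring each configuration by the mean of
    several seeded runs reduces the chance of picking a lucky outlier."""
    rng = random.Random(0)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {"log10_lr": rng.uniform(-5, -1),
               "batch_size": rng.choice([16, 32, 64])}
        score = sum(train_and_evaluate(cfg, seed)
                    for seed in range(runs_per_config)) / runs_per_config
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

best_cfg, best_score = random_search(20)
```

The design point this toy model makes explicit is the article's finding: when the noise term is comparable to the tuning gain, repeated runs of the default configuration can matter more than the search itself.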


Graduated Optimization of Black-Box Functions

Jun 04, 2019
Weijia Shao, Christian Geißler, Fikret Sivrikaya


Motivated by the problem of tuning hyperparameters in machine learning, we present a new approach for gradually and adaptively optimizing an unknown function using estimated gradients. We validate the empirical performance of the proposed idea on both low- and high-dimensional problems. The experimental results demonstrate the advantages of our approach for tuning high-dimensional hyperparameters in machine learning.
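A minimal sketch of the general idea, assuming a two-point Gaussian-smoothing gradient estimator (Nesterov-Spokoiny style) and a geometrically decreasing smoothing radius; the paper's exact estimator and schedule may differ.

```python
import random

def estimated_gradient(f, x, sigma, rng, n_samples=10):
    """Two-point Gaussian-smoothing gradient estimate of f at x:
    averages (f(x + sigma*u) - f(x - sigma*u)) / (2*sigma) * u over
    standard-normal directions u."""
    d = len(x)
    grad = [0.0] * d
    for _ in range(n_samples):
        u = [rng.gauss(0, 1) for _ in range(d)]
        x_plus = [xi + sigma * ui for xi, ui in zip(x, u)]
        x_minus = [xi - sigma * ui for xi, ui in zip(x, u)]
        scale = (f(x_plus) - f(x_minus)) / (2 * sigma * n_samples)
        for i in range(d):
            grad[i] += scale * u[i]
    return grad

def graduated_minimize(f, x0, sigma0=1.0, lr=0.1, n_steps=200, decay=0.98):
    """Gradient descent on increasingly less-smoothed versions of f:
    a large sigma blurs away small local minima early on, and shrinking
    it gradually sharpens the objective toward the true function."""
    rng = random.Random(0)
    x, sigma = list(x0), sigma0
    for _ in range(n_steps):
        g = estimated_gradient(f, x, sigma, rng)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
        sigma = max(sigma * decay, 1e-3)  # gradually sharpen the objective
    return x

# A simple quadratic "validation loss" over two hyperparameters:
x_opt = graduated_minimize(lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2, [5.0, 5.0])
```

Only function evaluations of `f` are used, which is what makes the scheme applicable to hyperparameter tuning, where gradients of the validation loss with respect to the hyperparameters are unavailable.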

* Accepted Workshop Submission for the 6th ICML Workshop on Automated Machine Learning 