Abstract: We propose DIVERSE, a framework for systematically exploring the Rashomon set of deep neural networks, the collection of models that match a reference model's accuracy while differing in their predictive behavior. DIVERSE augments a pretrained model with Feature-wise Linear Modulation (FiLM) layers and uses Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to search a latent modulation space, generating diverse model variants without retraining or gradient access. Across MNIST, PneumoniaMNIST, and CIFAR-10, DIVERSE uncovers multiple high-performing yet functionally distinct models. Our experiments show that DIVERSE offers a competitive and efficient exploration of the Rashomon set, making it feasible to construct diverse sets that maintain robustness and performance while supporting well-balanced model multiplicity. While retraining remains the baseline to generate Rashomon sets, DIVERSE achieves comparable diversity at reduced computational cost.
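
To make the mechanism concrete, the following is a minimal sketch (not the authors' released code) of the idea the abstract describes: a frozen pretrained classifier is augmented with a FiLM layer, a latent vector z sets the FiLM parameters (gamma, beta), and CMA-ES searches z-space, with no gradients and no retraining, for variants that stay accurate while disagreeing with the reference model. It assumes PyTorch and the `cma` package; the network, the direct z-to-(gamma, beta) mapping, and the fitness weighting are illustrative assumptions, not the paper's exact objective.

```python
import torch
import torch.nn as nn
import cma


class FiLM(nn.Module):
    """Feature-wise linear modulation: y = gamma * x + beta (per feature)."""
    def __init__(self, n):
        super().__init__()
        self.register_buffer("gamma", torch.ones(n))
        self.register_buffer("beta", torch.zeros(n))

    def forward(self, x):
        return self.gamma * x + self.beta


class FiLMedNet(nn.Module):
    """A frozen two-layer classifier with a FiLM layer after the hidden layer."""
    def __init__(self, d_in=784, d_hid=128, n_cls=10):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hid)
        self.fc2 = nn.Linear(d_hid, n_cls)
        self.film = FiLM(d_hid)
        for p in self.parameters():
            p.requires_grad = False  # no retraining, no gradient access

    def forward(self, x):
        return self.fc2(self.film(torch.relu(self.fc1(x))))


@torch.no_grad()
def fitness(net, z, x_val, y_val, ref_preds, div_weight=0.5):
    """Lower is better: reward validation accuracy plus disagreement with the
    reference model (an illustrative stand-in for the paper's objective)."""
    n = net.film.gamma.numel()
    z = torch.as_tensor(z, dtype=torch.float32)
    net.film.gamma.copy_(1.0 + z[:n])  # the latent vector sets the modulation
    net.film.beta.copy_(z[n:])
    preds = net(x_val).argmax(dim=1)
    acc = (preds == y_val).float().mean().item()
    disagreement = (preds != ref_preds).float().mean().item()
    return -(acc + div_weight * disagreement)


net = FiLMedNet()                         # assume pretrained weights are loaded here
x_val = torch.randn(256, 784)             # placeholder validation batch
y_val = torch.randint(0, 10, (256,))
with torch.no_grad():
    ref_preds = net(x_val).argmax(dim=1)  # reference model (identity FiLM)

# Gradient-free search over the 256-dim latent space (128 gammas + 128 betas);
# each distinct z found by CMA-ES instantiates one member of the Rashomon set.
es = cma.CMAEvolutionStrategy([0.0] * 256, 0.1)
for _ in range(20):
    zs = es.ask()
    es.tell(zs, [fitness(net, z, x_val, y_val, ref_preds) for z in zs])
```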




Abstract: We present AI-Spectra, an approach that leverages model multiplicity for interactive systems. Model multiplicity means using several slightly different AI models that yield equally valid outcomes or predictions for the same task, in effect relying on many simultaneous "expert advisors" that can hold different opinions. Multiple AI models that generate potentially divergent results for the same task are challenging for users: seeing that models are not always correct and may disagree helps users calibrate their expectations, but being confronted with multiple results instead of one can also cause information overload. AI-Spectra leverages model multiplicity through a visual dashboard designed to convey which AI models generate which results, while minimizing the cognitive effort needed to detect consensus among models and to see which types of models hold different opinions. For AI-Spectra we use a custom adaptation of Chernoff faces: Chernoff Bots. This visualization technique lets users quickly interpret complex, multivariate model configurations and compare predictions across multiple models. Our design builds on established Human-AI Interaction guidelines and well-known practices in information visualization. We validated our approach through a series of experiments in which we trained a wide variety of models on the MNIST dataset to perform digit recognition. Our work contributes to the growing discourse on making AI systems more transparent, trustworthy, and effective through the strategic use of multiple models.
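
As an illustration of the Chernoff-face encoding behind Chernoff Bots, the following minimal sketch (not the AI-Spectra implementation) maps a few hypothetical model attributes (depth, learning rate, accuracy, and agreement with the consensus prediction) onto facial features using matplotlib. The specific attribute-to-feature mapping and the sample values are assumptions for demonstration only.

```python
import matplotlib.pyplot as plt
import matplotlib.patches as patches


def chernoff_bot(ax, depth, lr, accuracy, agrees):
    """Draw one face: depth -> head width, lr -> eye size,
    accuracy -> mouth width, consensus agreement -> color."""
    color = "tab:green" if agrees else "tab:red"
    head_w = 0.5 + 0.08 * depth              # wider head = deeper model
    ax.add_patch(patches.Ellipse((0.5, 0.5), head_w, 0.9,
                                 fill=False, color=color, lw=2))
    eye_r = 0.02 + 0.4 * lr                  # bigger eyes = higher learning rate
    for x in (0.4, 0.6):
        ax.add_patch(patches.Circle((x, 0.62), eye_r, color=color))
    mouth_w = 0.1 + 0.3 * accuracy           # wider smile = higher accuracy
    ax.add_patch(patches.Arc((0.5, 0.38), mouth_w, 0.15,
                             theta1=180, theta2=360, color=color, lw=2))
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.axis("off")


# Three hypothetical MNIST classifiers: (depth, learning rate, accuracy, agrees).
models = [(2, 0.01, 0.97, True), (4, 0.10, 0.95, True), (3, 0.05, 0.80, False)]
fig, axes = plt.subplots(1, len(models), figsize=(6, 2))
for ax, m in zip(axes, models):
    chernoff_bot(ax, *m)
plt.show()
```

Laid out as a grid with one face per model, an encoding along these lines lets users scan for consensus at a glance; encoding agreement as color makes dissenting models stand out, which is the kind of low-effort comparison the dashboard aims to support.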