Recent work has demonstrated the promise of combining local explanations with active learning for understanding and supervising black-box models. Here we show that, under specific conditions, these algorithms may misrepresent the quality of the model being learned. The reason is that the machine illustrates its beliefs by predicting and explaining the labels of the query instances: if the machine is unaware of its own mistakes, it may end up choosing queries on which it performs artificially well. This biases the "narrative" presented by the machine to the user. We address this narrative bias by introducing explanatory guided learning, a novel interactive learning strategy in which: i) the supervisor is in charge of choosing the query instances, while ii) the machine uses global explanations to illustrate its overall behavior and to guide the supervisor toward choosing challenging, informative instances. This strategy retains the key advantages of explanatory interaction while avoiding narrative bias, and it compares favorably to active learning in terms of sample complexity. An initial empirical evaluation with a clustering-based prototype highlights the promise of our approach.
* Accepted at the TAILOR workshop at ECAI 2020, the 24th European Conference on Artificial Intelligence.