Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Evaluating Understanding on Conceptual Abstraction Benchmarks

Jun 28, 2022

Victor Vikram Odouard, Melanie Mitchell

Figure 1 for Evaluating Understanding on Conceptual Abstraction Benchmarks

Figure 2 for Evaluating Understanding on Conceptual Abstraction Benchmarks

Figure 3 for Evaluating Understanding on Conceptual Abstraction Benchmarks

Figure 4 for Evaluating Understanding on Conceptual Abstraction Benchmarks

Share this with someone who'll enjoy it:

Abstract:A long-held objective in AI is to build systems that understand concepts in a humanlike way. Setting aside the difficulty of building such a system, even trying to evaluate one is a challenge, due to present-day AI's relative opacity and its proclivity for finding shortcut solutions. This is exacerbated by humans' tendency to anthropomorphize, assuming that a system that can recognize one instance of a concept must also understand other instances, as a human would. In this paper, we argue that understanding a concept requires the ability to use it in varied contexts. Accordingly, we propose systematic evaluations centered around concepts, by probing a system's ability to use a given concept in many different instantiations. We present case studies of such an evaluations on two domains -- RAVEN (inspired by Raven's Progressive Matrices) and the Abstraction and Reasoning Corpus (ARC) -- that have been used to develop and assess abstraction abilities in AI systems. Our concept-based approach to evaluation reveals information about AI systems that conventional test sets would have left hidden.

* EBeM'22: AI Evaluation Beyond Metrics, July 24, 2022, Vienna, Austria

View paper on

Share this with someone who'll enjoy it:

Title:Evaluating Understanding on Conceptual Abstraction Benchmarks

Paper and Code