This paper achieves significant progress in the field of abstract reasoning, particularly in addressing Raven's Progressive Matrices (RPM) and Bongard-Logo problems. We propose the D2C approach, which redefines conceptual boundaries in these domains and bridges the gap between high-level concepts and their low-dimensional representations. Based on this, we further introduce the D3C method that handles Bongard-Logo problems and significantly improves reasoning accuracy by estimating the distribution of image representations and measuring their Sinkhorn distance. To enhance computational efficiency, we introduce the D3C-cos variant, which provides an efficient and accurate solution for RPM problems by constraining distribution distances. Additionally, we present Lico-Net, a network that combines D3C and D3C-cos to achieve state-of-the-art performance in both problem-solving and interpretability. Finally, we extend our approach to D4C, employing adversarial strategies to further refine conceptual boundaries and demonstrate notable improvements for both RPM and Bongard-Logo problems. Overall, our contributions offer a new perspective and practical solutions to the field of abstract reasoning.
Abstract reasoning problems challenge the perceptual and cognitive abilities of AI algorithms, demanding deeper pattern discernment and inductive reasoning beyond explicit image features. This study introduces PMoC, a tailored probability model for the Bongard-Logo problem, achieving high reasoning accuracy by constructing independent probability models. Additionally, we present Pose-Transformer, an enhanced Transformer-Encoder designed for complex abstract reasoning tasks, including Bongard-Logo, RAVEN, I-RAVEN, and PGM. Pose-Transformer incorporates positional information learning, inspired by capsule networks' pose matrices, enhancing its focus on local positional relationships in image data processing. When integrated with PMoC, it further improves reasoning accuracy. Our approach effectively addresses reasoning difficulties associated with abstract entities' positional changes, outperforming previous models on the OIG, D3$\times$3 subsets of RAVEN, and PGM databases. This research contributes to advancing AI's capabilities in abstract reasoning and cognitive pattern recognition.
Abstract reasoning problems pose significant challenges to artificial intelligence algorithms, demanding cognitive capabilities beyond those required for perception tasks. This study introduces the Triple-CFN approach to tackle the Bongard-Logo problem, achieving notable reasoning accuracy by implicitly reorganizing the concept space of conflicting instances. Additionally, the Triple-CFN paradigm proves effective for the RPM problem with necessary modifications, yielding competitive results. To further enhance performance on the RPM issue, we develop the Meta Triple-CFN network, which explicitly structures the problem space while maintaining interpretability on progressive patterns. The success of Meta Triple-CFN is attributed to its paradigm of modeling the conceptual space, equivalent to normalizing reasoning information. Based on this ideology, we introduce the Re-space layer, enhancing the performance of both Meta Triple-CFN and Triple-CFN. This paper aims to contribute to advancements in machine intelligence by exploring innovative network designs for addressing abstract reasoning problems, paving the way for further breakthroughs in this domain.