Abstract:Learning semantics is essential for deep learning models to be interpretable and better aligned with human reasoning. Concept-based models approach this by representing classes through meaningful semantic abstractions, but typically treat all concepts as a flat, unstructured set learned at a single neural network layer. This overlooks a fundamental property of human semantic understanding: concepts being organized hierarchically, from general to specific. While deep networks do learn a hierarchy of visual features, this structure is rarely aligned with explicit semantic hierarchies. Drawing on Formal Concept Analysis, we demonstrate that formal concept lattices provide principled semantic scaffolds to guide neural network learning. These lattices naturally identify where in the network concepts should be learned based on their level of generality. This allows the model to develop staged, semantically grounded representations throughout its depth. Empirical results on real-world datasets show that our models produce more interpretable embeddings, support more effective interventions, and learn concept representations that are both meaningful and hierarchically structured.




Abstract:Contextual information plays a critical role in object recognition models within computer vision, where changes in context can significantly affect accuracy, underscoring models' dependence on contextual cues. This study investigates how context manipulation influences both model accuracy and feature attribution, providing insights into the reliance of object recognition models on contextual information as understood through the lens of feature attribution methods. We employ a range of feature attribution techniques to decipher the reliance of deep neural networks on context in object recognition tasks. Using the ImageNet-9 and our curated ImageNet-CS datasets, we conduct experiments to evaluate the impact of contextual variations, analyzed through feature attribution methods. Our findings reveal several key insights: (a) Correctly classified images predominantly emphasize object volume attribution over context volume attribution. (b) The dependence on context remains relatively stable across different context modifications, irrespective of classification accuracy. (c) Context change exerts a more pronounced effect on model performance than Context perturbations. (d) Surprisingly, context attribution in `no-information' scenarios is non-trivial. Our research moves beyond traditional methods by assessing the implications of broad-level modifications on object recognition, either in the object or its context.