Universität Bielefeld

[BA/MA/Project]

Shapley Interaction of Concepts

Contact: Jay (Isaac) Roberts - Alexander Schulz

Many existing methods for automatic concept extraction rely on activations from the penultimate layer of deep neural networks. While this has produced promising results in explainable AI (XAI), such approaches likely overlook complex interactions between concepts, because the final linear classification layer combines concept activations only additively. Additionally, Shapley-based methods typically perturb features in the input space. This is natural for tabular data but becomes more difficult for high-dimensional data such as images and text. This project aims to develop methods that capture and analyze concept interactions using Shapley-based interaction indices, and to investigate how these interactions can provide more nuanced explanations of model behavior, applied specifically to Concept Bottleneck Models (CBMs) or to another layer in a deep network.
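
To make the intended analysis concrete, the sketch below computes exact pairwise Shapley interaction indices (in the sense of Grabisch and Roubens) over a small set of concept activations, treating a toy nonlinear head as the value function evaluated on masked concept vectors. This is a minimal illustration, not the method of either cited paper: the toy head, the zero baseline used for masking, and the helper names (value_fn, pairwise_shapley_interaction) are assumptions introduced here; SHAP-IQ (reference 1) approximates such interaction scores without enumerating all coalitions.

    import itertools
    import math

    import numpy as np

    # Toy stand-in for a CBM label head over concept activations.
    # (Assumption: in a real setup this would be the trained predictor
    # applied to the concept bottleneck; any callable works.)
    rng = np.random.default_rng(0)
    n_concepts = 6
    head_weights = rng.normal(size=n_concepts)

    concepts = rng.uniform(size=n_concepts)   # concept activations for one input
    baseline = np.zeros(n_concepts)           # "concept absent" reference values


    def value_fn(coalition: frozenset) -> float:
        """Model output when only concepts in `coalition` keep their
        activation; all other concepts are replaced by the baseline."""
        x = baseline.copy()
        idx = list(coalition)
        x[idx] = concepts[idx]
        # The nonlinearity is what makes interactions nonzero; a purely
        # linear head would give exactly zero for every pair.
        return float(np.tanh(head_weights @ x))


    def pairwise_shapley_interaction(i: int, j: int, n: int) -> float:
        """Exact Shapley interaction index for the pair {i, j}: a weighted
        sum of the discrete second derivative over all coalitions S that
        contain neither i nor j."""
        rest = [k for k in range(n) if k not in (i, j)]
        total = 0.0
        for size in range(len(rest) + 1):
            weight = (math.factorial(size) * math.factorial(n - size - 2)
                      / math.factorial(n - 1))
            for S in itertools.combinations(rest, size):
                S = frozenset(S)
                delta = (value_fn(S | {i, j}) - value_fn(S | {i})
                         - value_fn(S | {j}) + value_fn(S))
                total += weight * delta
        return total


    for i, j in itertools.combinations(range(n_concepts), 2):
        print(f"I({i},{j}) = {pairwise_shapley_interaction(i, j, n_concepts):+.4f}")

Exact enumeration costs O(2^n) value-function calls per pair, so it only serves as a sanity check on a handful of concepts; for realistic concept counts, the approximation schemes unified by SHAP-IQ become necessary.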

Literature

  1. Fumagalli, Fabian, et al. “SHAP-IQ: Unified Approximation of any-order Shapley Interactions.” NeurIPS 2023.
  2. Fel, Thomas, et al. “CRAFT: Concept Recursive Activation Factorization for Explainability.” CVPR 2023.