Concept

Related Publications:
Avinash Kori, Parth Natekar, Ganapathy Krishnamurthi, and Balaji Srinivasan. "Abstracting Deep Neural Networks into Concept Graphs for Concept Level Interpretability." Accepted to AAAI 2021 International Workshop on Health Intelligence (W3PHIAI-21), arXiv preprint arXiv:2008.06457 (2020).

Most interpretability techniques do not capture the concept-based reasoning that human beings follow. This project aims to provide an alternative graphical representation of deep learning models by formulating an abstract, higher-level concept graph. We use a clustering-based approach to find active inference trails in the model. We test our approach on deep learning models for Brain-Tumor Segmentation and Diabetic Retinopathy Classification. We work with radiologists and ophthalmologists to interpret the obtained inference trails from a medical perspective, and we show that the method yields medically relevant concept trails that highlight the hierarchy of the decision-making process followed by the model.

The figure below gives a visual overview of our method.

We posit that groups of weights in the model are responsible for identifying individual concepts in the input image. Identifying these concepts and the relationships between them would allow us to find active concept trails in the network. The proposed technique consists of the following steps:

Concept Formation:
In this step, we use a clustering-based approach to group weights that may be responsible for identifying distinct concepts in the input image.
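
As a rough illustration of the idea of grouping weights, the sketch below clusters the filters of a single layer by the similarity of their weight patterns. It is not the clustering procedure used in the paper: plain k-means is swapped in for simplicity, and the function name cluster_filters, the toy 3x3 filters, and the choice of two clusters are all assumptions made for this example.

```python
import numpy as np

def cluster_filters(weights, n_clusters, n_iters=20):
    """Group the filters of one layer by weight similarity (plain k-means).

    Illustrative stand-in only; the clustering method is an assumption.
    """
    # Flatten each filter to a vector: (n_filters, ...) -> (n_filters, d).
    flat = weights.reshape(weights.shape[0], -1).astype(float)
    # Unit-normalise so grouping depends on the filter pattern, not its scale.
    flat /= np.linalg.norm(flat, axis=1, keepdims=True) + 1e-12

    # Deterministic farthest-point initialisation of the cluster centres.
    centers = [flat[0]]
    for _ in range(1, n_clusters):
        dists = np.min([((flat - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(flat[int(np.argmax(dists))])
    centers = np.stack(centers)

    def assign(current_centers):
        # Squared distance of every filter to every centre, then nearest centre.
        d = ((flat[:, None, :] - current_centers[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(centers)
        for k in range(n_clusters):
            if np.any(labels == k):  # keep the old centre if a cluster empties
                centers[k] = flat[labels == k].mean(axis=0)
    return assign(centers)

# Toy layer: three noisy horizontal-edge filters and three noisy vertical-edge ones.
rng = np.random.default_rng(0)
horizontal = np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]], dtype=float)
vertical = horizontal.T
weights = np.stack(
    [horizontal + 0.01 * rng.standard_normal((3, 3)) for _ in range(3)]
    + [vertical + 0.01 * rng.standard_normal((3, 3)) for _ in range(3)]
)
labels = cluster_filters(weights, n_clusters=2)
```

In this toy case the two edge orientations end up in separate clusters, each of which would be treated as one candidate concept.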

Concept Identification:
In this step, we try to associate each weight cluster formed in the previous step with a region of the input image that corresponds to a human-understandable concept.
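
One simple way to picture this association is sketched below, under stated assumptions: the feature maps of the filters in one cluster are averaged, upsampled to the input resolution, and thresholded to give a binary region mask. The description above does not say how the region is obtained, so the averaging, the nearest-neighbour upsampling, the relative threshold, and the name concept_region are all hypothetical choices made for illustration.

```python
import numpy as np

def concept_region(activations, labels, cluster_id, image_shape, threshold=0.5):
    """Return a boolean mask over the input image for one weight cluster.

    activations: (n_filters, h, w) feature maps of one layer for one image.
    labels:      (n_filters,) cluster index of each filter.
    Assumes image_shape is an integer multiple of (h, w).
    """
    # Average the feature maps produced by the filters in this cluster.
    mean_map = activations[labels == cluster_id].mean(axis=0)
    # Nearest-neighbour upsampling to the input resolution.
    scale_y = image_shape[0] // mean_map.shape[0]
    scale_x = image_shape[1] // mean_map.shape[1]
    upsampled = np.kron(mean_map, np.ones((scale_y, scale_x)))
    # Keep pixels whose response is at least `threshold` times the maximum.
    return upsampled >= threshold * upsampled.max()

# Toy example: cluster 0 responds in the top-left, cluster 1 in the bottom-right.
activations = np.zeros((4, 4, 4))
activations[0, 0, 0] = 1.0
activations[1, 0, 0] = 0.8
activations[2, 3, 3] = 1.0
activations[3, 3, 3] = 0.6
labels = np.array([0, 0, 1, 1])

mask_0 = concept_region(activations, labels, 0, image_shape=(8, 8))
mask_1 = concept_region(activations, labels, 1, image_shape=(8, 8))
```

A mask produced this way could then be overlaid on the input image and shown to a domain expert, who decides whether the highlighted region matches a recognisable concept.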

Graph Formation and Information Flow:
Given a set of concepts, we now have the means to build our equivalent graphical representation. We use a mutual-information-based metric to quantify the relationships between concepts. At the end of this step, we can identify active concept trails that tell us the inference steps a model may have used to make a prediction. The image below shows an example concept trail for 'Severe Diabetic Retinopathy' extracted using our method and verified by an ophthalmologist. The caption contains descriptions of the specific concepts extracted in the trail (attention regions in green).
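
To make the graph-building idea concrete, here is a minimal sketch that links concepts in consecutive layers whenever the empirical mutual information between their states exceeds a threshold. The exact metric used in the paper is not given here, so this is only an assumption-laden illustration: each concept is summarised as one discrete state per sample, and the names mutual_information and build_concept_graph, the concept labels c1 to c3, and the 0.5-bit threshold are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Empirical mutual information, in bits, of two equal-length discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    count_x, count_y = Counter(x), Counter(y)
    # Sum p(a, b) * log2(p(a, b) / (p(a) * p(b))) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in joint.items()
    )

def build_concept_graph(concepts, threshold):
    """Link concepts in consecutive layers whose mutual information is high.

    concepts: name -> (layer index, discrete state of the concept per sample).
    Returns a dict mapping (source, target) edges to their mutual information.
    """
    edges = {}
    for src, (src_layer, src_states) in concepts.items():
        for dst, (dst_layer, dst_states) in concepts.items():
            if dst_layer == src_layer + 1:
                mi = mutual_information(src_states, dst_states)
                if mi >= threshold:
                    edges[(src, dst)] = mi
    return edges

# Toy data: c2 mirrors c1 exactly, while c3 is statistically independent of c1.
concepts = {
    "c1": (0, [0, 0, 1, 1, 0, 1, 0, 1]),
    "c2": (1, [0, 0, 1, 1, 0, 1, 0, 1]),
    "c3": (1, [1, 0, 1, 0, 1, 1, 0, 0]),
}
edges = build_concept_graph(concepts, threshold=0.5)
```

Here only the edge from c1 to c2 survives the threshold. Following such edges from layer to layer gives one candidate concept trail.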

Abstracting Deep Neural Networks into Interpretable Concept Graphs
Repurposing Trained Deep Networks into Concept Graphs for Concept-Level Interpretability