Concept Activation Regions: A Generalized Framework For Concept-Based Explanations

DOI: 10.48550/arxiv.2209.11222 Publication Date: 2022-01-01
ABSTRACT
Concept-based explanations permit to understand the predictions of a deep neural network (DNN) through the lens of concepts specified by users. Existing methods assume that the examples illustrating a concept are mapped in a fixed direction of the DNN's latent space. When this holds true, the concept can be represented by a concept activation vector (CAV) pointing in that direction. In this work, we propose to relax this assumption by allowing concept examples to be scattered across different clusters in the DNN's latent space. Each concept is then represented by a region of the DNN's latent space that includes these clusters, and we call it a concept activation region (CAR). To formalize this idea, we introduce an extension of the CAV formalism that is based on the kernel trick and support vector classifiers. This CAR formalism yields global concept-based explanations and local concept-based feature importance. We prove that CAR explanations built with radial kernels are invariant under latent space isometries. In this way, CAR assigns the same explanations to latent spaces that have the same geometry. We further demonstrate empirically that CARs offer (1) more accurate descriptions of how concepts are scattered in the DNN's latent space; (2) global explanations that are closer to human concept annotations and (3) concept-based feature importance scores that meaningfully relate concepts with each other. Finally, we use CARs to show that DNNs can autonomously rediscover known scientific concepts, such as the prostate cancer grading system.
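A minimal sketch of the CAR idea, assuming only the description in the abstract (the function names and synthetic data below are illustrative, not the authors' implementation): a radial (RBF) kernel score separates concept-positive from concept-negative latent activations even when the concept examples form two disjoint clusters, and because a radial kernel depends only on pairwise distances, the score is unchanged by an isometry of the latent space.

```python
import numpy as np

def rbf(a, b, gamma=0.5):
    # Radial kernel: depends only on the distance ||a - b||.
    return np.exp(-gamma * np.sum((a - b) ** 2, axis=-1))

def concept_score(h, pos, neg, gamma=0.5):
    # Hypothetical kernel-based concept score: high when the activation h
    # lies inside the region covered by the concept examples.
    return rbf(h, pos, gamma).mean() - rbf(h, neg, gamma).mean()

rng = np.random.default_rng(0)
# Concept examples scattered across two clusters: a single CAV direction
# cannot represent this concept, but a region in latent space can.
pos = np.vstack([rng.normal(3.0, 0.3, (30, 2)),
                 rng.normal(-3.0, 0.3, (30, 2))])
neg = rng.normal(0.0, 0.3, (30, 2))

h = np.array([3.1, 2.9])  # a test activation near one concept cluster
s = concept_score(h, pos, neg)

# Isometry invariance: rotate all activations by the same orthogonal map;
# the radial kernel sees identical distances, so the score is unchanged.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
s_rot = concept_score(h @ R.T, pos @ R.T, neg @ R.T)
print(np.isclose(s, s_rot))  # True
```

The paper's formalism uses support vector classifiers with this kind of kernel; the hand-rolled score above only illustrates why radial kernels make the resulting explanations geometry-dependent rather than coordinate-dependent.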