Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)

FOS: Computer and information sciences; Statistics - Machine Learning; 0202 electrical engineering, electronic engineering, information engineering; Machine Learning (stat.ML); 02 engineering and technology

DOI: 10.48550/arxiv.1711.11279
Publication Date: 2017-01-01

AUTHORS (7)

Kim, Been

Wattenberg, Martin

Gilmer, Justin

Cai, Carrie

Wexler, James

Viegas, Fernanda

Sayres, Rory

ABSTRACT

The interpretation of deep learning models is a challenge due to their size, complexity, and often opaque internal state. In addition, many systems, such as image classifiers, operate on low-level features rather than high-level concepts. To address these challenges, we introduce Concept Activation Vectors (CAVs), which provide an interpretation of a neural net's internal state in terms of human-friendly concepts. The key idea is to view the high-dimensional internal state of a neural net as an aid, not an obstacle. We show how to use CAVs as part of a technique, Testing with CAVs (TCAV), that uses directional derivatives to quantify the degree to which a user-defined concept is important to a classification result--for example, how sensitive a prediction of "zebra" is to the presence of stripes. Using the domain of image classification as a testing ground, we describe how CAVs may be used to explore hypotheses and generate insights for a standard image classification network as well as a medical application.
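
The abstract names the ingredients (a concept vector in activation space and a directional derivative of the class prediction along it) without giving the procedure. The sketch below is one minimal way to make that concrete, under stated assumptions: the activations are synthetic NumPy arrays rather than outputs of a real network, the CAV is taken as the unit normal of a logistic-regression separator fit by plain gradient descent between concept and random activations, and the score is the fraction of class examples whose directional derivative along the CAV is positive. The names `fit_cav`, `tcav_score`, and `toy_logit_grad` are hypothetical, and the toy head stands in for the layers of a trained classifier above the chosen layer.

```python
import numpy as np


def fit_cav(concept_acts, random_acts, steps=500, lr=0.1):
    """Fit a linear separator between concept and random activations.

    Returns the unit normal of the decision boundary, pointing toward
    the concept examples. Plain gradient descent on the logistic loss;
    an assumption made to keep the sketch dependency-free.
    """
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(concept)
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w / np.linalg.norm(w)


def tcav_score(grad_fn, class_acts, cav):
    """Fraction of class examples whose logit increases along the CAV."""
    derivs = np.array([grad_fn(a) @ cav for a in class_acts])
    return float(np.mean(derivs > 0))


# Synthetic setup (hypothetical): 8-dimensional activations in which the
# concept shifts activations along the first axis.
rng = np.random.default_rng(0)
dim = 8
concept_dir = np.zeros(dim)
concept_dir[0] = 1.0
concept_acts = rng.normal(size=(100, dim)) + 3.0 * concept_dir
random_acts = rng.normal(size=(100, dim))
class_acts = rng.normal(size=(50, dim))

# Toy differentiable head: logit(a) = V . tanh(a), so its gradient with
# respect to the activations is V * (1 - tanh(a)^2).
V = 2.0 * concept_dir


def toy_logit_grad(a):
    """Analytic gradient of the toy class logit with respect to activations."""
    return V * (1.0 - np.tanh(a) ** 2)


cav = fit_cav(concept_acts, random_acts)
score = tcav_score(toy_logit_grad, class_acts, cav)
print("TCAV-style score:", score)
```

In a real setting the activations would come from a chosen layer of the trained network and `grad_fn` would be obtained by automatic differentiation of the class logit with respect to that layer; the toy head here only makes the directional-derivative step explicit.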
