Sanity Checks for Saliency Metrics

FOS: Computer and information sciences Computer Science - Machine Learning Computer Vision and Pattern Recognition (cs.CV) Image and Video Processing (eess.IV) Computer Science - Computer Vision and Pattern Recognition Machine Learning (stat.ML) 02 engineering and technology Electrical Engineering and Systems Science - Image and Video Processing Machine Learning (cs.LG) Statistics - Machine Learning FOS: Electrical engineering, electronic engineering, information engineering 0202 electrical engineering, electronic engineering, information engineering
DOI: 10.1609/aaai.v34i04.6064 Publication Date: 2020-06-29T20:53:53Z
ABSTRACT
Saliency maps are a popular approach to creating post-hoc explanations of image classifier outputs. These methods produce estimates of the relevance of each pixel to the classification output score, which can be displayed as a saliency map that highlights important pixels. Despite a proliferation of such methods, little effort has been made to quantify how good these saliency maps are at capturing the true relevance of the pixels to the classifier output (i.e. their “fidelity”). We therefore investigate existing metrics for evaluating the fidelity of saliency methods (i.e. saliency metrics). We find that there is little consistency in the literature in how such metrics are calculated, and show that such inconsistencies can have a significant effect on the measured fidelity. Further, we apply measures of reliability developed in the psychometric testing literature to assess the consistency of saliency metrics when applied to individual saliency maps. Our results show that saliency metrics can be statistically unreliable and inconsistent, indicating that comparative rankings between saliency methods generated using such metrics can be untrustworthy.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES (0)
CITATIONS (69)
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....