Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift

Keywords: Benchmark (surveying), Uncertainty Quantification, Predictive power
DOI: 10.48550/arxiv.1906.02530 Publication Date: 2019-01-01
ABSTRACT
Modern machine learning methods, including deep learning, have achieved great success in predictive accuracy for supervised learning tasks, but may still fall short in giving useful estimates of their predictive uncertainty. Quantifying uncertainty is especially critical in real-world settings, which often involve input distributions that are shifted from the training distribution due to a variety of factors, including sample bias and non-stationarity. In such settings, well-calibrated uncertainty estimates convey information about when a model's output should (or should not) be trusted. Many probabilistic deep learning methods, both Bayesian and non-Bayesian, have been proposed in the literature for quantifying predictive uncertainty, but to our knowledge there has not previously been a rigorous large-scale empirical comparison of these methods under dataset shift. We present a large-scale benchmark of existing state-of-the-art methods on classification problems and investigate the effect of dataset shift on accuracy and calibration. We find that traditional post-hoc calibration does indeed fall short, as do several other previous methods. However, some methods that marginalize over models give surprisingly strong results across a broad spectrum of tasks.
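
The benchmark centers on how calibration degrades as test inputs shift away from the training distribution, comparing post-hoc calibration against methods that marginalize over models (such as ensembles). As a rough, hypothetical NumPy sketch (not the paper's released code; the function names and usage are illustrative assumptions), the snippet below shows the probability averaging behind "marginalizing over models" and the Expected Calibration Error, a standard calibration metric in this setting:

    import numpy as np

    def ensemble_predict(member_probs):
        # Marginalize over models: average the predictive distributions of
        # M independently trained members.
        # member_probs: (M, N, K) array -> returns (N, K) averaged probabilities.
        return np.mean(np.asarray(member_probs), axis=0)

    def expected_calibration_error(probs, labels, n_bins=10):
        # Bin predictions by confidence and average the |accuracy - confidence|
        # gap, weighted by the fraction of examples in each bin.
        # probs: (N, K) predicted class probabilities; labels: (N,) integer labels.
        probs = np.asarray(probs)
        labels = np.asarray(labels)
        confidences = probs.max(axis=1)
        predictions = probs.argmax(axis=1)
        accuracies = (predictions == labels).astype(float)

        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            in_bin = (confidences > lo) & (confidences <= hi)
            weight = in_bin.mean()
            if weight > 0:
                ece += weight * abs(accuracies[in_bin].mean() - confidences[in_bin].mean())
        return ece

In a shift study of this kind, the same metric would be computed on an in-distribution test set and on increasingly shifted test sets (for example, corrupted images), with ensemble_predict applied to the members' probabilities before scoring; a well-behaved method keeps ECE low as shift increases.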