The receiver operating characteristic curve accurately assesses imbalanced datasets
Haystack
Binary classification
Area under curve
DOI:
10.1016/j.patter.2024.100994
Publication Date:
2024-05-31T14:36:41Z
AUTHORS (6)
ABSTRACT
Many problems in biology require looking for a "needle in a haystack," corresponding to a binary classification where there are a few positives within a much larger set of negatives, which is referred to as a class imbalance. The receiver operating characteristic (ROC) curve and the associated area under the curve (AUC) have been reported as ill-suited to evaluate prediction performance on imbalanced problems where there is more interest in performance on the positive minority class, while the precision-recall (PR) curve is preferable. We show via simulation and a real case study that this is a misinterpretation of the difference between the ROC and PR spaces, showing that the ROC curve is robust to class imbalance, while the PR curve is highly sensitive to class imbalance. Furthermore, we show that class imbalance cannot be easily disentangled from classifier performance measured via PR-AUC.
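The abstract's central claim can be illustrated with a small simulation. The sketch below is not the authors' simulation or case study. It assumes a hypothetical classifier whose scores are Gaussian with unit variance, centered at 1 for positives and 0 for negatives, and it keeps that classifier fixed while changing only the number of negatives. The function names, sample sizes, and seed are illustrative choices, and PR-AUC is approximated here by average precision.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Mean of the precision values at each positive, ranked by descending score.

    Assumes at least one positive label.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    true_pos = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            true_pos += 1
            total += true_pos / rank
    return total / true_pos


def simulate(n_pos, n_neg, seed=0):
    """Draw scores for a fixed, hypothetical classifier (assumed Gaussian scores)."""
    rng = random.Random(seed)
    scores = [rng.gauss(1.0, 1.0) for _ in range(n_pos)]
    scores += [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
    labels = [1] * n_pos + [0] * n_neg
    return labels, scores


# Same classifier, same number of positives; only the number of negatives changes.
labels_bal, scores_bal = simulate(n_pos=1000, n_neg=1000)
labels_imb, scores_imb = simulate(n_pos=1000, n_neg=50000)

roc_bal = roc_auc(labels_bal, scores_bal)
roc_imb = roc_auc(labels_imb, scores_imb)
pr_bal = average_precision(labels_bal, scores_bal)
pr_imb = average_precision(labels_imb, scores_imb)

print(f"balanced:   ROC-AUC={roc_bal:.3f}  PR-AUC (AP)={pr_bal:.3f}")
print(f"imbalanced: ROC-AUC={roc_imb:.3f}  PR-AUC (AP)={pr_imb:.3f}")
```

Under these assumptions the two ROC-AUC values should be nearly identical, while the average precision drops sharply in the imbalanced setting even though the score distributions are unchanged. This is the sense in which a PR-AUC value mixes classifier quality with class prevalence.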