The Best Way to Select Features? Comparing MDA, LIME, and SHAP

DOI: 10.3905/jfds.2020.1.047 · Publication date: 2020-12-04
ABSTRACT
Feature selection in machine learning is subject to the intrinsic randomness of the feature selection algorithms (e.g., random permutations during MDA). Stability of the selected features with respect to such randomness is essential for the human interpretability of a machine learning algorithm. The authors propose a rank-based stability metric called the <i>instability index</i> to compare the stabilities of three feature selection algorithms—MDA, LIME, and SHAP—as applied to random forests. Typically, features are selected by averaging the importance scores over many iterations of an algorithm. Although this variability does decrease as the number of iterations increases, it does not go to zero, and the selected features do not necessarily converge to the same set. LIME and SHAP are found to be more stable than MDA, at least for the top-ranked features; hence, they are, overall, better suited for interpretability. However, the feature set selected by each of the three algorithms significantly improves various predictive metrics out of sample, and their performances do not differ significantly. Experiments were conducted on synthetic datasets, two public benchmark datasets, an S&amp;P 500 dataset, and proprietary data from an active investment strategy.

<b>TOPICS:</b> Big data/machine learning, simulations, statistical methods

<b>Key Findings</b>
▪ A novel rank-based instability index is proposed to measure the stability of the MDA, LIME, and SHAP feature selection algorithms.
▪ LIME and SHAP are found to be more stable than MDA for features with high importance scores.
▪ The out-of-sample benefit of the selected features is exemplified by a trading strategy, which improves in both Sharpe ratio and cumulative return.
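The abstract's central point is that a feature importance algorithm such as MDA (mean decrease in accuracy, i.e., permutation importance) produces different feature rankings on different random runs, and that this rank variability can be quantified. The paper's exact definition of the instability index is not reproduced on this page, so the following is only a minimal sketch of the idea, assuming a synthetic dataset and one plausible rank-dispersion measure (mean absolute deviation of each feature's rank across iterations): repeated MDA runs with different permutation seeds yield a matrix of ranks, whose spread serves as an instability score.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic data with 5 informative features out of 10 (an assumed
# setup for illustration, not one of the paper's actual datasets).
X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=5, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def feature_ranks(seed):
    """One MDA iteration: permutation importance -> rank vector
    (rank 0 = most important feature)."""
    imp = permutation_importance(rf, X, y, n_repeats=1,
                                 random_state=seed).importances_mean
    return np.argsort(np.argsort(-imp))

# Repeat MDA with different permutation seeds; ranks vary run to run.
ranks = np.array([feature_ranks(s) for s in range(20)])

# A plausible rank-based instability measure (an assumption, not the
# paper's exact formula): mean absolute deviation of each feature's
# rank across iterations, averaged over all features.
instability = np.mean(np.abs(ranks - ranks.mean(axis=0)))
print(f"instability index ~ {instability:.3f}")
```

Running the same loop with a stabler attribution method (e.g., SHAP values from the `shap` package) and comparing the resulting indices mirrors the paper's comparison across MDA, LIME, and SHAP.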