Coverage-based resampling: Building robust consolidated decision trees

Resampling Robustness
DOI: 10.1016/j.knosys.2014.12.023 Publication Date: 2015-01-09T12:22:10Z
HIGHLIGHTS
Coverage-based resampling determines the number of samples to be used based on the dataset's class distribution.
Consolidated trees achieve better results for most performance measures when using higher coverage values.
CTC ranks first against multiple genetics-based and classical algorithms for rule induction.
CTC combined with SMOTE tops state-of-the-art techniques designed to tackle class imbalance.
ABSTRACT
The class imbalance problem has recently attracted a lot of attention from the data mining community and has become a current trend in machine learning research. The Consolidated Tree Construction (CTC) algorithm was proposed to solve classification problems involving a high degree of class imbalance without losing explaining capacity, a desirable characteristic of single decision trees and rule sets. CTC works by resampling the training sample and building a tree from each subsample, in a similar manner to ensemble classifiers, but it applies the ensemble process during the tree construction phase, resulting in a single final tree. At the ECML/PKDD 2013 conference the term "Inner Ensembles" was coined to refer to such methodologies. In this paper we propose a resampling strategy for classification algorithms that use multiple subsamples. The strategy is based on the class distribution of the training sample and ensures a minimum representation of all classes when resampling. It has been applied to CTC over different classification contexts. A robust classification algorithm should not just rank in the top positions for certain classification problems but should excel when faced with a broad range of problems. In this paper we establish the robustness of the CTC algorithm against a wide set of classification algorithms with explaining capacity.
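The abstract does not give the formula that links coverage to the number of subsamples, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes each subsample is class-balanced, drawing as many examples from every class as the smallest class contains, uniformly without replacement and independently across subsamples. Under that assumption an example of the largest class is picked with probability r = minority / majority in each subsample, and it appears in at least one of n subsamples with probability 1 - (1 - r)^n. The function name `subsamples_for_coverage` and its parameters are hypothetical.

```python
import math
from collections import Counter


def subsamples_for_coverage(labels, coverage):
    """Smallest number of balanced subsamples such that each majority-class
    example is expected to appear in at least one subsample with
    probability >= coverage.

    Assumption (not taken from the paper): every subsample draws
    `minority` examples from each class, uniformly without replacement,
    and subsamples are drawn independently of each other.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")

    counts = Counter(labels)
    minority = min(counts.values())
    majority = max(counts.values())

    # Per-subsample probability that a given majority example is selected.
    r = minority / majority
    if r >= 1.0:
        # Already balanced: one subsample contains every example.
        return 1

    # Solve 1 - (1 - r)^n >= coverage for the smallest integer n.
    return math.ceil(math.log(1.0 - coverage) / math.log(1.0 - r))


if __name__ == "__main__":
    # Illustrative labels only: 10 positive and 90 negative examples.
    labels = ["pos"] * 10 + ["neg"] * 90
    for c in (0.5, 0.9, 0.99):
        print(c, subsamples_for_coverage(labels, c))
```

In this sketch the number of subsamples grows as the class distribution becomes more skewed or as the requested coverage rises, which is the qualitative behaviour the highlights describe; the exact counts depend entirely on the sampling assumption stated above.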