An oversampling framework for imbalanced classification based on Laplacian eigenmaps
KEYWORDS
Oversampling; Benchmarking
DOI: 10.1016/j.neucom.2020.02.081
Publication Date: 2020-02-25
AUTHORS (4)
ABSTRACT
Imbalanced classification is a challenging problem in machine learning and data mining. Oversampling methods such as the Synthetic Minority Oversampling Technique (SMOTE) generate synthetic data to balance the classes for imbalanced classification. However, such oversampling methods generate unnecessary noise when the classes are not well separated. Moreover, many applications have inadequate training data but vast amounts of testing data, which makes imbalanced classification even more challenging. In this paper, we propose a novel oversampling framework with two objectives: (1) improving the classification results of SMOTE-based oversampling methods; and (2) making SMOTE-based oversampling methods applicable when the training data are inadequate. The proposed framework uses Laplacian eigenmaps to find an optimal low-dimensional space in which the data are well separated, so that the noise generated by SMOTE-based oversampling methods can be avoided. The construction of the graph Laplacian not only exploits useful information from the unlabeled testing data to facilitate imbalanced learning, but also makes the learning process incremental. Experimental results on several benchmark datasets demonstrate the effectiveness of the proposed framework.
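The pipeline the abstract describes (embed all data with Laplacian eigenmaps, then run SMOTE-style interpolation on the minority class in the embedded space) can be sketched as follows. This is a minimal illustration assuming NumPy; the function names `laplacian_eigenmaps` and `smote_like` are hypothetical, and the code is a generic rendering of the two ingredients, not the paper's actual implementation.

```python
import numpy as np

def laplacian_eigenmaps(X, n_components=2, k=5):
    """Embed X into n_components dimensions via a k-NN graph Laplacian."""
    n = len(X)
    # Pairwise Euclidean distances.
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    # Symmetrized k-nearest-neighbor adjacency matrix (unweighted).
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]  # skip self at index 0
        W[i, nbrs] = 1.0
    W = np.maximum(W, W.T)
    deg = W.sum(axis=1)
    L = np.diag(deg) - W                  # unnormalized graph Laplacian
    # Solve the generalized problem L v = lambda Deg v via the
    # symmetrically normalized Laplacian Deg^{-1/2} L Deg^{-1/2}.
    Dm = np.diag(1.0 / np.sqrt(deg + 1e-12))
    vals, vecs = np.linalg.eigh(Dm @ L @ Dm)
    # Drop the trivial constant eigenvector, keep the next n_components.
    return Dm @ vecs[:, 1:1 + n_components]

def smote_like(X_min, n_new, k=3, rng=None):
    """Generate n_new synthetic minority points by SMOTE-style
    interpolation between a point and one of its k nearest neighbors."""
    rng = np.random.default_rng(rng)
    D = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=2)
    new = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        nbrs = np.argsort(D[i])[1:k + 1]
        j = rng.choice(nbrs)
        t = rng.random()                  # interpolation factor in [0, 1)
        new.append(X_min[i] + t * (X_min[j] - X_min[i]))
    return np.array(new)
```

In this sketch, the graph (and hence the embedding) would be built over labeled training points together with the unlabeled testing points, which is how the framework draws information from the test distribution; the synthetic minority samples are then generated only in the embedded space, where the classes are better separated.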