A new Centroid-Based Classification model for text categorization

Centroid; Text Categorization; Feature (linguistics)
DOI: 10.1016/j.knosys.2017.08.020 Publication Date: 2017-08-30T18:47:06Z
ABSTRACT
Automatic text categorization has gained significant attention among researchers because of the increasing availability of online information. Many different learning approaches have therefore been designed in the field. Among them, a widely used method is the Centroid-Based Classifier (CBC), owing to its theoretical simplicity and computational efficiency. However, the classification accuracy of CBC depends greatly on the data distribution, so it yields a misfit model and poor performance when the distribution is highly skewed. In this paper, a new model named the Gravitation Model (GM) is proposed to solve the class-imbalanced problem. In the training phase, each class is weighted by a mass factor, which can be learned from the data, to indicate the data distribution of the corresponding class. In the testing phase, a document is assigned to the class that exerts the maximum gravitational force on it. Comparisons with variants of CBC, based on the results of experiments conducted on twelve real datasets, show that the gravitation model consistently outperforms CBC as well as Class-Feature-Centroid (CFC). The gravitation model also obtains performance competitive with DragPushing (DP) while maintaining more stable performance. Thus, the gravitation model is shown to be less prone to over-fitting and to have higher ability than the CBC model.
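The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives neither the force formula nor the rule for learning mass factors, so the sketch assumes the force is the class mass multiplied by the cosine similarity to the class centroid, and it takes the masses as hand-picked inputs rather than learning them. The names `train_centroids`, `classify`, and `masses` are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def cosine(a, b):
    """Cosine similarity between two term-weight vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def train_centroids(docs, labels):
    """Plain CBC training: one normalized centroid per class."""
    sums = {}
    for vec, label in zip(docs, labels):
        unit = normalize(vec)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
    return {label: normalize(total) for label, total in sums.items()}


def classify(doc, centroids, masses=None):
    """Assign doc to the class with the largest force.

    ASSUMPTION: force = mass * cosine similarity. With masses=None every
    class has mass 1.0, which reduces to an ordinary centroid classifier.
    """
    def force(label):
        mass = 1.0 if masses is None else masses[label]
        return mass * cosine(doc, centroids[label])
    return max(centroids, key=force)


# Skewed toy data: three "major" documents, one "minor" document.
docs = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.2], [0.0, 1.0]]
labels = ["major", "major", "major", "minor"]
centroids = train_centroids(docs, labels)

query = [1.0, 0.9]
masses = {"major": 1.0, "minor": 1.3}  # hand-picked, not learned

plain_label = classify(query, centroids)             # "major"
weighted_label = classify(query, centroids, masses)  # "minor"
```

With equal masses the query falls to the majority class; raising the mass of the minority class moves the decision boundary and flips the assignment. That boundary shift is the effect the mass factor is meant to provide, although in the paper the masses are learned from the training data rather than set by hand.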