Robust proportional overlapping analysis for feature selection in binary classification within functional genomic experiments
0301 basic medicine
Artificial intelligence
Support vector machine
Symbolic Regression
Robustness (evolution)
Pattern recognition (psychology)
Gene
Agricultural and Biological Sciences
03 medical and health sciences
Artificial Intelligence
Biochemistry, Genetics and Molecular Biology
Microarray Data Analysis and Gene Expression Profiling
FOS: Mathematics
Genetics
Feature Selection
Viral Diseases in Livestock and Poultry
Binary classification
Molecular Biology
Biology
Life Sciences
Discriminative model
QA75.5-76.95
Computer science
Overlapping analysis
Functional genomic
Algorithms and Analysis of Algorithms
Electronic computers. Computer science
Application of Genetic Programming in Machine Learning
FOS: Biological sciences
Computer Science
Physical Sciences
Feature selection
Animal Science and Zoology
Mathematics
Random forest
DOI: 10.7717/peerj-cs.562
Publication Date: 2021-06-01T09:46:26Z
AUTHORS (7)
ABSTRACT
This paper proposes the Robust Proportional Overlapping Score (RPOS), a novel feature selection method for microarray gene expression datasets that relies on a robust measure of dispersion, the Median Absolute Deviation (MAD). The method identifies the most discriminative genes for binary class problems by scoring the overlap of gene expression values between the two classes. Genes with a high degree of overlap between classes are discarded, and those that discriminate between the classes are selected. The proposed method is compared with five state-of-the-art gene selection methods on eleven gene expression datasets in terms of classification error, Brier score, and sensitivity. Observations are classified, for different sets of genes selected by the proposed method, with three classifiers: random forest, k-nearest neighbors (k-NN), and support vector machine (SVM). Box plots and stability scores of the results are also reported. The results show that the proposed method outperforms the other methods in most cases.
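The idea of scoring genes by how much their class-wise expression ranges overlap can be illustrated with a small sketch. This is not the RPOS procedure as published, whose exact definition is not given in the abstract. It is a simplified stand-in that assumes each class is summarised per gene by the interval median ± k·MAD, and that the score is the shared length of the two intervals divided by their combined span. The function names (`mad`, `core_interval`, `overlap_scores`, `select_genes`), the parameter `k`, and the toy data are all hypothetical.

```python
import numpy as np

def mad(x, axis=0):
    """Unscaled median absolute deviation along the given axis."""
    med = np.median(x, axis=axis, keepdims=True)
    return np.median(np.abs(x - med), axis=axis)

def core_interval(x, k=1.0):
    """Per-gene interval [median - k*MAD, median + k*MAD]; x is samples x genes."""
    med = np.median(x, axis=0)
    spread = k * mad(x, axis=0)
    return med - spread, med + spread

def overlap_scores(X, y, k=1.0):
    """Assumed score: shared interval length / combined span, per gene (0 = no overlap)."""
    lo0, hi0 = core_interval(X[y == 0], k)
    lo1, hi1 = core_interval(X[y == 1], k)
    # Length of the intersection of the two class intervals (0 if disjoint).
    shared = np.clip(np.minimum(hi0, hi1) - np.maximum(lo0, lo1), 0.0, None)
    # Length of the smallest interval covering both class intervals.
    span = np.maximum(hi0, hi1) - np.minimum(lo0, lo1)
    # If both intervals collapse to the same point, treat the gene as fully overlapping.
    return np.divide(shared, span, out=np.ones_like(span), where=span > 0)

def select_genes(X, y, n_genes, k=1.0):
    """Indices of the n_genes columns with the lowest overlap score."""
    scores = overlap_scores(X, y, k)
    return np.argsort(scores, kind="stable")[:n_genes]

# Toy data (made up): gene 0 separates the classes, gene 1 does not.
X = np.array([[0.0, 5.0], [1.0, 6.0], [2.0, 7.0],
              [10.0, 5.0], [11.0, 6.0], [12.0, 7.0]])
y = np.array([0, 0, 0, 1, 1, 1])
scores = overlap_scores(X, y)
selected = select_genes(X, y, n_genes=1)
```

Using the median and MAD rather than the mean and standard deviation keeps the class intervals from being stretched by a few outlying expression values, which is the robustness motivation the abstract describes. The selected columns could then be passed to any of the classifiers named above.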
REFERENCES (59)
CITATIONS (13)