Systematic benchmarking of microarray data classification: assessing the role of non-linearity and dimensionality reduction
Overfitting
Kernel (algebra)
Benchmarking
Linear classifier
DOI: 10.1093/bioinformatics/bth383
Publication Date: 2004-07-02T00:24:20Z
AUTHORS (4)
ABSTRACT
Motivation: Microarrays are capable of determining the expression levels of thousands of genes simultaneously. In combination with classification methods, this technology can be useful to support clinical management decisions for individual patients, e.g. in oncology. The aim of this paper is to systematically benchmark the role of non-linear versus linear techniques and dimensionality reduction methods.
Results: A systematic benchmarking study is performed by comparing linear versions of standard classification and dimensionality reduction techniques with their non-linear versions based on non-linear kernel functions with a radial basis function (RBF) kernel. A total of 9 binary cancer classification problems, derived from 7 publicly available microarray datasets, and 20 randomizations of each problem are examined.
Conclusions: Three main conclusions can be formulated based on the performances on independent test sets. (1) When performing classification with least squares support vector machines (LS-SVMs) (without dimensionality reduction), RBF kernels can be used without risking too much overfitting. The results obtained with well-tuned RBF kernels are never worse and sometimes even statistically significantly better compared to results obtained with a linear kernel in terms of test set receiver operating characteristic and test set accuracy performances. (2) Even for classification with linear classifiers like LS-SVM with a linear kernel, using regularization is very important. (3) When performing kernel principal component analysis (kernel PCA) before classification, using an RBF kernel for kernel PCA tends to result in overfitting, especially when using supervised feature selection. It has been observed that an optimal selection of a large number of features is often an indication of overfitting. Kernel PCA with a linear kernel gives better results.
Availability: Matlab scripts are available on request.
Supplementary information: http://www.esat.kuleuven.ac.be/~npochet/Bioinformatics/
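To make the linear-versus-RBF comparison in conclusion (1) concrete, here is a minimal sketch of an LS-SVM classifier in Python with NumPy. It is not the authors' Matlab code. It assumes the standard LS-SVM formulation, in which training reduces to solving one linear system, and every name in it (`linear_kernel`, `rbf_kernel`, `lssvm_fit`, `lssvm_predict`, `gamma`, `sigma2`) is hypothetical. The four-point XOR data set is a toy stand-in for microarray data. It only illustrates the mechanics of swapping kernels and of the regularization constant `gamma`.

```python
import numpy as np

def linear_kernel(A, B):
    # K(a, b) = a . b
    return A @ B.T

def rbf_kernel(A, B, sigma2=1.0):
    # K(a, b) = exp(-||a - b||^2 / sigma2); sigma2 is an assumed kernel width
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / sigma2)

def lssvm_fit(X, y, kernel, gamma=1.0):
    """Solve the LS-SVM system
        [ 0   y^T             ] [ b     ]   [ 0 ]
        [ y   Omega + I/gamma ] [ alpha ] = [ 1 ]
    with Omega_ij = y_i * y_j * K(x_i, x_j). Larger gamma = less regularization.
    """
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * kernel(X, X) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, support values alpha

def lssvm_predict(X_train, y_train, b, alpha, kernel, X_new):
    # Decision rule: sign( sum_i alpha_i * y_i * K(x, x_i) + b )
    scores = kernel(X_new, X_train) @ (alpha * y_train) + b
    return np.where(scores >= 0, 1.0, -1.0)

# Toy XOR problem (labels in {-1, +1}); not linearly separable.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

for name, kernel in [("linear", linear_kernel), ("rbf", rbf_kernel)]:
    b, alpha = lssvm_fit(X, y, kernel, gamma=10.0)
    pred = lssvm_predict(X, y, b, alpha, kernel, X)
    print(name, "training accuracy:", np.mean(pred == y))
```

The loop reports training accuracy only. A benchmark like the one the abstract describes would instead tune `gamma` and `sigma2` on the training data and report performance on an independent test set, repeated over randomized splits.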
CITATIONS (163)