Comparison and development of machine learning tools for the prediction of chronic obstructive pulmonary disease in the Chinese population

Keywords: COPD; Machine learning tools; SNP; AQCI; Allele frequencies; Polymorphism, Single Nucleotide; Pulmonary Disease, Chronic Obstructive; Machine Learning; China; Humans
DOI: 10.1186/s12967-020-02312-0
Publication Date: 2020-04-02T13:32:12Z
ABSTRACT
Chronic obstructive pulmonary disease (COPD) is a major public health problem and cause of mortality worldwide. However, COPD in the early stage is usually not recognized and diagnosed. It is necessary to establish a risk model to predict COPD development.

A total of 441 COPD patients and 192 control subjects were recruited, and 101 single-nucleotide polymorphisms (SNPs) were determined using the MassArray assay. With 5 clinical features as well as SNPs, 6 predictive models were established and evaluated in the training set and test set by the confusion matrix, AU-ROC, AU-PRC, sensitivity (recall), specificity, accuracy, F1 score, MCC, PPV (precision) and NPV. The selected features were ranked.

Nine SNPs were significantly associated with COPD. Among them, 6 SNPs (rs1007052, OR = 1.671, P = 0.010; rs2910164, OR = 1.416, P < 0.037; rs473892, OR = 1.473, P 0.044; rs161976, OR = 1.594, P [...]; rs159497, OR = 1.445, P 0.045; rs9296092, OR = 1.832, P 0.045) were risk factors for COPD, while 3 SNPs (rs8192288, OR = 0.593, P 0.015; rs20541, OR = 0.669, P 0.018; rs12922394, OR = 0.651, P 0.022) were protective factors for COPD development. In the training set, KNN, LR, SVM, DT and XGboost obtained AU-ROC values above 0.82 and AU-PRC values above 0.92. Among these models, [...] obtained the highest AU-ROC (0.94), AU-PRC (0.97), accuracy (0.91), precision (0.95), F1 score [...], MCC (0.77) and specificity (0.85), while MLP obtained the highest sensitivity (recall) (0.99) and NPV (0.87). In the validation set, LR [...] 0.80 and 0.85, respectively. KNN had the highest [...] (0.82), both [...] had the same [...] (0.81), and [...] (0.86). Both [...] 0.94 and 0.84, [...]. In the feature importance analyses, we identified that AQCI, age, and BMI had the greatest impact on the predictive abilities of the models, while sex and smoking were less important.

The [...] models showed excellent overall predictive power, and the use of machine learning tools combining clinical and SNP features was suitable for predicting COPD.
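
The threshold-based metrics named in the abstract (sensitivity, specificity, accuracy, F1 score, MCC, PPV and NPV) can all be derived from the four cells of a confusion matrix. The following is a minimal sketch of those standard definitions in Python, not the authors' code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the study. AU-ROC and AU-PRC are not covered, since they require predicted scores rather than counts.

```python
import math


def _ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute confusion-matrix metrics for a binary classifier.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives and false negatives.
    """
    sensitivity = _ratio(tp, tp + fn)   # recall
    specificity = _ratio(tn, tn + fp)
    ppv = _ratio(tp, tp + fp)           # precision
    npv = _ratio(tn, tn + fn)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    f1 = _ratio(2 * tp, 2 * tp + fp + fn)
    # Matthews correlation coefficient
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_den)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not data from the study).
m = binary_metrics(tp=80, fp=10, tn=40, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

MCC is worth reporting alongside accuracy when the classes are imbalanced, as they are here (441 patients versus 192 controls), because it uses all four cells of the matrix.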