Application of data mining methods to improve screening for the risk of early gastric cancer

Early gastric cancer; China; Research; Computer applications to medicine. Medical informatics; R858-859.7; C5.0 decision tree; Logistic regression; Risk Assessment; 3. Good health; 03 medical and health sciences; Logistic Models; 0302 clinical medicine; Stomach Neoplasms; Multilayer perceptron; Data Mining; Humans; Neural Networks, Computer; SMOTE; Early Detection of Cancer; Tree augmented naive Bayesian network
DOI: 10.1186/s12911-018-0689-4
Publication Date: 2018-12-07T09:41:22Z
ABSTRACT
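Although gastric cancer is a malignancy with high morbidity and mortality in China, the survival rate of patients with early gastric cancer (EGC) is high after surgical resection. Strengthening diagnosis and screening is key to improving the survival and quality of life of patients with EGC. This study applied data mining methods to improve screening for the risk of EGC on the basis of noninvasive factors, and displayed important influence factors for the risk of EGC. The dataset was derived from a project of the First Hospital Affiliated to Guangdong Pharmaceutical University. A series of questionnaire surveys, serological examinations, and endoscopy plus pathology biopsy were conducted in 618 patients with gastric diseases. Their risk of EGC was categorized into low and high risk by the results of endoscopy plus pathology biopsy. The synthetic minority oversampling technique (SMOTE) was used to solve the imbalance of categories. Four classification models of the risk of EGC were established, including logistic regression (LR) and three data mining algorithms. The three data mining models had higher accuracy than the LR model. The gain curves of the three data mining models were convex and closer to the ideal curves than that of the LR model. The AUCs of the three data mining models were larger than that of the LR model as well. The three data mining models predicted the risk of EGC more effectively in comparison with the LR model. Moreover, this study found 16 important influence factors for the risk of EGC, such as occupations, Helicobacter pylori infection, drinking hot water, and so on. The three data mining models have optimal predictive behaviors over the LR model, and therefore can effectively evaluate the risk of EGC and assist clinicians in improving the diagnosis and screening of EGC. Sixteen important influence factors for the risk of EGC were illustrated, which may help to assess gastric carcinogenesis and serve as a reminder for early prevention and early detection of gastric cancer. This study may also be conducive to clinical researchers in selecting and conducting the optimal predictive models.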
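The abstract names SMOTE as the remedy for class imbalance but does not say which implementation or parameters were used. The following is only a minimal sketch of the core idea, interpolating between a minority-class sample and one of its nearest minority-class neighbours. The function name `smote`, the parameters `k` and `seed`, and the toy feature rows are all assumptions made for illustration; none of them come from the study.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority-class samples.

    Illustrative sketch only: a brute-force neighbour search over a small list
    of numeric feature vectors, not the implementation used in the study.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 4 minority-class rows against 10 majority-class rows.
minority_rows = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
majority_count = 10

new_rows = smote(minority_rows, n_synthetic=majority_count - len(minority_rows), k=3)
balanced_minority = minority_rows + new_rows
```

Each synthetic row lies on a line segment between two real minority rows, so the oversampled class stays inside the region already occupied by that class. In practice, oversampling is usually applied to the training split only, so that evaluation metrics such as accuracy and AUC are computed on real cases.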