An ensemble-based feature selection framework to select risk factors of childhood obesity for policy decision making
Interpretability
Discriminative model
Feature (linguistics)
Lasso
Ensemble Learning
DOI: 10.1186/s12911-021-01580-0
Publication Date: 2021-07-21T04:02:58Z
AUTHORS (7)
ABSTRACT
Background: The increasing prevalence of childhood obesity makes it essential to study the risk factors with a sample representative of the population, covering more health topics, for better preventive policies and interventions. The aim is to develop an ensemble feature selection framework for large-scale data to identify risk factors of childhood obesity with good interpretability and clinical relevance.
Methods: We analyzed data collected from 426,813 children under 18 during 2000–2019. A BMI above the 90th percentile for children of the same age and gender was defined as overweight. An ensemble feature selection framework, Bagging-based Feature Selection framework integrating MapReduce (BFSMR), was proposed to identify risk factors. The framework comprises 5 models (filter with mutual information/SVM-RFE/Lasso/Ridge/Random Forest) from filter, wrapper, and embedded feature selection methods. Each feature selection model identified 10 variables based on variable importance. Considering accuracy, F-score, and model characteristics, the models were classified into 3 levels with different weights: Lasso/Ridge, Filter/SVM-RFE, and Random Forest. A voting strategy was applied to aggregate the selected features, with both feature weights and model weights taken into consideration. We compared our voting strategy with another two for selecting top-ranked features in terms of 6 dimensions of interpretability.
Results: Our method performed best at selecting features with good interpretability and clinical relevance. The top 10 features selected by BFSMR are age, sex, birth year, breastfeeding type, smoking habit and diet-related knowledge of both children and mothers, exercise, and mother's systolic blood pressure.
Conclusion: Our framework provides a solution for identifying a diverse and interpretable feature set without model bias from large-scale data, which can help identify risk factors of childhood obesity and potentially some other diseases for future interventions or policies.
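The voting step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract states only that the five models fall into three weight levels and that both feature weights and model weights enter the vote, so the numeric tier weights, the ordering of the tiers, the rank-based feature weight, and the names `MODEL_WEIGHTS` and `weighted_vote` are all assumptions made for illustration. The toy rankings are invented and are not results from the paper.

```python
from collections import defaultdict

# Hypothetical tier weights: the abstract names three model levels
# (Lasso/Ridge, Filter/SVM-RFE, Random Forest) but gives no numeric values,
# and the ordering of the tiers used here is an assumption.
MODEL_WEIGHTS = {
    "lasso": 3.0,
    "ridge": 3.0,
    "filter_mi": 2.0,
    "svm_rfe": 2.0,
    "random_forest": 1.0,
}


def weighted_vote(rankings, model_weights, top_k=10):
    """Aggregate per-model ranked feature lists into a single top_k list.

    rankings: dict mapping model name -> list of features, most important first.
    model_weights: dict mapping model name -> weight of that model's votes.
    """
    scores = defaultdict(float)
    for model, features in rankings.items():
        n = len(features)
        for rank, feature in enumerate(features):
            # Assumed feature weight: 1.0 for the top-ranked feature,
            # decreasing linearly to 1/n for the last one.
            feature_weight = (n - rank) / n
            scores[feature] += model_weights[model] * feature_weight
    # Highest score first; ties are broken alphabetically for determinism.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [feature for feature, _ in ordered[:top_k]]


if __name__ == "__main__":
    # Toy rankings, invented for illustration only.
    toy_rankings = {
        "lasso": ["age", "sex", "exercise"],
        "ridge": ["age", "exercise", "sex"],
        "filter_mi": ["sex", "age", "birth_year"],
        "svm_rfe": ["birth_year", "age", "sex"],
        "random_forest": ["exercise", "birth_year", "age"],
    }
    print(weighted_vote(toy_rankings, MODEL_WEIGHTS, top_k=3))
```

In this sketch a feature scores highly when several models rank it near the top and when those models sit in a heavier tier, which is one simple way to reduce dependence on any single model's bias.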
REFERENCES (45)
CITATIONS (11)