Accelerating the identification of the allergenic potential of plant proteins using a stacked ensemble-learning framework
DOI: 10.1080/07391102.2024.2318482
Publication Date: 2024-02-22T12:32:41Z
AUTHORS (4)
ABSTRACT
Plant-allergenic proteins (PAPs) have the potential to induce allergic reactions in certain individuals. While these proteins are generally innocuous for the majority of people, they can elicit an immune response in those with particular sensitivities. Thus, screening and prioritizing the allergenic potential of plant proteins is indispensable for the development of diagnostic tools, therapeutic interventions or medications to treat allergic reactions. However, investigating the allergenic potential of plant proteins based on experimental methods is costly and labour-intensive. Therefore, we develop StackPAP, a three-layer stacking ensemble framework for the accurate and large-scale identification of PAPs. In StackPAP, at the first layer, we conducted a comprehensive analysis of an extensive set of feature descriptors. Subsequently, we selected and fused five sequence-based feature descriptors, including amphiphilic pseudo-amino acid composition, dipeptide deviation from expected mean, amino acid composition, pseudo amino acid composition and dipeptide composition. Additionally, we applied an efficient genetic algorithm (GA-SAR) to determine informative feature sets. In the second layer, 12 powerful machine learning (ML) methods, in combination with all of the informative feature sets, were employed to construct a pool of base classifiers. Finally, 13 base classifiers were selected using the GA-SAR method and combined to develop the final meta-classifier. Our results revealed that StackPAP achieved promising prediction performance, with an accuracy, Matthews correlation coefficient and AUC of 0.984, 0.969 and 0.993, respectively, as judged by the independent test dataset. In conclusion, both the cross-validation and independent test results indicated the superior performance of StackPAP compared with several ML-based classifiers. To accelerate the identification of the allergenicity of plant proteins, we developed a user-friendly web server for StackPAP (https://pmlabqsar.pythonanywhere.com/StackPAP). We anticipate that StackPAP will be a useful tool for rapidly screening PAPs from a vast number of plant proteins.
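To make two of the ingredients named in the abstract concrete, the following minimal Python sketch computes amino acid composition and dipeptide composition for a sequence, fuses them by simple concatenation, and evaluates binary predictions with accuracy and the Matthews correlation coefficient. It uses the standard textbook definitions of these quantities and is not the StackPAP implementation: the function names (`aac`, `dpc`, `fuse`, `accuracy_and_mcc`), the concatenation-based fusion, and the toy inputs are assumptions made for illustration only.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Non-standard residues (e.g. X) count toward the length but not a bin.
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: fraction of each adjacent residue pair (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must have at least two residues")
    counts = {}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        counts[pair] = counts.get(pair, 0) + 1
    total = len(seq) - 1
    return [counts.get(a + b, 0) / total
            for a, b in product(AMINO_ACIDS, repeat=2)]


def fuse(seq):
    """Assumed fusion strategy: concatenate descriptors into one vector."""
    return aac(seq) + dpc(seq)


def accuracy_and_mcc(y_true, y_pred):
    """Accuracy and Matthews correlation coefficient for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


if __name__ == "__main__":
    features = fuse("ACAC")  # toy sequence, not a real protein
    print(len(features))     # 20 AAC values followed by 400 DPC values
    print(accuracy_and_mcc([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a stacking setup such as the one the abstract describes, vectors like `fuse(seq)` would be the inputs to the base classifiers, whose outputs in turn feed a meta-classifier; that part is omitted here because the abstract does not specify the individual learners.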
REFERENCES (75)
CITATIONS (3)