Quartet Based Gene Tree Imputation Using Deep Learning Improves Phylogenomic Analyses Despite Missing Data
DOI: 10.1089/cmb.2022.0212
Publication Date: 2022-09-01T16:06:45Z
ABSTRACT
Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. However, for a combination of reasons (ranging from sampling biases to more biological causes, as in gene birth and loss), gene trees are often incomplete, meaning that not all species of interest have a common set of genes. Incomplete gene trees can potentially impact the accuracy of phylogenomic inference. We, for the first time, introduce the problem of imputing the quartet distribution induced by a set of incomplete gene trees, which involves adding the missing quartets back to the quartet distribution. We present Quartet based Gene tree Imputation using Deep Learning (QT-GILD), an automated and specially tailored unsupervised deep learning technique, accompanied by cues from natural language processing, which learns the quartet distribution in a given set of incomplete gene trees and generates a complete set of quartets accordingly. QT-GILD is a general-purpose technique needing no explicit modeling of the subject system or of the reasons for missing data or gene tree heterogeneity. Experimental studies on a collection of simulated and empirical datasets suggest that QT-GILD can effectively impute the quartet distribution, which results in a dramatic improvement in species tree accuracy. Remarkably, QT-GILD not only imputes the missing quartets but can also account for gene tree estimation error. Therefore, QT-GILD advances the state-of-the-art in species tree estimation from gene trees in the face of missing data.
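To make the notion of a quartet distribution induced by incomplete gene trees concrete, the following is a minimal Python sketch. It is not the QT-GILD implementation and performs no imputation; it only counts the quartet topologies found in a toy set of gene trees and lists the four-taxon sets that no gene tree covers, which are the entries an imputation method would have to fill in. The nested-tuple tree encoding, the function names, and the example trees are all assumptions made for illustration, and the trees are assumed to be binary.

```python
from collections import Counter
from itertools import combinations


def clades(tree):
    """Return (leaf set, list of leaf sets of every internal node) for a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found


def induced_quartets(tree):
    """Map each four-taxon subset of the tree's leaves to its topology ab|cd.

    A topology is stored as a frozenset holding the two sister pairs.
    """
    leaves, all_clades = clades(tree)
    result = {}
    for four in combinations(sorted(leaves), 4):
        four = frozenset(four)
        for clade in all_clades:
            inside = four & clade
            if len(inside) == 2:  # this clade separates two taxa from the other two
                result[four] = frozenset([inside, four - inside])
                break
    return result


def quartet_distribution(trees):
    """Count how many gene trees induce each quartet topology."""
    counts = Counter()
    for tree in trees:
        counts.update(induced_quartets(tree).values())
    return counts


def missing_taxon_sets(trees, taxa):
    """List four-taxon sets that appear in no gene tree at all."""
    covered = {frozenset().union(*topology) for topology in quartet_distribution(trees)}
    return [set(q) for q in combinations(sorted(taxa), 4) if frozenset(q) not in covered]


# Hypothetical toy data: five taxa, and every gene tree is missing one of them.
taxa = {"A", "B", "C", "D", "E"}
trees = [
    (("A", "B"), ("C", "D")),  # lacks E
    (("A", "B"), ("C", "E")),  # lacks D
    (("A", "C"), ("B", "D")),  # lacks E
]

dist = quartet_distribution(trees)
missing = missing_taxon_sets(trees, taxa)

for topology, count in dist.items():
    left, right = (sorted(pair) for pair in sorted(topology, key=sorted))
    print(f"{''.join(left)}|{''.join(right)}: {count}")
print("four-taxon sets with no quartet:", [sorted(q) for q in missing])
```

In this toy input, only two of the five possible four-taxon sets are covered by any gene tree; the remaining three have no quartet at all and are what the abstract refers to as missing quartets.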