NJst and ASTRID are not statistically consistent under a random model of missing data

FOS: Biological sciences; 0206 medical engineering; 02 engineering and technology; 92D15 (primary); Populations and Evolution (q-bio.PE); Quantitative Biology - Populations and Evolution
DOI: 10.48550/arxiv.2001.07844 Publication Date: 2020-01-01
ABSTRACT
6 pages, no figures, provides counterexample to theorem (the first and corresponding author are both co-authors on this paper)

Species tree estimation from multi-locus datasets is statistically challenging for multiple reasons, including gene tree heterogeneity across the genome due to incomplete lineage sorting (ILS). Species tree estimation methods have been developed that operate by estimating gene trees and then using those gene trees to estimate the species tree. Several of these methods (e.g., ASTRAL, ASTRID, and NJst) are provably statistically consistent under the multi-species coalescent (MSC) model, provided that the gene trees are estimated correctly, and there is no missing data. Recently, Nute et al. (BMC Genomics 2018) addressed the question of whether these methods remain statistically consistent under random models of taxon deletion, and asserted that they do so. Here we provide a counterexample to one of these theorems, and establish that ASTRID and NJst are not statistically consistent under an i.i.d. model of taxon deletion.
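To make the setting concrete, the following is a minimal Python sketch, not the authors' code, of the two ingredients the abstract refers to: an i.i.d. taxon-deletion step and the averaged pairwise distance matrix on which distance-based summary methods such as NJst and ASTRID operate. It assumes that each taxon is removed from each gene independently with the same probability `p`, and that a pairwise distance is averaged only over the genes retaining both taxa. The function names, the dictionary representation of per-gene distances, and the numeric values are hypothetical and chosen only for illustration.

```python
import random
from itertools import combinations


def delete_taxa_iid(taxa, p, rng):
    """Return the taxa that survive deletion.

    Assumption: each taxon is dropped independently with probability p.
    """
    return {t for t in taxa if rng.random() >= p}


def average_internode_distances(gene_distances, present_sets):
    """Average each pairwise distance over the genes that retain both taxa.

    gene_distances: list of dicts mapping a sorted taxon pair (a, b) to the
        internode distance in that gene tree (hypothetical input format).
    present_sets: list of sets, the taxa retained in each gene.
    Pairs that never co-occur in any gene are absent from the result.
    """
    totals, counts = {}, {}
    for dist, present in zip(gene_distances, present_sets):
        for a, b in combinations(sorted(present), 2):
            totals[(a, b)] = totals.get((a, b), 0.0) + dist[(a, b)]
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


# Two illustrative genes on taxa A, B, C with made-up distances.
gene_distances = [
    {("A", "B"): 1, ("A", "C"): 2, ("B", "C"): 2},
    {("A", "B"): 2, ("A", "C"): 2, ("B", "C"): 1},
]
# Taxon C is missing from the second gene.
present_sets = [{"A", "B", "C"}, {"A", "B"}]
averaged = average_internode_distances(gene_distances, present_sets)
```

In this toy input the pair (A, B) is averaged over both genes, while the pairs involving C are averaged over the first gene only. The averaging step is where missing data enters the distance matrix, so it is the natural place to look when asking how taxon deletion affects these methods.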