A Universal Probe Set for Targeted Sequencing of 353 Nuclear Genes from Any Flowering Plant Designed Using k-medoids Clustering
Phylogenomics
Sequence (biology)
Lineage (genetic)
DOI:
10.1101/361618
Publication Date:
2018-07-04T14:45:26Z
AUTHORS (18)
ABSTRACT
Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost associated with developing targeted sequencing approaches is the preliminary data needed for identifying orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm lineage. To maximize the phylogenetic potential of the probes while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, five to 15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order lineages of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set. Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order lineages, including all angiosperms.
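The idea of choosing the minimum number of representative sequences per locus by k-medoids clustering can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes that sequences are already aligned to equal length, that distance is the simple proportion of differing sites (p-distance), and that a locus counts as "represented" when every sequence lies within a chosen distance threshold of some medoid. The threshold value of 0.3, the toy sequences, and all function names (`p_distance`, `distance_matrix`, `k_medoids`, `min_representatives`) are hypothetical and chosen only for illustration. The clustering uses a basic alternating heuristic rather than any particular published k-medoids variant.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Symmetric pairwise p-distance matrix as a list of lists."""
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = p_distance(seqs[i], seqs[j])
    return dist


def k_medoids(dist, k, seed=0, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    Returns the sorted indices of the k medoids (the representatives).
    """
    n = len(dist)
    rng = random.Random(seed)
    medoids = sorted(rng.sample(range(n), k))
    for _ in range(max_iter):
        # Assignment step: each medoid anchors its own cluster,
        # every other point joins its nearest medoid.
        clusters = {m: [m] for m in medoids}
        for i in range(n):
            if i in clusters:
                continue
            nearest = min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to the other members.
        new_medoids = sorted(
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


def min_representatives(dist, max_dist, seed=0):
    """Smallest k whose medoids leave every sequence within max_dist of a medoid."""
    n = len(dist)
    for k in range(1, n + 1):
        medoids = k_medoids(dist, k, seed)
        if all(min(dist[i][m] for m in medoids) <= max_dist for i in range(n)):
            return medoids
    return list(range(n))


# Toy alignment for one locus: two divergent groups of three sequences each.
seqs = [
    "AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAATT",
    "CCCCCCCCCC", "CCCCCCCCCG", "CCCCCCCCGG",
]
reps = min_representatives(distance_matrix(seqs), max_dist=0.3)
print("representative sequence indices:", reps)
```

In this toy case a single representative cannot cover both groups, so the search settles on two medoids, one per group. With real alignments, the choice of distance measure and threshold would determine how many representatives each locus needs.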