Optimized sample selection for cost-efficient long-read population sequencing

1000 Genomes Project Exome

DOI: 10.1101/gr.264879.120
Publication Date: 2021-04-02T21:05:58Z

AUTHORS (9)

T. Rhyker Ranallo...

Zachary Lemmon

Sebastian Soyk

Sergey Aganezov

William J. Salerno

Rajiv C. McCoy

Zachary B. Lippman

Michael C. Schatz

Fritz J. Sedlazeck

ABSTRACT

An increasingly important scenario in population genetics is when a large cohort has been genotyped using a low-resolution approach (e.g., microarrays, exome capture, short-read WGS), from which a few individuals are resequenced using a more comprehensive approach, especially long-read sequencing. The subset of individuals selected should ensure that the captured genetic diversity is fully representative and includes variants across all subpopulations. For example, the study of human variation has historically focused on individuals with European ancestry, but this represents a small fraction of the overall diversity. Addressing this, SVCollector identifies the optimal individuals for resequencing by analyzing population-level VCF files from low-resolution genotyping studies. It then computes a ranked list of samples that maximizes the total number of variants present within a subset of a given size. To solve this optimization problem, SVCollector implements a fast, greedy heuristic and an exact algorithm using integer linear programming. We apply SVCollector to simulated data, 2504 human genomes from the 1000 Genomes Project, and 3024 genomes from the 3000 Rice Genomes Project, and show that the rankings it computes are more representative than alternative naive strategies. When selecting an optimal subset of 100 samples in these cohorts, SVCollector identifies individuals from every subpopulation, whereas naive methods yield an unbalanced selection. Finally, we show that the number of variants present in cohorts selected using this approach follows a power-law distribution that is naturally related to the population genetic concept of the allele frequency spectrum, allowing us to estimate the diversity present with increasing numbers of samples.
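The greedy heuristic mentioned in the abstract can be illustrated with a minimal sketch. This is not SVCollector's implementation: the function name `greedy_select`, the dictionary input (sample name mapped to the set of variant identifiers that sample carries), and the alphabetical tie-breaking rule are all assumptions made for illustration, and VCF parsing is omitted entirely. At each step the sketch picks the sample that contributes the most variants not yet covered by earlier picks, which is the standard greedy approach to a maximum-coverage problem.

```python
from typing import Dict, List, Set, Tuple


def greedy_select(sample_variants: Dict[str, Set[str]], k: int) -> List[Tuple[str, int]]:
    """Rank up to k samples; each pick adds the most not-yet-covered variants.

    Returns (sample, cumulative number of distinct variants covered) pairs.
    Hypothetical helper for illustration only, not SVCollector's API.
    """
    covered: Set[str] = set()
    remaining = dict(sample_variants)  # shallow copy so the input is not mutated
    ranking: List[Tuple[str, int]] = []
    while remaining and len(ranking) < k:
        # Iterate names in sorted order so ties go to the alphabetically first sample.
        best = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        gain = len(remaining[best] - covered)
        if gain == 0:
            break  # no remaining sample adds a new variant
        covered |= remaining.pop(best)
        ranking.append((best, len(covered)))
    return ranking


# Toy cohort (made-up variant IDs, assumed for the example).
toy = {
    "S1": {"v1", "v2", "v3"},
    "S2": {"v3", "v4"},
    "S3": {"v1", "v2"},
    "S4": {"v5"},
}
ranking = greedy_select(toy, k=3)
print(ranking)
```

In the toy cohort, S3 is never chosen because S1 already carries all of its variants, which shows how the ranking favors samples that add new diversity over samples that merely carry many variants. A greedy pass like this is fast but is a heuristic; the abstract notes that an exact integer linear programming formulation is also available.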

REFERENCES (26)

CITATIONS (6)
