CONGA: Copy number variation genotyping in ancient genomes and low-coverage sequencing data

1000 Genomes Project Structural Variation
DOI: 10.1371/journal.pcbi.1010788 Publication Date: 2022-12-14T18:42:56Z
ABSTRACT
To date, ancient genome analyses have been largely confined to the study of single nucleotide polymorphisms (SNPs). Copy number variants (CNVs) are a major contributor to disease and evolutionary adaptation, but identifying CNVs in ancient shotgun-sequenced genomes is hampered by typically low genome coverage (<1×) and short fragments (<80 bps), which prevents standard CNV detection software from being effectively applied to ancient genomes. Here we present CONGA, tailored for genotyping CNVs at low coverage. Simulations and down-sampling experiments suggest that CONGA can genotype deletions >1 kbps with F-scores >0.75 at ≥1× coverage, and distinguish between heterozygous and homozygous states. We used CONGA to genotype 10,002 outgroup-ascertained deletions across a heterogeneous set of 71 ancient human genomes spanning the last 50,000 years, produced using variable experimental protocols. A fraction of these (21/71) display divergent deletion profiles unrelated to their population origin, but attributable to technical factors such as coverage and read length. The majority of the sample (50/71), despite originating from nine different laboratories and having coverages ranging from 0.44× to 26× (median 4×) and average read lengths of 52-121 bps (median 69), exhibit coherent deletion frequencies. Across these 50 genomes, inter-individual genetic diversity measured using SNPs and using CONGA-genotyped deletions is highly correlated. CONGA-genotyped deletions also display purifying selection signatures, as expected. CONGA thus paves the way for systematic CNV analyses in ancient genomes, despite the technical challenges posed by low and variable genome coverage.