Deconvoluting the diversity of within-host pathogen strains in a multi-locus sequence typing framework
Bioinformatics
QH301-705.5
Multi-Locus Sequence Typing
Computer applications to medicine. Medical informatics
R858-859.7
610
Models, Biological
03 medical and health sciences
Databases, Genetic
Computer Simulation
Biology (General)
01 Mathematical Sciences
Alleles
Integer Linear Programming
0303 health sciences
Research
Genetic Variation
06 Biological Sciences
004
Bacterial diversity
Genetic Loci
Borrelia burgdorferi
Host-Pathogen Interactions
08 Information and Computing Sciences
Multilocus Sequence Typing
DOI: 10.1186/s12859-019-3204-8
Publication Date: 2019-12-17T01:02:24Z
AUTHORS (4)
ABSTRACT
Background
Bacterial pathogens exhibit substantial genomic diversity. This diversity can be informative about evolutionary adaptations, host-pathogen interactions, and disease transmission patterns. However, capturing this diversity directly from biological samples is challenging.
Results
We introduce a framework for understanding the within-host diversity of a pathogen using multi-locus sequence types (MLST) from whole-genome sequencing (WGS) data. Our approach consists of two stages. First, we process each sample individually, assigning it, for each locus in the MLST scheme, a set of alleles and a proportion for each allele. Next, we associate with each sample a set of strain types, using the alleles and the allele proportions obtained in the first stage. In doing so, we use as few previously unobserved strains as possible across all samples, choose unobserved strains that are as close as possible to the observed ones, and respect the allele proportions as closely as possible. We solve both problems using mixed integer linear programming (MILP). Our method performs accurately on simulated data, and its results on a real data set of Borrelia burgdorferi genomes suggest a high level of diversity for this pathogen.
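The trade-off in the second stage can be illustrated with a minimal sketch. It is not the authors' MILP formulation: it swaps the solver for brute-force enumeration over a coarse proportion grid, handles a single sample, and uses hypothetical names, data, and objective weights (`allele_props`, `known_strains`, `weights`, `step`) chosen only for illustration.

```python
from itertools import product

# Hypothetical stage-1 output for one sample: per-locus allele proportions.
allele_props = {
    "locusA": {"a1": 0.7, "a2": 0.3},
    "locusB": {"b1": 0.7, "b2": 0.3},
}
# Hypothetical set of previously observed strain types (one allele per locus).
known_strains = {("a1", "b1"), ("a2", "b1")}


def hamming(s, t):
    """Number of loci at which two strain types differ."""
    return sum(x != y for x, y in zip(s, t))


def score(assignment, allele_props, known, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three criteria (the weights are an assumption).

    assignment maps a strain type (tuple of alleles) to its proportion.
    """
    loci = list(allele_props)
    novel = [s for s in assignment if s not in known]
    # Criterion 1: number of previously unobserved strains used.
    n_novel = len(novel)
    # Criterion 2: distance of each novel strain to its nearest known strain.
    dist = sum(min(hamming(s, k) for k in known) for s in novel)
    # Criterion 3: deviation between implied and target allele proportions.
    dev = 0.0
    for i, locus in enumerate(loci):
        for allele, target in allele_props[locus].items():
            implied = sum(p for s, p in assignment.items() if s[i] == allele)
            dev += abs(implied - target)
    w_novel, w_dist, w_dev = weights
    return w_novel * n_novel + w_dist * dist + w_dev * dev


def best_assignment(allele_props, known, step=0.1, weights=(1.0, 1.0, 1.0)):
    """Exhaustively search strain proportions on a grid (toy sizes only)."""
    loci = list(allele_props)
    # Candidate strains: every combination of alleles seen in the sample.
    candidates = list(product(*(allele_props[locus] for locus in loci)))
    n = round(1 / step)
    best = None
    for counts in product(range(n + 1), repeat=len(candidates)):
        if sum(counts) != n:  # proportions must sum to 1
            continue
        assignment = {s: c / n for s, c in zip(candidates, counts) if c > 0}
        value = score(assignment, allele_props, known, weights)
        if best is None or value < best[0] - 1e-12:
            best = (value, assignment)
    return best


best_value, best = best_assignment(allele_props, known_strains)
print(best_value, best)
```

With equal weights, this toy instance prefers the two known strains and accepts some deviation from the allele proportions; raising the third weight makes solutions that introduce a novel strain cheaper by comparison. The enumeration grows exponentially with the number of candidate strains, which is one reason an MILP solver is the more realistic choice.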
Conclusions
Our approach can be applied to any bacterial pathogen with an MLST scheme, although we developed it with Borrelia burgdorferi, the etiological agent of Lyme disease, in mind. Our work paves the way for robust strain typing in the presence of within-host heterogeneity, an essential challenge that no existing pathogen genomics methodology currently addresses.
REFERENCES (24)
CITATIONS (1)