Bit-parallel sequence-to-graph alignment

0301 basic medicine Computer and information sciences Genome DE-BRUIJN GRAPHS 0206 medical engineering Sequence Analysis, DNA 02 engineering and technology Original Papers 03 medical and health sciences APPROXIMATE ALGORITHM ACCURATE Sequence Alignment Biochemistry, cell and molecular biology Mathematics Algorithms
DOI: 10.1093/bioinformatics/btz162 Publication Date: 2019-03-07T20:40:01Z
ABSTRACT
Abstract Motivation Graphs are commonly used to represent sets of sequences. Either edges or nodes can be labeled by sequences, so that each path in the graph spells a concatenated sequence. Examples include graphs to represent genome assemblies, such as string graphs and de Bruijn graphs, and graphs to represent a pan-genome and hence the genetic variation present in a population. Being able to align sequencing reads to such graphs is a key step for many analyses and its applications include genome assembly, read error correction and variant calling with respect to a variation graph. Results We generalize two linear sequence-to-sequence algorithms to graphs: the Shift-And algorithm for exact matching and Myers’ bitvector algorithm for semi-global alignment. These linear algorithms are both based on processing w sequence characters with a constant number of operations, where w is the word size of the machine (commonly 64), and achieve a speedup of up to w over naive algorithms. For a graph with |V| nodes and |E| edges and a sequence of length m, our bitvector-based graph alignment algorithm reaches a worst case runtime of O(|V|+⌈mw⌉|E| log w) for acyclic graphs and O(|V|+m|E| log w) for arbitrary cyclic graphs. We apply it to five different types of graphs and observe a speedup between 3-fold and 20-fold compared with a previous (asymptotically optimal) alignment algorithm. Availability and implementation https://github.com/maickrau/GraphAligner Supplementary information Supplementary data are available at Bioinformatics online.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES (41)
CITATIONS (53)
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....