Bit-parallel sequence-to-graph alignment
0301 basic medicine
Computer and information sciences
Genome
DE-BRUIJN GRAPHS
0206 medical engineering
Sequence Analysis, DNA
02 engineering and technology
Original Papers
03 medical and health sciences
APPROXIMATE
ALGORITHM
ACCURATE
Sequence Alignment
Biochemistry, cell and molecular biology
Mathematics
Algorithms
DOI:
10.1093/bioinformatics/btz162
Publication Date:
2019-03-07T20:40:01Z
AUTHORS (3)
ABSTRACT
Abstract
Motivation
Graphs are commonly used to represent sets of sequences. Either edges or nodes can be labeled by sequences, so that each path in the graph spells a concatenated sequence. Examples include graphs to represent genome assemblies, such as string graphs and de Bruijn graphs, and graphs to represent a pan-genome and hence the genetic variation present in a population. Being able to align sequencing reads to such graphs is a key step for many analyses and its applications include genome assembly, read error correction and variant calling with respect to a variation graph.
Results
We generalize two linear sequence-to-sequence algorithms to graphs: the Shift-And algorithm for exact matching and Myers’ bitvector algorithm for semi-global alignment. These linear algorithms are both based on processing w sequence characters with a constant number of operations, where w is the word size of the machine (commonly 64), and achieve a speedup of up to w over naive algorithms. For a graph with |V| nodes and |E| edges and a sequence of length m, our bitvector-based graph alignment algorithm reaches a worst case runtime of O(|V|+⌈mw⌉|E| log w) for acyclic graphs and O(|V|+m|E| log w) for arbitrary cyclic graphs. We apply it to five different types of graphs and observe a speedup between 3-fold and 20-fold compared with a previous (asymptotically optimal) alignment algorithm.
Availability and implementation
https://github.com/maickrau/GraphAligner
Supplementary information
Supplementary data are available at Bioinformatics online.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES (41)
CITATIONS (53)
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....