Compressed Spaced Suffix Arrays
0301 basic medicine
FOS: Computer and information sciences
Similarity search
Applied Mathematics
0206 medical engineering
suffix array
02 engineering and technology
Compressed data structures; Relative compression; Similarity search; Spaced seeds; Spaced suffix arrays
004
Compressed data structure
Spaced suffix array
03 medical and health sciences
spaced seeds
Computational Theory and Mathematics
Computational Mathematics
Computer Science - Data Structures and Algorithms
Spaced seed
Data Structures and Algorithms (cs.DS)
Relative compression
Mathematics
DOI: 10.1007/s11786-016-0283-z
Publication Date: 2017-02-02T06:16:24Z
AUTHORS (3)
ABSTRACT
Spaced seeds are important tools for similarity search in bioinformatics, and using several seeds together often significantly improves their performance. With existing approaches, however, for each seed we keep a separate linear-size data structure, either a hash table or a spaced suffix array (SSA). In this paper we show how to compress SSAs relative to normal suffix arrays (SAs) and still support fast random access to them. We first prove a theoretical upper bound on the space needed to store an SSA when we already have the SA. We then present experiments indicating that our approach works even better in practice.
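To make the two structures named in the abstract concrete, here is a minimal Python sketch that builds a plain suffix array and a spaced suffix array for a small text. It is an illustration only and does not reproduce the paper's compression scheme. The function names, the representation of a seed as a string of '1' (compare) and '0' (ignore) characters, the truncation of the seed at the end of the text, and the rule of breaking ties by text position are all assumptions made for this example.

```python
def suffix_array(text):
    """Plain suffix array: start positions sorted by the suffix they begin.

    Built by direct sorting, which is only suitable for small inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def masked_key(text, i, seed):
    """Characters of the window starting at i that the seed marks with '1'.

    Positions past the end of the text are skipped (an assumption of this sketch).
    """
    return "".join(
        text[i + j]
        for j, bit in enumerate(seed)
        if bit == "1" and i + j < len(text)
    )


def spaced_suffix_array(text, seed):
    """Start positions sorted by their seed-masked window.

    Ties are broken by text position (an assumption of this sketch).
    """
    return sorted(range(len(text)), key=lambda i: (masked_key(text, i, seed), i))


if __name__ == "__main__":
    text = "abracadabra"
    print("SA :", suffix_array(text))
    print("SSA:", spaced_suffix_array(text, "101"))
```

With a seed of all '1' characters at least as long as the text, the masked window is the whole suffix, so the sketch returns the same order as the plain suffix array. For other seeds the two arrays are permutations of the same positions, which is the setting in which storing one relative to the other becomes interesting.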
REFERENCES (25)
CITATIONS (2)