With the rise of improved sequencing technologies, genomics is expanding from a single reference per species paradigm into a more comprehensive pan-genome approach with multiple individuals represented and analyzed together. Here we introduce a novel O(n log n) time and space algorithm called splitMEM, that directly constructs the compressed de Bruijn graph for a pan-genome of total length n. To achieve this time complexity, we augment the suffix tree with suffix skips, a new construct that allows us to traverse several suffix links in constant time, and use them to efficiently decompose maximal exact matches (MEMs) during a suffix tree traversal.
Categories
Bio-InformaticsLicense
Apache License V2.0Follow SplitMEM
Other Useful Business Software
Find Hidden Risks in Windows Task Scheduler
Windows Task Scheduler might be hiding critical failures. Download the free JAMS diagnostic tool to uncover problems before they impact production—get a color-coded risk report with clear remediation steps in minutes.
Rate This Project
Login To Rate This Project
User Reviews
Be the first to post a review of SplitMEM!