With the rise of improved sequencing technologies, genomics is expanding from a single-reference-per-species paradigm into a more comprehensive pan-genome approach, in which multiple individuals are represented and analyzed together. Here we introduce splitMEM, a novel O(n log n) time and space algorithm that directly constructs the compressed de Bruijn graph for a pan-genome of total length n. To achieve this time complexity, we augment the suffix tree with suffix skips, a new construct that allows us to traverse several suffix links in constant time, and we use them to efficiently decompose maximal exact matches (MEMs) during a suffix tree traversal.
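
To make the target data structure concrete, the sketch below builds a compressed de Bruijn graph the naive way: it enumerates every k-mer, links k-mers that occur consecutively in some genome, and then merges non-branching paths into single nodes. This is only an illustrative baseline and is not the splitMEM algorithm; it uses no suffix tree, suffix skips, or MEM decomposition, and it ignores reverse complements. The function name `compressedDeBruijnLabels` and its signature are hypothetical and do not come from the SplitMEM code base.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, NOT part of SplitMEM: a naive baseline that builds the
// de Bruijn graph explicitly and then merges non-branching paths.
// Returns the sorted labels of the compressed nodes.
std::vector<std::string> compressedDeBruijnLabels(
    const std::vector<std::string>& genomes, std::size_t k) {
  assert(k >= 1);
  using Adjacency = std::map<std::string, std::set<std::string>>;
  Adjacency out, in;
  std::set<std::string> nodes;

  // Nodes are k-mers; an edge u -> v exists when u and v occur at
  // consecutive positions in at least one genome.
  for (const std::string& g : genomes) {
    for (std::size_t i = 0; i + k <= g.size(); ++i) {
      const std::string u = g.substr(i, k);
      nodes.insert(u);
      if (i + k + 1 <= g.size()) {
        const std::string v = g.substr(i + 1, k);
        out[u].insert(v);
        in[v].insert(u);
      }
    }
  }

  auto degree = [](const Adjacency& adj, const std::string& x) -> std::size_t {
    auto it = adj.find(x);
    return it == adj.end() ? std::size_t{0} : it->second.size();
  };

  // A k-mer starts a compressed node unless it has exactly one predecessor
  // and that predecessor has exactly one successor.
  auto startsPath = [&](const std::string& v) {
    if (degree(in, v) != 1) return true;
    const std::string& pred = *in.at(v).begin();
    return degree(out, pred) != 1;
  };

  std::set<std::string> visited;
  std::vector<std::string> labels;

  // Extend a path while the step cur -> next is unambiguous in both directions.
  auto walk = [&](const std::string& start) {
    std::string label = start;
    std::string cur = start;
    visited.insert(cur);
    while (degree(out, cur) == 1) {
      const std::string next = *out.at(cur).begin();
      if (degree(in, next) != 1 || visited.count(next) != 0) break;
      label.push_back(next.back());
      visited.insert(next);
      cur = next;
    }
    labels.push_back(label);
  };

  for (const std::string& v : nodes) {
    if (startsPath(v)) walk(v);
  }
  // Anything still unvisited lies on an isolated cycle with no branch point.
  for (const std::string& v : nodes) {
    if (visited.count(v) == 0) walk(v);
  }

  std::sort(labels.begin(), labels.end());
  return labels;
}
```

For the two toy genomes `ATCGGA` and `ATCGTA` with k = 3, this sketch merges the shared prefix into one node `ATCG` and keeps the two diverging suffixes as `CGGA` and `CGTA`. The explicit k-mer enumeration is what a direct construction such as the one described above is meant to avoid on large pan-genomes.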

Categories

Bio-Informatics

License

Apache License V2.0

Additional Project Details

Operating Systems

Linux

Intended Audience

Science/Research

User Interface

Command-line

Programming Language

C++

Registered

2014-04-04