Re: [Denovoassembler-devel] clean up / simplify CoverageDistribution function
Ray -- Parallel genome assemblies for parallel DNA sequencing
From: Eccles, D. <dav...@mp...> - 2011-07-23 12:46:52
From: Eccles, David
> ... and [as I've just noticed] I get a segfault on the 454 data with
> my code when doing this.... I'll try to nail that down in the next day or so.

Because there weren't already unit tests for Kmer a.compareTo(&b), I didn't think to put them in my code. I only discovered the bug from a segfault that happened because Ray was looking for a palindromic sequence with different starting bases, and couldn't find it (something like A300330303033003A vs the reverse-complement T300330303033003T). The problem was that it was clearing the first/last bases when it wasn't meant to (i.e. first bases known), and not clearing them when it was meant to (i.e. at least one first base unknown).

https://github.com/gringer/ray/commit/451f766a5ce8b7da4ce6d340c234e7d4c5c7ca4f

... I guess Kmer unit tests will be my next mini-project for Ray.

I've been thinking again about unknown first bases and assembly, and think that probably the best thing to do would be to only store unknown first bases in the Kmer academy. In other words, graduation would require both a known first base, and for the k-mer to occur more than once. It would mean that there's no messing round with unknown bases when doing the assembly. However, it would also mean that an assembly couldn't be done on sequences that all have a misread at their second position.

-- David
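
The graduation rule proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: AcademyEntry and canGraduate are hypothetical names, not taken from the Ray source, and the sketch assumes the academy keeps an occurrence count and a "first base known" flag for each k-mer.

```cpp
#include <cstddef>

// Hypothetical record for one k-mer held in the academy.
// Neither the struct nor its fields come from the Ray code base.
struct AcademyEntry {
    bool firstBaseKnown;      // false if the first base could not be determined
    std::size_t occurrences;  // how many times this k-mer has been observed
};

// Proposed rule: a k-mer graduates only if its first base is known
// AND it has been seen more than once.
bool canGraduate(const AcademyEntry& entry) {
    return entry.firstBaseKnown && entry.occurrences > 1;
}
```

Under this rule, a k-mer whose first base is unknown stays in the academy however often it occurs, so the assembly step only ever sees k-mers with known first bases.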