Re: [Denovoassembler-devel] Confused about coding -- 500k phiX simulation still fails
Ray -- Parallel genome assemblies for parallel DNA sequencing
From: Eccles, D. <dav...@mp...> - 2011-07-13 22:44:14
From: Sébastien Boisvert [mailto:seb...@ul...]
> I think you generated too few reads.
> That is why it fails.
> You need to get the peak coverage at least above 20.

Okay, so I used the VirtualNextGenSequencer to create 500,000 reads, which gives substantially higher coverage:

http://pastebin.com/18SwshYE

It actually registers as having a peak now, which is nice, because it can then proceed with the assembly. The statistics are still a little odd: it finds a peak at 1415 (neighbourhood counts 58,52,70,74*,56,72,68) when 2862 has a higher count (70,54,72,82*,56,50,52), as does 2873 (56,58,60,76*,64,76,64), but I'll put that down to smoothing -- it's close enough.

So now that I've muscled my way past the coverage, the assembly should be a breeze:

http://pastebin.com/xJJ9yycT

... or maybe not. It [still] looks like it's failing on seed creation.

Would it make sense to include a coverage calculation in the seeding tests done by the workers (SeedWorker.cpp)? What happens if a vertex (with coverage >5) has a parent that has one edge in and one edge out, but the coverage of that parent is below 5?

-- David
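
The smoothing guess above can be sanity-checked with a short sketch that compares the raw centre counts against centred window means over the three quoted neighbourhoods. The plain moving average, the window widths, and the function names are all assumptions made for illustration; nothing here is taken from Ray's actual peak finder, which may work quite differently.

```python
# Counts quoted in the message: seven values centred on each candidate
# coverage value (the starred count is the middle element).
neighbourhoods = {
    1415: [58, 52, 70, 74, 56, 72, 68],
    2862: [70, 54, 72, 82, 56, 50, 52],
    2873: [56, 58, 60, 76, 64, 76, 64],
}


def centred_mean(counts, width):
    """Mean of the `width` values centred on the middle element.

    This is an assumed smoothing scheme, not Ray's.
    """
    mid = len(counts) // 2
    half = width // 2
    window = counts[mid - half : mid + half + 1]
    return sum(window) / len(window)


def best_candidate(width):
    """Candidate coverage with the highest centred mean for this width."""
    return max(
        neighbourhoods,
        key=lambda cov: centred_mean(neighbourhoods[cov], width),
    )


if __name__ == "__main__":
    # Width 1 is the raw centre count; wider windows smooth more.
    for width in (1, 3, 5, 7):
        means = {
            cov: round(centred_mean(counts, width), 2)
            for cov, counts in neighbourhoods.items()
        }
        print(width, means, "->", best_candidate(width))
```

With only these three neighbourhoods, widths 1 and 3 favour 2862 and widths 5 and 7 favour 2873, so a plain moving average of this kind would not by itself single out 1415. Whatever smoothing is involved presumably uses more of the distribution than the seven counts quoted here.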