[Denovoassembler-devel] RE : Confused about coding -- 500k phiX simulation still fails
Ray -- Parallel genome assemblies for parallel DNA sequencing
From: Sébastien B. <seb...@ul...> - 2011-07-14 03:07:08
Hello,

Can you rerun with

-p tests/phix/phix_500k_1.fasta tests/phix/phix_500k_2.fasta

Otherwise I think your graph won't be connected enough because of the random number generator used to simulate reads.

Also, what is the read length?

I added a system test for phiX and it works just fine on my master branch.

See system-tests/tests/phix

seb@fault:~/git-clones/ray/system-tests$ ./run-test-smp.sh phix|tee 1

I ran the assembly on my Samsung netbook with an Intel Atom processor.

[1,0]<stdout>:Number of contigs: 1
[1,0]<stdout>:Total length of contigs: 5384
[1,0]<stdout>:Number of contigs >= 500 nt: 1
[1,0]<stdout>:Total length of contigs >= 500 nt: 5384
[1,0]<stdout>:Number of scaffolds: 1
[1,0]<stdout>:Total length of scaffolds: 5384
[1,0]<stdout>:Number of scaffolds >= 500 nt: 1
[1,0]<stdout>:Total length of scaffolds >= 500: 5384
[1,0]<stdout>:
[1,0]<stdout>:Rank 0 wrote phix.Contigs.fasta
[1,0]<stdout>:Rank 0 wrote phix.Scaffolds.fasta
[1,0]<stdout>:Check for phix.*

Sébastien

> ________________________________________
> From: Eccles, David [dav...@mp...]
> Sent: 13 July 2011 18:40
> To: Sébastien Boisvert; den...@li...
> Subject: RE: Confused about coding -- 500k phiX simulation still fails
>
> From: Sébastien Boisvert [mailto:seb...@ul...]
>> I think you generated too few reads.
>> That is why it fails.
>> You need to get the peak coverage at least above 20.
>
> Okay, so I used the VirtualNextGenSequencer to create 500,000 reads, which
> has a substantially higher coverage:
>
> http://pastebin.com/18SwshYE
>
> It actually registers as having a peak now, which is nice, because it can
> then proceed with the assembly. The statistics are still a little odd,
> finding a peak at 1415 (neighbourhood counts 58,52,70,74*,56,72,68) when 2862
> has a higher count (70,54,72,82*,56,50,52), as does 2873
> (56,58,60,76*,64,76,64), but I'll put that down to smoothing -- it's close
> enough.
>
> So now that I've muscled my way past the coverage, the assembly should be a
> breeze:
>
> http://pastebin.com/xJJ9yycT
>
> ... or maybe not. It [still] looks like it's failing on seed creation.
>
> Would it make sense to include a coverage calculation in the seeding tests
> done by the workers (SeedWorker.cpp)? What happens if a vertex (with >5
> coverage) has a parent that has 1 edge in, one edge out, but the coverage of
> that parent is below 5?
>
> -- David
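
To make the last question in the quoted message concrete, here is a minimal sketch of what a coverage test during seed extension could look like. It is not Ray's code and does not reflect what SeedWorker.cpp actually does: the Vertex struct, isSeedVertex, extendSeedBackwards and the minCoverage parameter are all hypothetical names, and treating a coverage equal to the threshold as acceptable is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified graph vertex. This is NOT Ray's data structure.
struct Vertex {
    int coverage = 0;                   // k-mer coverage of this vertex
    std::vector<std::size_t> parents;   // indices of vertices with an edge into this one
    std::vector<std::size_t> children;  // indices of vertices this one points to
};

// Assumed rule: a vertex may belong to a seed only if it is unbranched
// (exactly one edge in, one edge out) AND its own coverage reaches the
// threshold.
bool isSeedVertex(const std::vector<Vertex>& graph, std::size_t v, int minCoverage) {
    const Vertex& vertex = graph[v];
    return vertex.parents.size() == 1
        && vertex.children.size() == 1
        && vertex.coverage >= minCoverage;
}

// Walk backwards from `start` through single parents for as long as each
// vertex passes isSeedVertex. Returns how many vertices were accepted,
// including `start` itself. The length bound stops the walk on a cycle.
std::size_t extendSeedBackwards(const std::vector<Vertex>& graph,
                                std::size_t start, int minCoverage) {
    std::size_t length = 0;
    std::size_t current = start;
    while (length < graph.size() && isSeedVertex(graph, current, minCoverage)) {
        ++length;
        current = graph[current].parents[0];
    }
    return length;
}
```

Under this rule, the scenario in the question (a well-covered vertex whose parent has one edge in and one edge out but coverage below 5) ends the backward walk at the parent, so the seed is cut there instead of running through a weakly supported vertex. Whether cutting is preferable to extending through such a vertex is a design choice this sketch does not settle.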