It works fine on my master branch.
Commands:
~/git-clones/ray/readSimulator/VirtualNextGenSequencer ~/data-for-system-tests/phix/phix.fasta 0 200 10 500000 50 phix_500k_1.fasta phix_500k_2.fasta
mpirun -np 3 ~/Ray -p phix_500k_1.fasta phix_500k_2.fasta |tee 1
Result:
Number of contigs: 1
Total length of contigs: 5384
Number of contigs >= 500 nt: 1
Total length of contigs >= 500 nt: 5384
Number of scaffolds: 1
Total length of scaffolds: 5384
Number of scaffolds >= 500 nt: 1
Total length of scaffolds >= 500: 5384
Your distribution has 2 peaks.
http://imgur.com/WoksH
It should look like this:
http://imgur.com/lm84R
Distribution: http://pastebin.com/qmJtWqj9
k-mer length: 21
Lowest coverage observed: 98
MinimumCoverage: 98
PeakCoverage: 5779
RepeatCoverage: 11460
Number of k-mers with at least MinimumCoverage: 10732 k-mers
Estimated genome length: 5366 nucleotides
Percentage of vertices with coverage 98: 0.0186359 %
DistributionFile: RayOutput.CoverageDistribution.txt
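
As a quick sanity check, the figures above appear to be consistent with one another. The sketch below is only arithmetic on the reported values; the relationships it tests (two k-mers per genome position because both strands are counted, and RepeatCoverage = 2 * PeakCoverage - MinimumCoverage) are assumptions read off the numbers, not taken from Ray's source code, and the variable names are made up for illustration.

```python
# Values copied from the Ray output above.
kmers_at_min_coverage = 10732      # "Number of k-mers with at least MinimumCoverage"
estimated_genome_length = 5366     # "Estimated genome length"
minimum_coverage = 98              # "MinimumCoverage"
peak_coverage = 5779               # "PeakCoverage"
repeat_coverage = 11460            # "RepeatCoverage"
pct_vertices_at_minimum = 0.0186359  # "Percentage of vertices with coverage 98"

# Assumption: every genome position yields a k-mer and its reverse
# complement, so the genome length is half the number of k-mers.
implied_genome_length = kmers_at_min_coverage // 2

# Assumption: the percentage is taken over the k-mers counted above,
# so it can be turned back into a number of vertices.
vertices_at_minimum = round(pct_vertices_at_minimum / 100 * kmers_at_min_coverage)

# Assumption: the repeat threshold mirrors the minimum around the peak.
implied_repeat_coverage = 2 * peak_coverage - minimum_coverage

print("implied genome length:", implied_genome_length)
print("vertices at minimum coverage:", vertices_at_minimum)
print("implied repeat coverage:", implied_repeat_coverage)
```

If a run reports values that do not satisfy these relationships, that does not by itself mean something is wrong, since the relationships are only inferred from this one output.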
Did you provide the correct input files?
Sébastien
> ________________________________________
> From: Eccles, David [dav...@mp...]
> Sent: July 14, 2011 02:48
> To: Sébastien Boisvert; den...@li...
> Subject: RE: RE : Confused about coding -- completed seeds without distributions
>
> From: Sébastien Boisvert [mailto:seb...@ul...]
>> VirtualNextGenSequencer dumps pairs of reads so you should provide both.
>> Can you rerun with -p tests/phix/phix_500k_1.fasta tests/phix/phix_500k_2.fasta
>> Otherwise I think your graph won't be connected enough because of the
>> random number generator used to simulate reads.
>
> Okay, done. Coverage:
>
> http://pastebin.com/p6v7m1jA
>
> And output:
>
> http://pastebin.com/rEjqveCz
>
> Still failing to assemble.
>
> I must say that these results surprised me, because the minimum coverage from
> Coverage Distribution has bumped up to 96 now.
>
>> Also, what is the read length ?
>
> Reads were generated using this command:
> ../readSimulator/VirtualNextGenSequencer tests/phix/phix_genome.fasta 0 200 10 500000 50 phix_500k_1.fasta phix_500k_2.fasta
>
> In other words, no sequence errors, 200bp outer distance (SD = 10bp), 50bp
> read length. I chose those outer distance / read length values (but SD was a
> guess) because they are the same as the parameters used on a human genome
> sequence that was done at MPI a month or so ago (which included a phiX lane).
> I guess I could choose the reads from that as input, but then I'd be testing
> a different thing, because those reads would likely have errors.
>
>> I added a system test for phiX and it works just fine on my branch master.
>> see system-tests/tests/phix
>
> You're simulating errors / SNPs (substitution rate = 0.005), which is a
> different test from the one I'm doing.
>
> -- David
>