I have a draft genome (~3 Mbp in 60 contigs), some paired 454 reads (~30x) and some paired Illumina reads (~200x).
When I use gap5 with just the 454 reads alone, or the Illumina reads alone, it all works fine. If I create a database containing both, I can load it, but when I choose "Edit contig" the CPU goes to 100% and memory use increases continually (I let it reach 48 GB and then killed it).
To generate the 454 database I use gsMapper to align the .SFF files to the draft contigs, and it produces a .ace file:
% tg_index -o EF.db -A EF.454.ace -t
To append the Illumina database I create a .bam using SHRiMP2+samtools and add it in:
% tg_index -o EF.db -b EF.bam -a -g -t
Am I doing anything wrong? Is gsMapper (Newbler) producing a dodgy .ace file? It has lots of pads in it, and gap5 seems to import the contig itself as a 'read' too. Is "-g" the correct option to use here? The 100% CPU and ever-growing RAM look like an infinite loop with a memory allocation in it.
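In case it helps anyone judge how heavily padded the .ace consensus is, here is a rough sketch of how the pads could be counted. It assumes the usual ACE layout (a "CO <name> ..." header line, consensus lines up to the next blank line, and '*' as the pad character). The function name count_ace_pads is just something made up for this post, not part of gap5 or Newbler.

```shell
# Sketch only: print "<contig> <consensus length> <pad count>" (tab-separated)
# for each CO record in an ACE file. Assumes pads are written as '*'.
count_ace_pads() {
  awk '
    /^CO / { name = $2; in_co = 1; next }              # start of a contig record
    in_co && /^[[:space:]]*$/ { in_co = 0; next }      # blank line ends the consensus
    in_co {
      total[name] += length($0)                        # padded consensus length
      pads[name]  += gsub(/\*/, "")                    # gsub returns the number of pads removed
    }
    END { for (n in total) printf "%s\t%d\t%d\n", n, total[n], pads[n] }
  ' "$1"
}

# Usage (file name as in the commands above):
#   count_ace_pads EF.454.ace
```

If the pad count is a large fraction of the consensus length, that would at least confirm the .ace contigs are in padded coordinates, which is what made me unsure about "-g".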
Thanks for any help!