From: Walenz, B. <bw...@jc...> - 2012-07-30 16:44:29
Hi, Heiner-

Working backwards through your email:

We've also noticed the 'large scaffold gets lots of little contigs added' problem. This seems to be dominating our run time. I'm working on this problem at the moment. Our previous solution was basically what you did: let it run until we get impatient, then kill it and restart from the next checkpoint label. The CVS tip has a slight improvement in cgw, committed around the 20th. I hope to have much more within the next week.

You can ignore the mates in the library, but not the reads. To ignore the mates, simply delete the mate link from gkpStore. At the very bottom of the 'gatekeeper' page on the wiki is 'allfragsunmated' which will remove the mate link from all reads in a single library. This is a destructive operation! Save a backup of gkpStore/fnm and gkpStore/fpk if you want to revert. (these two files store metadata for long and short fragments resp.)

http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Gatekeeper

FYI- The 5-consensus-insert-size directory has a plot of the insert size histogram for each library. These are based on unitigs, and so the 20k library might not be represented well. tigStore (the command) can also analyze mate pairs for contigs/unitigs in the store with -d matepair.

b

On 7/26/12 5:39 PM, "kuhl" <ku...@mo...> wrote:

> Hi Brian et al.,
>
> I am currently running a huge assembly with CA7 (2.5Gb 30x Illumina + 454,
> cgw takes 150-300Gb RAM). It is now in step 7-2 and I have just stopped cgw
> at MergeScaffoldsAggressive iteration 1641 and restarted it at ckp08-2SM. I
> did this also in 7-0 at iteration 2xxx. Now I am not sure, if I should
> maybe rerun scaffolding without 20 kb mate pairs, which I think are
> responsible for this mess. So I have two questions:
>
> How can I convince cgw to ignore a certain library without doing steps 0-5
> again?
>
> Is there a rule of thumb, when MergeScaffoldsAggressive should be stopped?
>
> In my case it looks like cgw is only very slightly progressing with each
> iteration and there is one large scaffold that is growing more and more...
>
> ExamineUsableSEdges()- maxWeightEdge from 0 to 32 at idx 3355 out of 60498
> ExamineUsableSEdges()- maxWeightEdge from 0 to 19 at idx 8774 out of 60498
> ExamineUsableSEdges()- maxWeightEdge from 0 to 55 at idx 286 out of 60500
> ExamineUsableSEdges()- maxWeightEdge from 0 to 32 at idx 3355 out of 60500
> ExamineUsableSEdges()- maxWeightEdge from 0 to 16 at idx 10594 out of 60500
> ExamineUsableSEdges()- maxWeightEdge from 0 to 7 at idx 20348 out of 60500
> ExamineUsableSEdges()- maxWeightEdge from 0 to 55 at idx 286 out of 60489
> ExamineUsableSEdges()- maxWeightEdge from 0 to 32 at idx 3355 out of 60489
> ExamineUsableSEdges()- maxWeightEdge from 0 to 19 at idx 8773 out of 60489
> ExamineUsableSEdges()- maxWeightEdge from 0 to 9 at idx 16854 out of 60489
> ExamineUsableSEdges()- maxWeightEdge from 0 to 55 at idx 286 out of 60486
> ExamineUsableSEdges()- maxWeightEdge from 0 to 32 at idx 3355 out of 60486
> ExamineUsableSEdges()- maxWeightEdge from 0 to 16 at idx 10593 out of 60486
> ExamineUsableSEdges()- maxWeightEdge from 0 to 7 at idx 20428 out of 60486
>
> Regards, Heiner
>
> _______________________________________________
> wgs-assembler-users mailing list
> wgs...@li...
> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
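
The backup step recommended in the reply above (saving gkpStore/fnm and gkpStore/fpk before removing mate links) could be sketched roughly as follows. This is only an illustration, not part of the assembler: the function name, the backup directory, and the store path in the usage note are assumptions, and the only names taken from the message are the two files fnm and fpk.

```python
import shutil
from pathlib import Path


def backup_gkpstore_metadata(gkp_store, backup_dir, names=("fnm", "fpk")):
    """Copy the named metadata files out of gkp_store into backup_dir.

    Hypothetical helper: only the file names 'fnm' and 'fpk' come from the
    message; everything else here is an assumption made for illustration.
    Returns the list of backup paths that were written.
    """
    store = Path(gkp_store)
    dest = Path(backup_dir)

    # Check everything first, so a failure leaves no partial backup behind.
    for name in names:
        if not (store / name).is_file():
            raise FileNotFoundError(f"expected file not found: {store / name}")
        if (dest / name).exists():
            raise FileExistsError(f"refusing to overwrite backup: {dest / name}")

    dest.mkdir(parents=True, exist_ok=True)
    copies = []
    for name in names:
        target = dest / name
        shutil.copy2(store / name, target)  # copy2 also keeps timestamps
        copies.append(target)
    return copies
```

For example, backup_gkpstore_metadata("asm.gkpStore", "asm.gkpStore.backup") would be run before the destructive operation, where both path names are placeholders for your own. Reverting would then mean copying the two saved files back into the store.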