From: kuhl <ku...@mo...> - 2012-07-31 08:13:56
Hello Brian,

thanks for the help. Fortunately, in step 7-4 cgw successfully finished
MergeScaffoldsAggressive after iteration 564.

Best wishes,
Heiner

On Mon, 30 Jul 2012 12:44:16 -0400, "Walenz, Brian" <bw...@jc...> wrote:
> Hi, Heiner-
>
> Working backwards through your email:
>
> We've also noticed the 'large scaffold gets lots of little contigs added'
> problem. This seems to be dominating our run time. I'm working on this
> problem at the moment. Our previous solution was basically what you did:
> let it run until we get impatient, then kill it and restart from the next
> checkpoint label.
>
> The CVS tip has a slight improvement in cgw, committed around the 20th. I
> hope to have much more within the next week.
>
> You can ignore the mates in the library, but not the reads. To ignore the
> mates, simply delete the mate link from gkpStore. At the very bottom of the
> 'gatekeeper' page on the wiki is 'allfragsunmated' which will remove the
> mate link from all reads in a single library. This is a destructive
> operation! Save a backup of gkpStore/fnm and gkpStore/fpk if you want to
> revert. (these two files store metadata for long and short fragments resp.)
>
> http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Gatekeeper
>
> FYI- The 5-consensus-insert-size directory has a plot of the insert size
> histogram for each library. These are based on unitigs, and so the 20k
> library might not be represented well. tigStore (the command) can also
> analyze mate pairs for contigs/unitigs in the store with -d matepair.
>
> b
>
>
> On 7/26/12 5:39 PM, "kuhl" <ku...@mo...> wrote:
>
>> Hi Brian et al.,
>>
>> I am currently running a huge assembly with CA7 (2.5Gb 30x Illumina + 454,
>> cgw takes 150-300Gb RAM). It is now in step 7-2 and I have just stopped cgw
>> at MergeScaffoldsAggressive iteration 1641 and restarted it at ckp08-2SM. I
>> did this also in 7-0 at iteration 2xxx. Now I am not sure, if I should
>> maybe rerun scaffolding without 20 kb mate pairs, which I think are
>> responsible for this mess. So I have two questions:
>>
>> How can I convince cgw to ignore a certain library without doing steps 0-5
>> again?
>>
>> Is there a rule of thumb, when MergeScaffoldsAggressive should be stopped?
>>
>>
>> In my case it looks like cgw is only very slightly progressing with each
>> iteration and there is one large scaffold that is growing more and more...
>>
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 32 at idx 3355 out of 60498
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 19 at idx 8774 out of 60498
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 55 at idx 286 out of 60500
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 32 at idx 3355 out of 60500
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 16 at idx 10594 out of 60500
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 7 at idx 20348 out of 60500
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 55 at idx 286 out of 60489
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 32 at idx 3355 out of 60489
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 19 at idx 8773 out of 60489
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 9 at idx 16854 out of 60489
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 55 at idx 286 out of 60486
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 32 at idx 3355 out of 60486
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 16 at idx 10593 out of 60486
>> ExamineUsableSEdges()- maxWeightEdge from 0 to 7 at idx 20428 out of 60486
>>
>> Regards, Heiner

--
---------------------------------------------------------------
Dr. Heiner Kuhl                 MPI Molecular Genetics
Tel: + 49 + 30 / 8413 1551      Next Generation Sequencing
Ihnestrasse 73                  email: ku...@mo...
D-14195 Berlin                  http://www.molgen.mpg.de
---------------------------------------------------------------
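
A minimal sketch of the backup step recommended in the thread before running the destructive 'allfragsunmated' edit: it copies gkpStore/fnm and gkpStore/fpk to a separate directory so they can be copied back later. The function names, the backup directory, and the refuse-to-overwrite behaviour are assumptions made for illustration and are not part of Celera Assembler; only the two file names come from the thread, and this sketch does not check whether the edit touches any other file in the store.

```python
import shutil
from pathlib import Path

# File names taken from the thread; everything else here is hypothetical.
GKP_METADATA = ("fnm", "fpk")


def backup_gkp_metadata(gkp_store, backup_dir, names=GKP_METADATA):
    """Copy the named files out of gkp_store and return the copies' paths."""
    store = Path(gkp_store)
    dest = Path(backup_dir)
    # Check everything up front so a partial backup is never written.
    missing = [n for n in names if not (store / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {store}: {', '.join(missing)}")
    dest.mkdir(parents=True, exist_ok=True)
    copies = []
    for name in names:
        target = dest / name
        if target.exists():
            # Never clobber an earlier backup silently.
            raise FileExistsError(f"refusing to overwrite {target}")
        shutil.copy2(store / name, target)  # copy2 also keeps timestamps
        copies.append(target)
    return copies


def restore_gkp_metadata(gkp_store, backup_dir, names=GKP_METADATA):
    """Copy the backed-up files back into gkp_store, replacing edited ones."""
    for name in names:
        shutil.copy2(Path(backup_dir) / name, Path(gkp_store) / name)
```

For example, `backup_gkp_metadata("gkpStore", "gkpStore.premate-edit")` would be run before the edit, and `restore_gkp_metadata` with the same two arguments to revert; the directory name is only a placeholder.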