From: Christoph H. <chr...@gm...> - 2012-07-03 20:02:48
Hi Brian,

Thanks for your reply!

I am using CA7. I am afraid updating is not really an option at the moment: I am running it on a cluster, and updating to CVS might be complicated because the cluster administrators are always very busy, so it would certainly take a while. Therefore, it would be great if you could give me a tip on how to handle this in CA7 for now.

In my latest attempt I used 64 GB RAM and it killed the node after some 2 hours. I ran the following:

CA version 7.0 ($Id: deduplicate.C,v 1.15 2011/12/29 09:26:03 brianwalenz Exp $).

Error Rates:
AS_OVL_ERROR_RATE 0.060000
AS_CNS_ERROR_RATE 0.100000
AS_CGW_ERROR_RATE 0.100000
AS_MAX_ERROR_RATE 0.250000

Current Working Directory:
/projects/nn9201k/Celera/work2/salaris1/0-overlaptrim

Command:
/xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/deduplicate \
  -gkp /projects/nn9201k/Celera/work2/salaris1/salaris.gkpStore \
  -ovs /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.obtStore \
  -ovs /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.dupStore \
  -report /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.deduplicate.log \
  -summary /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.deduplicate.summary

Here are the first and last few lines of salaris.deduplicate.log (it has 384855 lines; *.deduplicate.summary and *.deduplicate.err are empty):

Delete 28 DUPof 3462651 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 76 DUPof 10667558 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 210 DUPof 8142147 a 0,70 b 0,70 hang 0,0 diff 0,0 error 0.000000
Delete 216 DUPof 9129559 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 228 DUPof 7781271 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.013200
Delete 297 DUPof 11757250 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 319 DUPof 11174680 a 0,73 b 0,73 hang 0,0 diff 0,0 error 0.000000
.
.
.
Delete 132295695 DUPof 211765973 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 132296968 DUPof 181491499 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 132297966 DUPof 159665067 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 132304543 DUPof 155518568 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 132307934 DUPof 134266938 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 132309546 DUPof 179301753 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 132313400 DUPof 153142824 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 132319681 DUPof 132368976 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000
Delete 132323752 DUPof 165992623 a 0,76

(this is exactly how it stopped)

Can I maybe run the deduplicate command manually and make use of only the overlaps in the dupStore? When I tried to start CA again it continued with finalTrim, so I removed the *.deduplicate.log etc. files before I restarted CA.

It would be great if you could help me out! Thanks!!

cheers,
Christoph

On 07/03/2012 06:44 PM, Walenz, Brian wrote:
> Hi, Christoph-
>
> Are you using CA7 or CVS?
>
> This behavior was introduced to CVS on May 21, and fixed on the 29th. The
> bug appeared after an optimization in loading overlaps was made - only overlaps
> in the 'dupStore' are needed, the 'obtStore' can be ignored. This
> eliminated a huge amount of I/O and overhead from the dedupe compute.
>
> If updating CVS doesn't fix the problem, can you send some of the logging
> from deduplicate?
>
> b
>
>
> On 7/3/12 6:28 AM, "Christoph Hahn" <chr...@gm...> wrote:
>
>> Dear developers and users,
>>
>> I am encountering some problems in the deduplicate step. Unfortunately,
>> the memory usage is steadily increasing until the process dies because
>> it exceeds the memory limit. So far, I used up to 32 GB.
>> I could of course
>> just further increase the available memory, but I was wondering if there
>> is a possibility to fix and/or predict the maximum memory usage for this
>> step (and maybe also for the next steps) beforehand.
>>
>> Thanks for your help!
>>
>> much obliged,
>> Christoph
>>
>> University of Oslo, Norway
>>
>> _______________________________________________
>> wgs-assembler-users mailing list
>> wgs...@li...
>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
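
To see how far deduplicate got before it was killed, the report lines quoted in this message can be counted with a short script. This is only a minimal sketch: the record pattern is inferred from the log excerpt in this thread, not from the deduplicate source, and the names `RECORD_RE`, `LogSummary` and `summarize_dedup_log` are hypothetical helpers, not part of Celera Assembler.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Pattern for one complete report record. It is inferred only from the
# excerpt quoted in this thread (an assumption, not the tool's specification).
RECORD_RE = re.compile(
    r"^Delete (\d+) DUPof (\d+) "
    r"a \d+,\d+ b \d+,\d+ "
    r"hang -?\d+,-?\d+ diff -?\d+,-?\d+ "
    r"error \d+\.\d+$"
)


class LogSummary(NamedTuple):
    complete: int                   # lines that match the full record pattern
    incomplete: int                 # non-empty lines that do not match (e.g. cut off)
    last_deleted_id: Optional[int]  # read ID from the last complete record, if any


def summarize_dedup_log(lines: Iterable[str]) -> LogSummary:
    """Count complete and incomplete records in a deduplicate report."""
    complete = 0
    incomplete = 0
    last_deleted_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        match = RECORD_RE.match(line)
        if match:
            complete += 1
            last_deleted_id = int(match.group(1))
        else:
            incomplete += 1
    return LogSummary(complete, incomplete, last_deleted_id)
```

Since a file object iterates over its lines, the function can be called as `summarize_dedup_log(open("salaris.deduplicate.log"))`. A non-zero `incomplete` count would be consistent with the process having been killed in the middle of writing a record, as with the last line shown above.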