Hello,
I am Ted Kalbfleisch with the University of Louisville.
I am running wgs-8.2. My command line inputs were
/home/tskalb01/wgs-8.2/Linux-amd64/bin/runCA \
  ovlMerThreshold=75 gkpFixInsertSizes=0 ovlMerSize=30 cgwErrorRate=0.15 \
  ovlHashBits=24 ovlHashBlockLength=110000000 ovlCorrBatchSize=145904 \
  unitigger=bog -p genome -d CA merylThreads=32 merylMemory=668467 \
  frgCorrThreads=1 frgCorrConcurrency=32 cnsConcurrency=9 \
  ovlCorrConcurrency=32 ovlConcurrency=32 ovlThreads=1 \
  doFragmentCorrection=1 doOverlapBasedTrimming=1 doExtendClearRanges=1 \
  ovlMerSize=22 \
  /scratch/large/tskalb01/Twilight/TwilightSanger/10k.frg \
  /scratch/large/tskalb01/Twilight/TwilightSanger/40k.frg \
  /scratch/large/tskalb01/Twilight/TwilightSanger/4k.frg \
  /scratch/large/tskalb01/Twilight/TwilightSanger/singletons.frg \
  /scratch/large/tskalb01/Twilight/assembly/2015_02_10/superReadSequences_shr.frg
I am running it on a fat node with 32 cores and 720 GB of RAM. For the last week, the run has been in the extendClearRanges step. The tail of the genome.tigStore directory listing is below.
tskalb01@public$ ls -lrt /scratch/large/tskalb01/Twilight/assembly/2015_02_13/CA/genome.tigStore/ | tail
-rw------- 1 tskalb01 unixuser 156402308 Feb 27 17:50 seqDB.v014.utg
-rw------- 1 tskalb01 unixuser 157446068 Feb 27 17:50 seqDB.v014.ctg
-rw------- 1 tskalb01 unixuser 223433774 Feb 27 17:50 seqDB.v014.dat
-rw------- 1 tskalb01 unixuser 156402308 Mar 3 05:20 seqDB.v015.utg
-rw------- 1 tskalb01 unixuser 157482116 Mar 3 05:20 seqDB.v015.ctg
-rw------- 1 tskalb01 unixuser 1549260086 Mar 3 05:20 seqDB.v015.dat
-rw------- 1 tskalb01 unixuser 156402308 Mar 6 15:11 seqDB.v016.utg
-rw------- 1 tskalb01 unixuser 157513268 Mar 6 15:11 seqDB.v016.ctg
-rw------- 1 tskalb01 unixuser 1842914932 Mar 6 15:11 seqDB.v016.dat
-rw------- 1 tskalb01 unixuser 442769720 Mar 8 12:02 seqDB.v017.dat
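Each seqDB.vNNN set above is one completed iteration's output, so the timestamps show how fast the step is actually moving. A small helper for watching that (my own sketch, not part of runCA; the example path is the tigStore from this run):

```shell
# Hypothetical helper: list a tigStore's seqDB data files oldest-first,
# so you can see how often a new version (i.e. a finished iteration)
# appears.
list_tig_versions() {
    # $1 = path to the genome.tigStore directory
    ls -rt "$1"/seqDB.v*.dat 2>/dev/null | sed 's|.*/||'
}
# Example for this run:
# list_tig_versions /scratch/large/tskalb01/Twilight/assembly/2015_02_13/CA/genome.tigStore
```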
What I see in top on the fat node is
top - 12:09:46 up 78 days, 16:02, 1 user, load average: 1.00, 1.00, 1.00
Tasks: 486 total, 1 running, 485 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.5%sy, 0.0%ni, 99.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 794025324k total, 17937868k used, 776087456k free, 1968k buffers
Swap: 0k total, 0k used, 0k free, 2053244k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20830 tskalb01 20 0 9.9g 9.7g 2088 D 6.6 1.3 293:03.07 extendClearRang
26303 tskalb01 20 0 17384 1664 1016 R 0.3 0.0 0:00.05 top
4959 tskalb01 20 0 13396 1556 1272 S 0.0 0.0 0:00.00 bash
4988 tskalb01 20 0 9196 1248 1044 S 0.0 0.0 0:00.00 12393601.queues
4989 tskalb01 20 0 42400 9048 2264 S 0.0 0.0 0:00.06 perl
20828 tskalb01 20 0 9196 1232 1040 S 0.0 0.0 0:00.00 sh
20829 tskalb01 20 0 9196 1232 1032 S 0.0 0.0 0:00.00 extendClearRang
This seems like very low CPU usage. Any suggestions you could provide would be greatly appreciated.
Best regards,
Ted
Ted Kalbfleisch
Assistant Professor
School of Medicine
University of Louisville
221 F Baxter II
580 South Preston Street
Louisville, KY
40202
ted.kalbfleisch@louisville.edu
502-852-7495
Hi-
It seems to be suffering from a poorly cached gkpStore. To load it into cache:
% cat gkpStore/s?? > /dev/null
% cat gkpStore/q?? > /dev/null
It should get up to one CPU - sadly, this isn't multi-threaded.
Is this equivalent?
cat genome.gkpStore/s* > /dev/null
cat genome.gkpStore/q* > /dev/null
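The same idea can be written more generally: read every file in the store once so its pages land in the Linux page cache, after which extendClearRanges' random reads are served from memory. A sketch only; the helper name is mine and the path is the gkpStore from this run:

```shell
# Hypothetical helper: pre-read every regular file under a store
# directory to pull it into the page cache.
cache_warm() {
    # $1 = store directory (e.g. CA/genome.gkpStore)
    find "$1" -type f -exec cat {} + > /dev/null
}
# cache_warm /scratch/large/tskalb01/Twilight/assembly/2015_02_13/CA/genome.gkpStore
```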
This still isn't working (see the top listing below). How much bang for the buck am I getting out of the extendClearRanges step anyway? I would certainly prefer that it run through to completion, but is it reasonable for step seven to take on the order of two weeks?
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20830 tskalb01 20 0 10.5g 10g 2088 D 2.7 1.4 407:31.74 extendClearRang
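For what it's worth, the 'D' state shown for extendClearRang in both top listings means uninterruptible sleep, which on Linux almost always means the process is blocked on disk I/O rather than starved for CPU. One way to confirm (a sketch using only /proc, so Linux-specific; the helper name is mine):

```shell
# Print a process's scheduler state and, where the kernel exposes it,
# its cumulative I/O counters. State 'D' together with growing
# read_bytes indicates the process is disk-bound.
proc_io_snapshot() {
    # $1 = PID to inspect (e.g. 20830 from the listing above)
    grep '^State:' "/proc/$1/status"
    cat "/proc/$1/io" 2>/dev/null
}
# proc_io_snapshot 20830
```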