Dear Support,
I am running the WGS 7.0 runCA pipeline on long PacBio reads that were corrected using Illumina paired-end (1 library) and mate-pair (3 libraries) data. The assembly is currently in the unitigging step, and there has been no output since May 18. However, the process list shows the following command still running:
/Apps/smrtanalysis/analysis/bin/wgs-7.0/Linux-amd64/bin/buildUnitigs -O /home/pacbioNewB/pacbioNewB.ovlStore -G /home/pacbioNewB.gkpStore -T /home/pacbioNewB/pacbioNewB.tigStore -B 2796663 -e 0.25 -E 4.5 -b -m 7 -U -o /home/pacbioNewB/4-unitigger/pacbioNewB
The 4-unitigger directory contents are as follows:
-rw-r--r-- 1 neeraja root 4.1G May 16 02:13 pacbioNewB.fragmentInfo
-rw-r--r-- 1 neeraja root 51 May 16 19:26 pacbioNewB.001.bestoverlapgraph-containments.log
-rw-r--r-- 1 neeraja root 0 May 16 19:26 pacbioNewB.002.bestoverlapgraph-dovetails.log
-rw-r--r-- 1 neeraja root 548M May 17 02:50 best.singletons
-rw-r--r-- 1 neeraja root 6.2G May 17 02:50 best.contains
-rw-r--r-- 1 neeraja root 1.1G May 17 02:50 best.edges
-rw-r--r-- 1 neeraja root 87G May 17 03:51 pacbioNewB.bog
-rw-r--r-- 1 neeraja root 0 May 17 20:10 pacbioNewB.003.ChunkGraph.log
-rw-r--r-- 1 neeraja root 95 May 17 20:20 pacbioNewB.004.buildUnitigs.log
-rw-r--r-- 1 neeraja root 34K May 18 11:32 pacbioNewB.005.bubblePopping.log
-rw-r--r-- 1 neeraja root 28M May 18 11:35 pacbioNewB.breaks.ovl
-rw-r--r-- 1 neeraja root 22 May 18 11:35 pacbioNewB.006.intersectionBreaking.log
-rw-r--r-- 1 neeraja root 1.5K May 18 11:40 pacbioNewB.007.placeContains.log
-rw-r--r-- 1 neeraja root 3.0M May 18 11:40 pacbioNewB.008.placeZombies.log
-rw-r--r-- 1 neeraja root 2.4K May 18 13:10 pacbioNewB.009.bubblePopping.log
-rw-r--r-- 1 neeraja root 404 May 18 13:11 pacbioNewB.010.libraryStats.log
-rw-r--r-- 1 neeraja root 1.4K May 18 13:21 pacbioNewB.011.evaluateMates.log
-rw-r--r-- 1 neeraja root 1.3K May 18 13:39 pacbioNewB.012.moveContains1.log
-rw-r--r-- 1 neeraja root 1.3K May 18 13:47 pacbioNewB.013.splitDiscontinuous1.log
-rw-r--r-- 1 neeraja root 101M May 18 14:09 pacbioNewB.014.splitBadMates.log
-rw-r--r-- 1 neeraja root 1.3K May 18 14:20 pacbioNewB.015.splitDiscontinuous2.log
-rw-r--r-- 1 neeraja root 0 May 18 14:20 pacbioNewB.016.moveContains2.log
-rw-r--r-- 1 neeraja root 2.2K May 18 14:20 unitigger.err
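One rough way to tell a slow step from a stalled one is to check when anything in the working directory was last written. The following Python sketch is an illustration only. The function names are hypothetical and not part of runCA, and a long gap since the last write does not prove a hang.

```python
import os
import time


def newest_files(directory, count=3):
    """Return the `count` most recently modified files as (mtime, name) pairs."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((os.path.getmtime(path), name))
    entries.sort(reverse=True)  # newest first
    return entries[:count]


def hours_since_last_write(directory, now=None):
    """Hours since any file in `directory` was modified, or None if it holds no files."""
    latest = newest_files(directory, count=1)
    if not latest:
        return None
    now = time.time() if now is None else now
    return (now - latest[0][0]) / 3600.0
```

Calling `hours_since_last_write("/home/pacbioNewB/4-unitigger")` at intervals would show whether the gap keeps growing. In the listing above, the newest files are `pacbioNewB.016.moveContains2.log` and `unitigger.err`, both dated May 18 14:20.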
The spec file I am using is as follows:
utgErrorRate = 0.25
utgErrorLimit = 4.5
cnsErrorRate = 0.25
cgwErrorRate = 0.25
ovlErrorRate = 0.25
merSize=14
merylMemory = 128000
merylThreads = 16
ovlStoreMemory = 8192
useGrid = 1
scriptOnGrid = 1
frgCorrOnGrid = 1
ovlCorrOnGrid = 1
sge = -A assembly
sgeScript = -pe mvapich2 16
sgeConsensus = -pe mvapich2 1
sgeOverlap = -pe mvapich2 2
sgeFragmentCorrection = -pe mvapich2 2
sgeOverlapCorrection = -pe mvapich2 1
ovlHashBits = 25
ovlThreads = 2
ovlHashBlockLength = 20000000
ovlRefBlockSize = 50000000
merCompression = 1
merOverlapperSeedBatchSize = 500000
merOverlapperExtendBatchSize = 250000
frgCorrThreads = 2
frgCorrBatchSize = 100000
ovlCorrBatchSize = 100000
merylMemory = 128000
merylThreads = 4
ovlStoreMemory = 8192
ovlConcurrency = 8
cnsConcurrency = 8
merOverlapperThreads = 3
merOverlapperSeedConcurrency = 3
merOverlapperExtendConcurrency = 3
frgCorrConcurrency = 2
ovlCorrConcurrency = 4
cnsConcurrency = 4
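Several options in the spec above appear more than once, some with different values (for example `merylThreads` is 16 and then 4, and `cnsConcurrency` is 8 and then 4). A minimal Python sketch for listing such repeats is shown here. The helper name is hypothetical, the parsing assumes simple `key = value` lines with `#` comments, and nothing in this thread says which of the repeated values runCA actually uses.

```python
from collections import defaultdict


def find_duplicate_keys(spec_text):
    """Map each option that appears more than once to its values, in file order."""
    values = defaultdict(list)
    for line in spec_text.splitlines():
        # Assumption: '#' starts a comment; skip blank and non-assignment lines.
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()].append(value.strip())
    return {key: vals for key, vals in values.items() if len(vals) > 1}
```

Running it on the spec text above would report `merylMemory`, `merylThreads`, `ovlStoreMemory`, and `cnsConcurrency` as repeated.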
Is this step expected to take this long?
Thank you in advance for your help.
Neeraja Krishnan
Sr. Scientist at GANIT Labs
IBAB Campus, Electronic City Phase I
Bangalore 560100
This is very regrettable. The "moveContains2" step should be fast. Since its log is empty and nothing has been written since May 18, it looks like an infinite loop. We have not encountered this problem before and we don't have a diagnosis. As a work-around, you could try the other unitig module: kill the process, move or delete the 4-unitigger directory, edit the spec file to say utg=bogart instead of utg=bog, and restart runCA.
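The file-handling parts of that work-around could be sketched in Python as follows. This is a hedged illustration, not a runCA tool. Both helper names are hypothetical, and the option name `utg` is taken from the reply as written, so check the runCA option list for the exact spelling in your version. Killing the stuck process and restarting runCA are left as manual steps.

```python
import os
import re
import time


def set_spec_option(spec_text, key, value):
    """Return spec_text with each `key = ...` line set to `key = value`; append if absent."""
    # Match only the exact option name, so `utg` does not touch `utgErrorRate`.
    pattern = re.compile(r"^[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE)
    new_line = f"{key} = {value}"
    if pattern.search(spec_text):
        return pattern.sub(lambda match: new_line, spec_text)
    separator = "" if (not spec_text or spec_text.endswith("\n")) else "\n"
    return spec_text + separator + new_line + "\n"


def move_aside(directory):
    """Rename `directory` with a timestamp suffix instead of deleting it; return the new path."""
    backup = f"{directory}.stalled-{time.strftime('%Y%m%d-%H%M%S')}"
    os.rename(directory, backup)
    return backup
```

Moving the directory aside rather than deleting it keeps the existing logs available in case they are needed later for diagnosis.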