#8 Unitigging going on for more than a week

Milestone: v1.0_(example)
Status: open
Priority: 5
Updated: 2014-07-25
Created: 2013-05-27
Private: No

Dear Support,

I am running the WGS 7.0 runCA pipeline on long PacBio reads that were corrected using Illumina paired-end (1 library) and mate-pair (3 libraries) data. The assembly is currently in the unitigging step, and there has been no output since May 18. However, the process list shows the following command still running:

/Apps/smrtanalysis/analysis/bin/wgs-7.0/Linux-amd64/bin/buildUnitigs -O /home/pacbioNewB/pacbioNewB.ovlStore -G /home/pacbioNewB.gkpStore -T /home/pacbioNewB/pacbioNewB.tigStore -B 2796663 -e 0.25 -E 4.5 -b -m 7 -U -o /home/pacbioNewB/4-unitigger/pacbioNewB

The 4-unitigger directory contents are as follows:
-rw-r--r-- 1 neeraja root 4.1G May 16 02:13 pacbioNewB.fragmentInfo
-rw-r--r-- 1 neeraja root 51 May 16 19:26 pacbioNewB.001.bestoverlapgraph-containments.log
-rw-r--r-- 1 neeraja root 0 May 16 19:26 pacbioNewB.002.bestoverlapgraph-dovetails.log
-rw-r--r-- 1 neeraja root 548M May 17 02:50 best.singletons
-rw-r--r-- 1 neeraja root 6.2G May 17 02:50 best.contains
-rw-r--r-- 1 neeraja root 1.1G May 17 02:50 best.edges
-rw-r--r-- 1 neeraja root 87G May 17 03:51 pacbioNewB.bog
-rw-r--r-- 1 neeraja root 0 May 17 20:10 pacbioNewB.003.ChunkGraph.log
-rw-r--r-- 1 neeraja root 95 May 17 20:20 pacbioNewB.004.buildUnitigs.log
-rw-r--r-- 1 neeraja root 34K May 18 11:32 pacbioNewB.005.bubblePopping.log
-rw-r--r-- 1 neeraja root 28M May 18 11:35 pacbioNewB.breaks.ovl
-rw-r--r-- 1 neeraja root 22 May 18 11:35 pacbioNewB.006.intersectionBreaking.log
-rw-r--r-- 1 neeraja root 1.5K May 18 11:40 pacbioNewB.007.placeContains.log
-rw-r--r-- 1 neeraja root 3.0M May 18 11:40 pacbioNewB.008.placeZombies.log
-rw-r--r-- 1 neeraja root 2.4K May 18 13:10 pacbioNewB.009.bubblePopping.log
-rw-r--r-- 1 neeraja root 404 May 18 13:11 pacbioNewB.010.libraryStats.log
-rw-r--r-- 1 neeraja root 1.4K May 18 13:21 pacbioNewB.011.evaluateMates.log
-rw-r--r-- 1 neeraja root 1.3K May 18 13:39 pacbioNewB.012.moveContains1.log
-rw-r--r-- 1 neeraja root 1.3K May 18 13:47 pacbioNewB.013.splitDiscontinuous1.log
-rw-r--r-- 1 neeraja root 101M May 18 14:09 pacbioNewB.014.splitBadMates.log
-rw-r--r-- 1 neeraja root 1.3K May 18 14:20 pacbioNewB.015.splitDiscontinuous2.log
-rw-r--r-- 1 neeraja root 0 May 18 14:20 pacbioNewB.016.moveContains2.log
-rw-r--r-- 1 neeraja root 2.2K May 18 14:20 unitigger.err
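
One way to confirm that nothing has been written for a while is to look at the newest modification time in the output directory. The following is a minimal Python sketch of that check; the helper names `newest_file` and `hours_since_last_write` are made up for illustration and are not part of the assembler.

```python
import os
import time


def newest_file(directory):
    """Return (path, mtime) of the most recently modified regular file, or None."""
    newest = None
    for entry in os.scandir(directory):
        if entry.is_file():
            mtime = entry.stat().st_mtime
            if newest is None or mtime > newest[1]:
                newest = (entry.path, mtime)
    return newest


def hours_since_last_write(directory, now=None):
    """Hours elapsed since any file in the directory was last modified."""
    found = newest_file(directory)
    if found is None:
        return None  # empty directory: nothing to compare against
    now = time.time() if now is None else now
    return (now - found[1]) / 3600.0
```

Calling `hours_since_last_write("/home/pacbioNewB/4-unitigger")` (the path is taken from the `-o` argument above) would report how long the directory has been idle. A long idle time alone does not prove a hang, since a step may simply compute for a long time without logging.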

The spec file I am using is as follows:

# original asm settings

utgErrorRate = 0.25
utgErrorLimit = 4.5

cnsErrorRate = 0.25
cgwErrorRate = 0.25
ovlErrorRate = 0.25

merSize=14

merylMemory = 128000
merylThreads = 16

ovlStoreMemory = 8192

# grid info

useGrid = 1
scriptOnGrid = 1
frgCorrOnGrid = 1
ovlCorrOnGrid = 1

sge = -A assembly
sgeScript = -pe mvapich2 16
sgeConsensus = -pe mvapich2 1
sgeOverlap = -pe mvapich2 2
sgeFragmentCorrection = -pe mvapich2 2
sgeOverlapCorrection = -pe mvapich2 1

ovlMemory=8GB --hashload 0.7

ovlHashBits = 25
ovlThreads = 2
ovlHashBlockLength = 20000000
ovlRefBlockSize = 50000000

# for mer overlapper

merCompression = 1
merOverlapperSeedBatchSize = 500000
merOverlapperExtendBatchSize = 250000

frgCorrThreads = 2
frgCorrBatchSize = 100000

ovlCorrBatchSize = 100000

# non-Grid settings, if you set useGrid to 0 above these will be used

merylMemory = 128000
merylThreads = 4

ovlStoreMemory = 8192

ovlConcurrency = 8

cnsConcurrency = 8

merOverlapperThreads = 3
merOverlapperSeedConcurrency = 3
merOverlapperExtendConcurrency = 3

frgCorrConcurrency = 2
ovlCorrConcurrency = 4
cnsConcurrency = 4
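
Several options in the spec file above are assigned more than once (for example `merylThreads` as 16 and then 4, and `cnsConcurrency` as 8 and then 4). A small Python sketch for listing such repeats is shown here; `repeated_spec_keys` is a hypothetical helper, and it assumes the spec is plain `key = value` lines with `#` comments. It only reports the repeats and makes no claim about which assignment runCA actually uses.

```python
from collections import defaultdict


def repeated_spec_keys(spec_text):
    """Map each key assigned more than once to its values, in file order."""
    values = defaultdict(list)
    for line in spec_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue  # blank lines and free-text section labels
        key, value = line.split("=", 1)
        values[key.strip()].append(value.strip())
    return {k: v for k, v in values.items() if len(v) > 1}
```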

Is this step expected to take this long?

Thank you in advance for your help.

Neeraja Krishnan
Sr. Scientist at GANIT Labs
IBAB Campus, Electronic City Phase I
Bangalore 560100

Discussion

  • Jason Miller - 2013-06-04
    • assigned_to: Jason Miller
    • Priority: 9 --> 5
     
  • Jason Miller - 2013-06-04

    This is very regrettable. The "moveContains2" step should be fast. Since nothing has been written recently, this looks like an infinite loop. We have not encountered this problem before and do not have a diagnosis. As a workaround, you could try the other unitig module: kill the process, move or delete the 4-unitigger directory, edit the spec file to say utg=bogart instead of utg=bog, and restart runCA.
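
The file-handling part of this workaround could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe: `move_aside` and `set_spec_option` are hypothetical helpers, the directory is renamed rather than deleted so the stalled output can still be inspected, and the option name is left as a parameter because the exact spelling of the unitigger setting should be checked against the runCA documentation for your version. Killing the process and restarting runCA are left as manual steps.

```python
import os
import shutil


def move_aside(directory, suffix=".bog-stalled"):
    """Rename the directory out of the way instead of deleting it; return the new path."""
    target = directory.rstrip("/") + suffix
    if os.path.exists(target):
        raise FileExistsError(target)
    shutil.move(directory, target)
    return target


def set_spec_option(spec_text, key, value):
    """Return spec text with a single `key = value` line, replacing or appending."""
    out, replaced = [], False
    for line in spec_text.splitlines():
        body = line.split("#", 1)[0]
        if "=" in body and body.split("=", 1)[0].strip() == key:
            if not replaced:
                out.append(f"{key} = {value}")
                replaced = True
            continue  # drop any further assignments of the same key
        out.append(line)
    if not replaced:
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"
```

Since the spec file posted above does not set the unitigger explicitly, `set_spec_option` would append the new line rather than replace an existing one.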

     
