From: Ole K. T. <o.k...@bi...> - 2012-04-12 07:52:39
Hi, Brian.

Thank you for your help so far, but I seem to be missing something. I did this:

tigStore -g *gkpStore -t *tigStore 14 -c 107652 -d layout3 > ctg1076523
tigStore -g *gkpStore -t *tigStore 15 -cp 36 -R ctg1076523

But when I dump the same contig from version 15:

tigStore -g *gkpStore -t *tigStore 15 -c 107652 -d layout3 > ctg1076523_v15

it's without consensus sequence:

contig 1076523
len 0
cns
qlt
data.unitig_coverage_stat -9874.792662
data.unitig_microhet_prob 0.000000
data.unitig_status X
data.unitig_unique_rept X
data.contig_status U
data.num_frags 14302
data.num_unitigs 1

Then if I dump it from version 16, it's identical to the one from version 14 (that is, with consensus). I've tried loading it several times, but each time I dump it again it has lost its consensus.

Do you know what I'm doing wrong?

Ole

On 11 April 2012 20:54, Walenz, Brian <bw...@jc...> wrote:
> Hi, Ole-
>
> Yes, I overlooked a step. In the contig you insert to the latest version,
> update the 'data.contig_status' with what the second to last version has.
>
> FYI, the tigStore should have versions such as:
>
> seqDB.v014.ctg
> seqDB.v014.dat
> seqDB.v014.utg
>
> seqDB.v015.ctg
> seqDB.v015.p001.ctg
> seqDB.v015.p001.dat
> (etc)
> seqDB.v015.utg
>
> seqDB.v016.ctg
> seqDB.v016.p001.ctg
> seqDB.v016.p001.dat
> (etc)
> seqDB.v016.utg
>
> (the v numbers will of course be different in your assembly)
>
> v015 contains the output of scaffolder, which is the input to consensus.
> Contigs here have no consensus sequence, but otherwise all the data is
> present. It is largely just rewriting the data from v014 into partitions
> (p###), so each consensus job can load a single file instead of randomly
> accessing a large file. The status flag on each unitig/contig is also set.
> This flag tells if the unitig/contig was placed in a scaffold, is a
> surrogate, degenerate, etc.
>
> v016 is the output of consensus, the input to terminator. All terminator
> does is to repackage this into ASCII files.
>
> To summarize: grab the contig from v014 (the last with a consensus
> sequence), the status flag from v015, change the status flag in the contig
> you grabbed, and then insert the contig into v016.
>
> by doing this, you'll lose VAR records for this contig, but otherwise the
> consensus sequence is the same (or largely the same; variant detection can
> change it a bit).
>
> b
>
>
> On 4/11/12 6:23 AM, "Ole Kristian Tørresen" <o.k...@bi...> wrote:
>
>> Hi Brian,
>> ctgcns completed now, but I got an error with asmOutputFasta. From
>> 9-terminator/asmOutputFasta.err:
>> ERROR: Illegal unitigpos type type value 'X' (CCO) at line 1676575956
>>
>> Is this connected with the procedure I did with inserting the contig
>> from an older tigStore?
>>
>> Thank you for your help so far.
>>
>> Ole
>>
>> On 11 April 2012 08:13, Ole Kristian Tørresen <o.k...@bi...> wrote:
>>> Hi Brian.
>>>
>>> I've done this, and rerunning ctgcns on that last partition. I'll send
>>> the layout and log in a separate email.
>>>
>>> Ole
>>>
>>> On 10 April 2012 21:37, Walenz, Brian <bw...@jc...> wrote:
>>>> Hi Ole-
>>>>
>>>> I don't see anything that looks like an error in the log, so I'll have to
>>>> assume it crashed. You report it runs for 20 hours, which is odd for contig
>>>> consensus, unless that contig is very very deep. If so, the ctgcns process
>>>> will also be large. Do you know how big the process was?
>>>>
>>>> Can you make the full log available?
>>>>
>>>> It is possible to force the contig to have a consensus sequence. If the job
>>>> did crash, the other contigs will still need to have consensus generated.
>>>>
>>>> The process is the same as editing a unitig in the tigStore: dump the contig
>>>> in question, edit the file to have a consensus sequence, then load that
>>>> contig back into the tigStore. A consensus sequence for this contig can be
>>>> found in one of the earlier tigStore versions; the version just before this
>>>> one will probably have it.
>>>> That makes our process even easier: dump the
>>>> version with a consensus sequence, and load it back into the latest version.
>>>>
>>>> A sketch of the steps:
>>>>
>>>> 1) Dump the previous version of the contig. Check that 'file' does contain
>>>> a consensus sequence.
>>>>
>>>> tigStore -g *gkpStore -t *tigStore <vers-1> -c <ctgID> -d layout > file
>>>>
>>>> 2) Load that previous version into the tigStore as the latest version
>>>>
>>>> tigStore -g *gkpStore -t *tigStore <vers> <part> -c <ctgID> -R file
>>>>
>>>> Notice that this tigStore command specifies both a version and a partition
>>>> for the tigStore.
>>>>
>>>> 3) Rerun consensus.sh on that partition. It will not attempt to compute the
>>>> consensus for that contig.
>>>>
>>>> I'd be interested in seeing the contig you dump, if only to verify that it
>>>> is deep.
>>>>
>>>> b
>>>>
>>>>
>>>>
>>>> On 4/10/12 4:05 AM, "Ole Kristian Tørresen" <o.k...@bi...> wrote:
>>>>
>>>>> Hi,
>>>>> I'm having some problems while doing some low coverage sequencing
>>>>> assembly testing. I've tried to assemble about 10x coverage of 150 nt
>>>>> paired Illumina reads of 500 bp fragment size. These are from the
>>>>> parrot used in the Assemblathon 2
>>>>> (http://assemblathon.org/pages/download-data). Everything seems to run
>>>>> fine, until contig consensus, where 1 partition just doesn't succeed. It
>>>>> seems to run for quite some time (20 hours or something) before
>>>>> failing. These are the last 20 lines from the output of the ctgcns
>>>>> partition that fails:
>>>>> Alignment params: 297 333 200 200 0 0.12 1e-06 30 1
>>>>> -- e/l = 7/112 = 6.25%
>>>>> A -----+------+----> []
>>>>> B 332 -------> 40 []
>>>>> GetAlignmentTrace()-- Overlap ACCEPTED! accept=1000.000000
>>>>> lScore=0.026087 (112 vs 115) aScore=0.160000 (332 vs 316)
>>>>> bScore=0.150000 (-42 vs -27). (CONTIGF)
>>>>> GetAlignmentTrace()-- Overlap found between 1076523 (U) and 25763657
>>>>> (R) expected hangs: a=316 b=-27 erate=0.060000 aligner=Local_Overlap
>>>>> GetAlignmentTrace()-- Overlap ACCEPTED! accept=1000.000000
>>>>> lScore=0.026087 (112 vs 115) aScore=0.160000 (332 vs 316)
>>>>> bScore=0.150000 (-42 vs -27). (CONTIGF)
>>>>> Local_Overlap_AS_forCNS found overlap between 1076523 (U) and 25763657
>>>>> (R) ahang: 332, bhang: -42 (expected hang was 316)
>>>>> Alignment params: 298 334 200 200 0 0.12 1e-06 30 1
>>>>> -- e/l = 6/112 = 5.36%
>>>>> A -----+------+----> []
>>>>> B 332 -------> 42 []
>>>>> GetAlignmentTrace()-- Overlap ACCEPTED! accept=1000.000000
>>>>> lScore=0.009009 (110 vs 111) aScore=0.140000 (332 vs 318)
>>>>> bScore=0.130000 (-42 vs -29). (CONTIGF)
>>>>> GetAlignmentTrace()-- Overlap found between 1076523 (U) and 57537697
>>>>> (R) expected hangs: a=318 b=-29 erate=0.060000 aligner=Local_Overlap
>>>>> GetAlignmentTrace()-- Overlap ACCEPTED! accept=1000.000000
>>>>> lScore=0.009009 (110 vs 111) aScore=0.140000 (332 vs 318)
>>>>> bScore=0.130000 (-42 vs -29). (CONTIGF)
>>>>> Local_Overlap_AS_forCNS found overlap between 1076523 (U) and 57537697
>>>>> (R) ahang: 332, bhang: -42 (expected hang was 318)
>>>>> Alignment params: 300 336 200 200 0 0.12 1e-06 30 1
>>>>> -- e/l = 6/110 = 5.45%
>>>>> A -----+------+----> []
>>>>> B 332 -------> 42 []
>>>>>
>>>>> This is the error message:
>>>>> at /usit/titan/u1/olekto/src/wgs-7.0/Linux-amd64/bin/runCA line 1237
>>>>> main::caFailure('1 consensusAfterScaffolder jobs failed; remove
>>>>> 8-consensus/co...', undef) called at
>>>>> /usit/titan/u1/olekto/src/wgs-7.0/Linux-amd64/bin/runCA line 5142
>>>>> main::postScaffolderConsensus() called at
>>>>> /usit/titan/u1/olekto/src/wgs-7.0/Linux-amd64/bin/runCA line 5885
>>>>>
>>>>> ----------------------------------------
>>>>> Failure message:
>>>>>
>>>>> 1 consensusAfterScaffolder jobs failed; remove
>>>>> 8-consensus/consensus.sh to try again
>>>>>
>>>>> I've tried removing consensus.sh and running again, but get the same error.
>>>>>
>>>>> This is the spec file:
>>>>> utgErrorRate=0.03
>>>>> utgErrorLimit=2.5
>>>>> ovlErrorRate=0.06
>>>>> cnsErrorRate=0.06
>>>>> cgwErrorRate=0.10
>>>>> merSize = 22
>>>>> overlapper=ovl
>>>>> unitigger = bogart
>>>>> merylMemory = 128000
>>>>> merylThreads = 16
>>>>> merOverlapperThreads = 2
>>>>> merOverlapperExtendConcurrency = 8
>>>>> merOverlapperSeedConcurrency = 8
>>>>> ovlThreads = 2
>>>>> mbtThreads = 2
>>>>> mbtConcurrency = 8
>>>>> ovlConcurrency = 8
>>>>> ovlCorrConcurrency = 16
>>>>> ovlRefBlockSize = 32000000
>>>>> ovlHashBits = 24
>>>>> ovlHashBlockLength = 800000000
>>>>> ovlStoreMemory = 128000
>>>>> frgCorrThreads = 2
>>>>> frgCorrConcurrency = 8
>>>>> ovlCorrBatchSize = 1000000
>>>>> ovlCorrConcurrency = 16
>>>>> cnsConcurrency = 16
>>>>> doExtendClearRanges = 0
>>>>>
>>>>> I don't need to have that unitig (1076523 (U)) in my finished
>>>>> assembly, so it's possible to just remove it as long as I get a
>>>>> finished assembly.
>>>>> I've also tried to just create the .success file,
>>>>> but then terminator fails.
>>>>>
>>>>> Does anyone have any ideas of what I might do differently? Can I just
>>>>> remove that unitig and proceed? How do I do that?
>>>>>
>>>>> Sincerely,
>>>>> Ole Kristian Tørresen
>>>>> PhD student
>>>>> University of Oslo
>>>>>
>>>>> _______________________________________________
>>>>> wgs-assembler-users mailing list
>>>>> wgs...@li...
>>>>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
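
For readers following Brian's summary (take the contig that has a consensus sequence from the earlier version, copy the status flag from the scaffolder version, then reload it), the file-editing step could be sketched in Python roughly as below. This is only an illustration and not part of the assembler's tools. It assumes the layout dump is plain text with one "key value" pair per line, as the dump quoted in this thread suggests. The helper names parse_fields, has_consensus and set_contig_status are hypothetical, and the sample dumps and their status values are made up for the demonstration.

```python
def parse_fields(layout_text):
    """Return {key: value} for the 'key value' lines of a layout dump.

    Assumption: one field per line; only the first occurrence of a key is kept.
    """
    fields = {}
    for line in layout_text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            fields.setdefault(key, value.strip())
    return fields


def has_consensus(layout_text):
    """True if the dump reports a non-zero length and a non-empty 'cns' line."""
    fields = parse_fields(layout_text)
    try:
        length = int(fields.get("len", "0"))
    except ValueError:
        return False
    return length > 0 and bool(fields.get("cns"))


def set_contig_status(layout_text, status):
    """Return a copy of the dump with the data.contig_status value replaced."""
    out = []
    replaced = False
    for line in layout_text.splitlines():
        if line.startswith("data.contig_status"):
            line = "data.contig_status " + status
            replaced = True
        out.append(line)
    if not replaced:
        raise ValueError("no data.contig_status line found")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Made-up miniature dumps standing in for the real files.
    older = "contig 1\nlen 4\ncns ACGT\nqlt 0000\ndata.contig_status X\n"
    scaffolded = "contig 1\nlen 0\ncns\nqlt\ndata.contig_status U\n"

    # Step 1 of the sketch: confirm the older dump really has a consensus.
    print("older has consensus:", has_consensus(older))
    print("scaffolded has consensus:", has_consensus(scaffolded))

    # Copy the status flag from the scaffolder-version dump into the older one.
    status = parse_fields(scaffolded)["data.contig_status"]
    print(set_contig_status(older, status), end="")
```

The patched text would then be written to a file and loaded with the tigStore ... -R command quoted in the thread; the code above does not call tigStore itself.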