From: Walenz, B. <bw...@jc...> - 2012-04-12 14:51:04
Hi Ole-

The version numbers will be different in different assemblies. Mine came from
a small assembly with little scaffolding work. Larger assemblies can have more
than 100 versions. 'ls -l *tigStore' will show the versions - you want to use
the last three.

b

On 4/12/12 3:52 AM, "Ole Kristian Tørresen" <o.k...@bi...> wrote:

> Hi, Brian.
>
> Thank you for your help so far, but I seem to be missing something.
>
> I did this:
> tigStore -g *gkpStore -t *tigStore 14 -c 107652 -d layout3 > ctg1076523
> tigStore -g *gkpStore -t *tigStore 15 -cp 36 -R ctg1076523
>
> But when I dump the same contig from version 15:
> tigStore -g *gkpStore -t *tigStore 15 -c 107652 -d layout3 > ctg1076523_v15
> it's without consensus sequence:
> contig 1076523
> len 0
> cns
> qlt
> data.unitig_coverage_stat -9874.792662
> data.unitig_microhet_prob 0.000000
> data.unitig_status X
> data.unitig_unique_rept X
> data.contig_status U
> data.num_frags 14302
> data.num_unitigs 1
>
> Then if I dump it from version 16, it's identical to the one from
> version 14 (that is, with consensus). I've tried loading it several
> times, but each time I dump it again it has lost its consensus. Do you
> know what I'm doing wrong?
>
> Ole
>
> On 11 April 2012 20:54, Walenz, Brian <bw...@jc...> wrote:
>> Hi, Ole-
>>
>> Yes, I overlooked a step. In the contig you insert into the latest version,
>> update the 'data.contig_status' with what the second to last version has.
>>
>> FYI, the tigStore should have versions such as:
>>
>> seqDB.v014.ctg
>> seqDB.v014.dat
>> seqDB.v014.utg
>>
>> seqDB.v015.ctg
>> seqDB.v015.p001.ctg
>> seqDB.v015.p001.dat
>> (etc)
>> seqDB.v015.utg
>>
>> seqDB.v016.ctg
>> seqDB.v016.p001.ctg
>> seqDB.v016.p001.dat
>> (etc)
>> seqDB.v016.utg
>>
>> (the v numbers will of course be different in your assembly)
>>
>> v015 contains the output of scaffolder, which is the input to consensus.
>> Contigs here have no consensus sequence, but otherwise all the data is
>> present.
>> It is largely just rewriting the data from v014 into partitions
>> (p###), so each consensus job can load a single file instead of randomly
>> accessing a large file. The status flag on each unitig/contig is also set.
>> This flag tells if the unitig/contig was placed in a scaffold, is a
>> surrogate, degenerate, etc.
>>
>> v016 is the output of consensus, the input to terminator. All terminator
>> does is to repackage this into ASCII files.
>>
>> To summarize: grab the contig from v014 (the last with a consensus
>> sequence), the status flag from v015, change the status flag in the contig
>> you grabbed, and then insert the contig into v016.
>>
>> By doing this, you'll lose VAR records for this contig, but otherwise the
>> consensus sequence is the same (or largely the same; variant detection can
>> change it a bit).
>>
>> b
>>
>>
>> On 4/11/12 6:23 AM, "Ole Kristian Tørresen" <o.k...@bi...> wrote:
>>
>>> Hi Brian,
>>> ctgcns completed now, but I got an error with asmOutputFasta. From
>>> 9-terminator/asmOutputFasta.err:
>>> ERROR: Illegal unitigpos type type value 'X' (CCO) at line 1676575956
>>>
>>> Is this connected with the procedure I did with inserting the contig
>>> from an older tigStore?
>>>
>>> Thank you for your help so far.
>>>
>>> Ole
>>>
>>> On 11 April 2012 08:13, Ole Kristian Tørresen <o.k...@bi...>
>>> wrote:
>>>> Hi Brian.
>>>>
>>>> I've done this, and am rerunning ctgcns on that last partition. I'll
>>>> send the layout and log in a separate email.
>>>>
>>>> Ole
>>>>
>>>> On 10 April 2012 21:37, Walenz, Brian <bw...@jc...> wrote:
>>>>> Hi Ole-
>>>>>
>>>>> I don't see anything that looks like an error in the log, so I'll have to
>>>>> assume it crashed. You report it runs for 20 hours, which is odd for
>>>>> contig consensus, unless that contig is very very deep. If so, the ctgcns
>>>>> process will also be large. Do you know how big the process was?
>>>>>
>>>>> Can you make the full log available?
>>>>>
>>>>> It is possible to force the contig to have a consensus sequence. If the
>>>>> job did crash, the other contigs will still need to have consensus
>>>>> generated.
>>>>>
>>>>> The process is the same as editing a unitig in the tigStore: dump the
>>>>> contig in question, edit the file to have a consensus sequence, then load
>>>>> that contig back into the tigStore. A consensus sequence for this contig
>>>>> can be found in one of the earlier tigStore versions; the version just
>>>>> before this one will probably have it. That makes our process even
>>>>> easier: dump the version with a consensus sequence, and load it back into
>>>>> the latest version.
>>>>>
>>>>> A sketch of the steps:
>>>>>
>>>>> 1) Dump the previous version of the contig. Check that 'file' does
>>>>> contain a consensus sequence.
>>>>>
>>>>> tigStore -g *gkpStore -t *tigStore <vers-1> -c <ctgID> -d layout > file
>>>>>
>>>>> 2) Load that previous version into the tigStore as the latest version.
>>>>>
>>>>> tigStore -g *gkpStore -t *tigStore <vers> <part> -c <ctgID> -R file
>>>>>
>>>>> Notice that this tigStore command specifies both a version and a partition
>>>>> for the tigStore.
>>>>>
>>>>> 3) Rerun consensus.sh on that partition. It will not attempt to compute
>>>>> the consensus for that contig.
>>>>>
>>>>> I'd be interested in seeing the contig you dump, if only to verify that it
>>>>> is deep.
>>>>>
>>>>> b
>>>>>
>>>>>
>>>>>
>>>>> On 4/10/12 4:05 AM, "Ole Kristian Tørresen" <o.k...@bi...>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I'm having some problems while doing some low coverage sequencing
>>>>>> assembly testing. I've tried to assemble about 10x coverage of 150 nt
>>>>>> paired Illumina reads of 500 bp fragment size. These are from the
>>>>>> parrot used in the Assemblathon 2
>>>>>> (http://assemblathon.org/pages/download-data). Everything seems to run
>>>>>> fine, until contig consensus, where 1 partition just doesn't succeed.
>>>>>> It seems to run for quite some time (20 hours or something) before
>>>>>> failing. These are the last 20 lines from the output of the ctgcns
>>>>>> partition that fails:
>>>>>> Alignment params: 297 333 200 200 0 0.12 1e-06 30 1
>>>>>> -- e/l = 7/112 = 6.25%
>>>>>> A -----+------+----> []
>>>>>> B 332 -------> 40 []
>>>>>> GetAlignmentTrace()-- Overlap ACCEPTED! accept=1000.000000
>>>>>> lScore=0.026087 (112 vs 115) aScore=0.160000 (332 vs 316)
>>>>>> bScore=0.150000 (-42 vs -27). (CONTIGF)
>>>>>> GetAlignmentTrace()-- Overlap found between 1076523 (U) and 25763657
>>>>>> (R) expected hangs: a=316 b=-27 erate=0.060000 aligner=Local_Overlap
>>>>>> GetAlignmentTrace()-- Overlap ACCEPTED! accept=1000.000000
>>>>>> lScore=0.026087 (112 vs 115) aScore=0.160000 (332 vs 316)
>>>>>> bScore=0.150000 (-42 vs -27). (CONTIGF)
>>>>>> Local_Overlap_AS_forCNS found overlap between 1076523 (U) and 25763657
>>>>>> (R) ahang: 332, bhang: -42 (expected hang was 316)
>>>>>> Alignment params: 298 334 200 200 0 0.12 1e-06 30 1
>>>>>> -- e/l = 6/112 = 5.36%
>>>>>> A -----+------+----> []
>>>>>> B 332 -------> 42 []
>>>>>> GetAlignmentTrace()-- Overlap ACCEPTED! accept=1000.000000
>>>>>> lScore=0.009009 (110 vs 111) aScore=0.140000 (332 vs 318)
>>>>>> bScore=0.130000 (-42 vs -29). (CONTIGF)
>>>>>> GetAlignmentTrace()-- Overlap found between 1076523 (U) and 57537697
>>>>>> (R) expected hangs: a=318 b=-29 erate=0.060000 aligner=Local_Overlap
>>>>>> GetAlignmentTrace()-- Overlap ACCEPTED! accept=1000.000000
>>>>>> lScore=0.009009 (110 vs 111) aScore=0.140000 (332 vs 318)
>>>>>> bScore=0.130000 (-42 vs -29).
>>>>>> (CONTIGF)
>>>>>> Local_Overlap_AS_forCNS found overlap between 1076523 (U) and 57537697
>>>>>> (R) ahang: 332, bhang: -42 (expected hang was 318)
>>>>>> Alignment params: 300 336 200 200 0 0.12 1e-06 30 1
>>>>>> -- e/l = 6/110 = 5.45%
>>>>>> A -----+------+----> []
>>>>>> B 332 -------> 42 []
>>>>>>
>>>>>> This is the error message:
>>>>>> at /usit/titan/u1/olekto/src/wgs-7.0/Linux-amd64/bin/runCA line 1237
>>>>>> main::caFailure('1 consensusAfterScaffolder jobs failed; remove
>>>>>> 8-consensus/co...', undef) called at
>>>>>> /usit/titan/u1/olekto/src/wgs-7.0/Linux-amd64/bin/runCA line 5142
>>>>>> main::postScaffolderConsensus() called at
>>>>>> /usit/titan/u1/olekto/src/wgs-7.0/Linux-amd64/bin/runCA line 5885
>>>>>>
>>>>>> ----------------------------------------
>>>>>> Failure message:
>>>>>>
>>>>>> 1 consensusAfterScaffolder jobs failed; remove
>>>>>> 8-consensus/consensus.sh to try again
>>>>>>
>>>>>> I've tried removing consensus.sh and running again, but get the same
>>>>>> error.
>>>>>>
>>>>>> This is the spec file:
>>>>>> utgErrorRate=0.03
>>>>>> utgErrorLimit=2.5
>>>>>> ovlErrorRate=0.06
>>>>>> cnsErrorRate=0.06
>>>>>> cgwErrorRate=0.10
>>>>>> merSize = 22
>>>>>> overlapper=ovl
>>>>>> unitigger = bogart
>>>>>> merylMemory = 128000
>>>>>> merylThreads = 16
>>>>>> merOverlapperThreads = 2
>>>>>> merOverlapperExtendConcurrency = 8
>>>>>> merOverlapperSeedConcurrency = 8
>>>>>> ovlThreads = 2
>>>>>> mbtThreads = 2
>>>>>> mbtConcurrency = 8
>>>>>> ovlConcurrency = 8
>>>>>> ovlCorrConcurrency = 16
>>>>>> ovlRefBlockSize = 32000000
>>>>>> ovlHashBits = 24
>>>>>> ovlHashBlockLength = 800000000
>>>>>> ovlStoreMemory = 128000
>>>>>> frgCorrThreads = 2
>>>>>> frgCorrConcurrency = 8
>>>>>> ovlCorrBatchSize = 1000000
>>>>>> ovlCorrConcurrency = 16
>>>>>> cnsConcurrency = 16
>>>>>> doExtendClearRanges = 0
>>>>>>
>>>>>> I don't need to have that unitig (1076523 (U)) in my finished
>>>>>> assembly, so it's possible to just remove it as long as I get a
>>>>>> finished assembly. I've also tried to just create the .success file,
>>>>>> but then terminator fails.
>>>>>>
>>>>>> Does anyone have any ideas of what I might do differently? Can I just
>>>>>> remove that unitig and proceed? How do I do that?
>>>>>>
>>>>>> Sincerely,
>>>>>> Ole Kristian Tørresen
>>>>>> PhD student
>>>>>> University of Oslo
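
The status-flag step described in this thread (take 'data.contig_status' from the layout dumped from the second-to-last version, and put it into the layout dumped from the version that still has a consensus sequence) could be scripted roughly as follows. This is only a sketch under assumptions, not part of the assembler: the function name copy_contig_status and all file names are hypothetical, and it assumes each dumped layout contains a single 'data.contig_status <value>' line, as in the dump quoted earlier in the thread.

```shell
#!/bin/sh
# Sketch only. Assumes both layouts were already dumped with
# 'tigStore ... -d layout > file' as shown in the thread.
set -eu

# copy_contig_status FROM TO   (hypothetical helper)
# Reads the value of 'data.contig_status' in FROM and writes it into the
# same field of TO, leaving every other line of TO unchanged.
copy_contig_status() {
    src=$1
    dst=$2

    # Take the second field of the first 'data.contig_status' line.
    status=$(awk '$1 == "data.contig_status" { print $2; exit }' "$src")
    if [ -z "$status" ]; then
        echo "no data.contig_status line in $src" >&2
        return 1
    fi

    # Rewrite only the matching line; assumes a single space separates
    # the field name from its value.
    awk -v s="$status" \
        '$1 == "data.contig_status" { print $1, s; next } { print }' \
        "$dst" > "$dst.tmp"
    mv "$dst.tmp" "$dst"
}
```

With the example numbers used above, FROM would be the layout dumped from v015 and TO the layout dumped from v014; the edited TO file is then what gets loaded with the '-R' command shown in the thread.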