From: Christoph H. <chr...@gm...> - 2012-07-10 18:15:30
Hi Brian,

Thanks for your reply!

I would be happy to try the new parallel overlap store build, but I think I need the *.ovb.gz outputs for that, and unfortunately I don't have them any more. It looks like they were deleted after the ovlStore was built. So I guess I'll need to run the overlapper again first. Am I understanding that correctly?

I have checked out the code from CVS and tried to run make, but I get:

*** No rule to make target `libCA.a', needed by `fragmentDepth'. Stop.

I really appreciate your help!

cheers,
Christoph

On 07/10/2012 05:09 PM, Walenz, Brian wrote:
> Hi, Christoph-
>
> The original overlap store build is difficult to resume. I think it can be
> done, but it will take code changes that are probably specific to the case
> you have. Only if you do not have the *ovb.gz outputs from overlapper will
> I suggest this.
>
> Option 1 is then to restart.
>
> Option 2 is to use a new 'data-parallel' overlap store build
> (AS_RUN/runCA-overlapStoreBuild.pl). It runs as a series of three grid
> jobs. The first job is parallel, and transfers the overlapper output into
> buckets for sorting. The second job, also parallel, sorts each bucket. The
> final job, sequential, builds an index for the store. Since this compute is
> just a collection of jobs, it can be restarted/resumed/fixed easily.
>
> Its performance can be great -- at JCVI we've seen builds that we estimated
> would take 2 days using the original sequential build, finish in a few (4?)
> hours with the data parallel version. But on our development cluster, it is
> slower than the sequential version. It depends on the disk throughput. Our
> dev cluster is powered off of a 6-disk ZFS, while the production side has a
> big Isilon.
>
> It is only in CVS. I just added command line help and a bit of
> documentation, so do an update first.
>
> Happy to provide help if you want to try it out. More than happy to accept
> better documentation.
>
> b
>
>
> On 7/10/12 6:47 AM, "Christoph Hahn" <chr...@gm...> wrote:
>
>> Hi Ole,
>>
>> Thanks for your reply. I had looked at the preprocessing page you are
>> referring to just recently. Sounds like a good approach you are using!
>> Will definitely consider that to make the assembly more effective in a
>> next try. Thanks for that!
>> For now, I think I am pretty much through all the trimming and correction
>> steps (once I get this last thing sorted out). As far as I can see the
>> next step is already building the unitigs, so I'll try to finish this
>> assembly as it is now. Will try to improve it afterwards. I am really
>> curious how a first attempt at a hybrid approach (454+Illumina) will
>> perform in comparison to the pure Illumina assemblies, which I think I
>> have pretty much optimized now (and with which I am pretty happy, btw).
>>
>> I am afraid your suggestion to set doFragmentCorrection=0 directly now
>> will not work. For the next step (the unitigger) I'll need an intact
>> overlap store. As it is now, I think it is useless, being only
>> half-updated. I also discovered that just rerunning the previous
>> overlapStore command (the one before the frg- and ovlcorrection) is not
>> working as I thought it would.
>> Seems to be a very unfortunate situation - I really don't know how to
>> proceed. It would be fantastic if anyone could give me a tip what to do!
>>
>> Thanks for your help!
>>
>> much obliged,
>> Christoph
>>
>>
>> On 09.07.2012 13:20, Ole Kristian Tørresen wrote:
>>> Hi Christoph.
>>>
>>> This is not an answer to your question, but a suggestion for a
>>> work-around. If I remember correctly, you have both Illumina and 454
>>> reads. Celera runs, as you see below, frgcorrection and overlap based
>>> trimming to correct 454 reads, and merTrim to correct Illumina reads
>>> (can also be used on 454 reads).
>>> What I've been doing lately is to
>>> run meryl on a trusted set of Illumina reads, paired-end for example; I
>>> ran it on some overlapping reads which I had merged with FLASH. Then
>>> you can use the set of trusted k-mers to correct different datasets.
>>> For example, I first ran CA to the end of OBT (overlap based trimming)
>>> for my 454 reads, and then output the result as fastq files. I used
>>> the trusted k-mer set to correct these 454 reads too. If you do this
>>> for all your reads, using either merTrim or merTrim/OBT, and do
>>> deduplication on all the datasets too, then you'll end up with reads
>>> that you can use in assemblies where you skip relatively expensive
>>> steps such as frgcorrection.
>>>
>>> I don't think frgcorrection is that useful for the type of data you're
>>> using anyway.
>>>
>>> If you have a set of corrected reads, you can use these settings for CA:
>>> doOBT=0
>>> doFragmentCorrection=0
>>>
>>> When I think of it, you might use doFragmentCorrection=0 on this
>>> assembly now. You might have to clean up your directory tree, like
>>> removing the 3-overlapcorrection directory and maybe some other steps
>>> too. Apply with caution.
>>>
>>> Most of the stuff I've mentioned I've taken from here:
>>> http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Preprocessing
>>> and discussion with Brian.
>>>
>>> Ole
>>>
>>> On 9 July 2012 12:47, Christoph Hahn <chr...@gm...> wrote:
>>>> Dear users and developers,
>>>>
>>>> I have the following problem: In my assembly process I have just completed
>>>> the fragment- and overlap error correction. Unfortunately runCA stopped in
>>>> the subsequent updating of the overlapStore, because of an incorrectly set
>>>> time limit.
>>>> If I try to resume the assembly now, I get the following error:
>>>> ----------------------------------------START Mon Jul 9 11:05:53 2012
>>>> /xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/overlapStore -u
>>>> /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore
>>>> /projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/salaris.erates >
>>>> /projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/overlapStore-update-erates.err
>>>> 2>&1
>>>> ----------------------------------------END Mon Jul 9 11:05:54 2012 (1 seconds)
>>>> ERROR: Failed with signal HUP (1)
>>>> ================================================================================
>>>>
>>>> runCA failed.
>>>>
>>>> ----------------------------------------
>>>> Stack trace:
>>>>
>>>> at /usit/titan/u1/chrishah/programmes/wgs-7.0/Linux-amd64/bin/./runCA line 1237
>>>> main::caFailure('failed to apply the overlap corrections',
>>>> '/projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/o...') called
>>>> at /usit/titan/u1/chrishah/programmes/wgs-7.0/Linux-amd64/bin/./runCA line 4077
>>>> main::overlapCorrection() called at
>>>> /usit/titan/u1/chrishah/programmes/wgs-7.0/Linux-amd64/bin/./runCA line 5880
>>>>
>>>> ----------------------------------------
>>>> Last few lines of the relevant log file
>>>> (/projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/overlapStore-update-erates.err):
>>>>
>>>> AS_OVS_openBinaryOverlapFile()-- Failed to open
>>>> '/projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore/0001~' for
>>>> reading: No such file or directory
>>>>
>>>> ----------------------------------------
>>>> Failure message:
>>>>
>>>> failed to apply the overlap corrections
>>>>
>>>>
>>>>
>>>> So it obviously cannot find the file /salaris.ovlStore/0001~. The reason
>>>> is, from what I can see, that the /salaris.ovlStore/0001~ file had already
>>>> been updated to /salaris.ovlStore/0001 before it stopped.
>>>> In fact it seems
>>>> to have stopped after updating /salaris.ovlStore/0249 (of 430). Is there a
>>>> way to tell runCA to continue from /salaris.ovlStore/0250~, instead of from
>>>> 0001~, which is obviously not there any more?
>>>> Another solution I was thinking of is to run the previous overlapStore
>>>> command again manually (the one that was done before starting the frgcorr
>>>> and ovlcorr:
>>>> /xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/overlapStore -c
>>>> /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore.BUILDING -g
>>>> /projects/nn9201k/Celera/work2/salaris1/salaris.gkpStore -i 0 -M 14000 -L
>>>> /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore.list >
>>>> /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore.err 2>&1) to
>>>> restore the status from before the frgcorr and ovlcorr steps, before
>>>> resuming runCA. This should restore the 0001~ file, right? The most
>>>> important thing is that I want to avoid rerunning the frgcorr and ovlcorr
>>>> steps, because these steps were really resource intensive.
>>>>
>>>> I would really appreciate any comments or suggestions on my problem! Thanks
>>>> in advance for your help!
>>>>
>>>> much obliged,
>>>> Christoph
>>>>
>>>> University of Oslo
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Live Security Virtual Conference
>>>> Exclusive live event will cover all the ways today's security and
>>>> threat landscape has changed and how IT managers can respond. Discussions
>>>> will include endpoint security, mobile security and the latest in malware
>>>> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>>>> _______________________________________________
>>>> wgs-assembler-users mailing list
>>>> wgs...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
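
Note on the three-job layout Brian describes for the data-parallel store build (bucket, sort, index): the idea can be illustrated with a toy, in-memory sketch. This is not the code in AS_RUN/runCA-overlapStoreBuild.pl. The record layout (pairs of read IDs), the function names, and the rule of bucketing by ranges of the first read ID are all assumptions made up for illustration, and the real build works on files on disk, not on Python lists.

```python
from collections import defaultdict


def bucketize(overlaps, num_buckets, max_read_id):
    """Stage 1 (parallel in a real build): route each overlap to a bucket.

    Hypothetical rule: buckets cover contiguous ranges of the first read ID,
    so sorting each bucket on its own still gives a globally sorted result.
    """
    span = max_read_id // num_buckets + 1
    buckets = defaultdict(list)
    for a_id, b_id in overlaps:
        buckets[a_id // span].append((a_id, b_id))
    return buckets


def sort_bucket(bucket):
    """Stage 2 (parallel): sort one bucket independently of the others."""
    return sorted(bucket)


def build_index(sorted_buckets):
    """Stage 3 (sequential): concatenate the buckets in order and record
    the position of the first overlap for every read ID."""
    store, index = [], {}
    for bucket_id in sorted(sorted_buckets):
        for a_id, b_id in sorted_buckets[bucket_id]:
            index.setdefault(a_id, len(store))
            store.append((a_id, b_id))
    return store, index


def build_store(overlaps, num_buckets):
    """Run the three stages one after another on an in-memory list."""
    max_read_id = max(a_id for a_id, _ in overlaps)
    buckets = bucketize(overlaps, num_buckets, max_read_id)
    sorted_buckets = {k: sort_bucket(v) for k, v in buckets.items()}
    return build_index(sorted_buckets)


if __name__ == "__main__":
    store, index = build_store([(5, 1), (2, 9), (7, 3), (2, 4), (0, 8)], 2)
    print(store)
    print(index)
```

Each stage only reads what the previous stage produced, which is why a failed bucket or sort job can be rerun on its own without repeating the rest. That matches Brian's point that this build is easier to restart than the sequential one.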