Re: [wgs-assembler-users] runCA stopped while updating overlapStore - how to resume???

Hi Brian,

Thanks! The overlaps are being computed now, and the CVS version of CA has
compiled successfully. I will try runCA-overlapStoreBuild.pl once the
overlapper has finished. One question: I understand that memory usage is
regulated by the -jobs j parameter, and that a higher value for j means
less memory for each job. How can I specify the number of CPUs to be
used in the parallel steps?

Thanks for your help! I appreciate it!

cheers,
Christoph

On 07/10/2012 10:18 PM, Walenz, Brian wrote:
> Quick guess is that runCA is finding the old ovlStore and assuming it is
> complete, then continuing on to frgcorr.  runCA tests for the existence of
> name.ovlStore to determine if overlaps are finished; it doesn't check that
> the store is valid.  So, delete *ovlStore* too.
>
> Your latest build (from scratch) is suffering from a long standing
> dependency issue.  It needs kmer checked out and 'make install'ed.
>
> make[1]: *** No rule to make target `sweatShop.H', needed by
> `classifyMates.o'.  Stop.
> make[1]: *** Waiting for unfinished jobs....
> make: *** [objs] Error 1
>
> Once kmer is installed, wipe (again) the Linux-amd64 and rebuild.
>
> The kmer included in CA7 is too old for the CVS version of CA, so you'll
> need to grab it from subversion.
>
> http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Check_out_and_Compile
>
> b
>
>
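Brian's advice above is to delete everything matching *ovlStore*, because runCA only checks that name.ovlStore exists and not that it is valid. Before deleting anything it may help to see what would match. The following is a minimal Python sketch, not part of CA; the function name find_store_leftovers is hypothetical, and it only lists paths, it removes nothing.

```python
from pathlib import Path


def find_store_leftovers(work_dir, pattern="*ovlStore*"):
    """Return the sorted paths directly inside work_dir whose names match pattern.

    Hypothetical helper: it only reports candidates for manual review
    and never deletes anything.
    """
    return sorted(Path(work_dir).glob(pattern))


# Example use (the path is a placeholder for your own assembly directory):
#   for path in find_store_leftovers("/path/to/assembly"):
#       print(path)
```

Anything it prints (the store itself, a .BUILDING directory, .list or .err files) is a candidate for removal, assuming you really do intend to rebuild the store from the overlapper output.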
> On 7/10/12 4:00 PM, "Christoph Hahn" <chr...@gm...> wrote:
>
>> Hi,
>>
>> I actually tried to just rerun the overlapper. I moved the 1-overlapper
>> and the 3-overlapcorrection directories and just ran runCA and it
>> immediately starts with doing frgcorr. Do you mean recompute from the
>> very start? Is there a way to at least avoid recomputing the initial
>> overlaps (it took some 10000 CPU hours)?
>>
>> Tried to compile it again - not successful. Ran make in the src
>> directory (output in makelog) and also in the AS_RUN directory (output
>> AS_RUN-makelog).
>>
>> Thanks,
>> Christoph
>>
>>
>> On 07/10/2012 09:04 PM, Walenz, Brian wrote:
>>> Odd, the *gz should only be deleted after the store is successfully built.
>>> runCA might have been confused by the attempt to rerun.  The easiest will be
>>> to recompute.  :-(
>>>
>>> I've never seen the 'libCA.a' error before.  That particular program is the
>>> first to get built.  Looks like libCA.a wasn't created.  My fix for most
>>> strange compile errors is to remove the entire Linux-amd64 directory and
>>> recompile.  If that fails, send along the complete output of make and I'll
>>> take a look.
>>>
>>> b
>>>
>>>
>>>
>>>
>>> On 7/10/12 2:15 PM, "Christoph Hahn" <chr...@gm...> wrote:
>>>
>>>> Hi Brian,
>>>>
>>>> Thanks for your reply!
>>>>
>>>> I would be happy to try the new parallel overlap store build, but I
>>>> think I need the *.ovb.gz outputs for that and unfortunately I don't have
>>>> them any more. It looks like they were deleted after the ovlStore was
>>>> built. So I guess I'll need to run the overlapper again first. Am I
>>>> understanding that correctly?
>>>>
>>>> I have downloaded the cvs and tried to make, but I get:
>>>> *** No rule to make target `libCA.a', needed by `fragmentDepth'. Stop.
>>>>
>>>> I really appreciate your help!
>>>>
>>>> cheers,
>>>> Christoph
>>>>
>>>>
>>>> On 07/10/2012 05:09 PM, Walenz, Brian wrote:
>>>>> Hi, Christoph-
>>>>>
>>>>> The original overlap store build is difficult to resume.  I think it can be
>>>>> done, but it will take code changes that are probably specific to the case
>>>>> you have.  Only if you do not have the *ovb.gz outputs from overlapper will
>>>>> I suggest this.
>>>>>
>>>>> Option 1 is then to restart.
>>>>>
>>>>> Option 2 is to use a new 'data-parallel' overlap store build
>>>>> (AS_RUN/runCA-overlapStoreBuild.pl).  It runs as a series of three grid
>>>>> jobs.  The first job is parallel, and transfers the overlapper output into
>>>>> buckets for sorting.  The second job, also parallel, sorts each bucket.
>>>>> The final job, sequential, builds an index for the store.  Since this
>>>>> compute is just a collection of jobs, it can be restarted/resumed/fixed
>>>>> easily.
>>>>>
>>>>> Its performance can be great -- at JCVI we've seen builds that we estimated
>>>>> would take 2 days using the original sequential build, finish in a few (4?)
>>>>> hours with the data parallel version.  But on our development cluster, it
>>>>> is slower than the sequential version.  It depends on the disk throughput.
>>>>> Our dev cluster is powered off of a 6-disk ZFS, while the production side
>>>>> has a big Isilon.
>>>>>
>>>>> It is only in CVS.  I just added command line help and a bit of
>>>>> documentation, so do an update first.
>>>>>
>>>>> Happy to provide help if you want to try it out.  More than happy to accept
>>>>> better documentation.
>>>>>
>>>>> b
>>>>>
>>>>>
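The three stages Brian describes above (move overlapper output into buckets, sort each bucket, then build an index) can be illustrated with a toy Python sketch. This is not the code in runCA-overlapStoreBuild.pl: the record layout (a_id, b_id), the bucketing rule and all function names are assumptions made for illustration, and everything runs in one process rather than as grid jobs.

```python
from collections import defaultdict


def bucketize(overlap_files, ids_per_bucket):
    """Stage 1: route every (a_id, b_id) record to a bucket chosen by a_id.

    Each input file could be handled by a separate job, since a record's
    bucket depends only on the record itself.
    """
    buckets = defaultdict(list)
    for records in overlap_files:
        for a_id, b_id in records:
            buckets[a_id // ids_per_bucket].append((a_id, b_id))
    return buckets


def sort_buckets(buckets):
    """Stage 2: sort each bucket independently (one job per bucket)."""
    return {key: sorted(records) for key, records in buckets.items()}


def build_index(sorted_buckets):
    """Stage 3: one sequential pass that concatenates the buckets in order
    and records, for each a_id, its (offset, count) in the final store."""
    store, index = [], {}
    for key in sorted(sorted_buckets):
        for a_id, b_id in sorted_buckets[key]:
            if a_id not in index:
                index[a_id] = [len(store), 0]
            index[a_id][1] += 1
            store.append((a_id, b_id))
    return store, {a_id: tuple(entry) for a_id, entry in index.items()}
```

The sketch shows why such a build is easy to resume: the first two stages are made of independent units of work, so a failed unit can be rerun on its own, and only the final indexing pass has to see everything.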
>>>>> On 7/10/12 6:47 AM, "Christoph Hahn" <chr...@gm...> wrote:
>>>>>
>>>>>> Hei Ole,
>>>>>>
>>>>>> Thanks for your reply. I had looked on the preprocessing page you are
>>>>>> referring to just recently. Sounds like a good approach you are using!
>>>>>> Will definitely consider that to make the assembly more effective in a
>>>>>> next try. Thanks for that!
>>>>>> For now, I think I am pretty much done with all the trimming and correction
>>>>>> steps (once I get this last thing sorted out). As far as I can see, the
>>>>>> next step is already building the unitigs, so I'll try to finish this
>>>>>> assembly as it is now and improve it afterwards. I am really
>>>>>> curious how a first attempt at a hybrid approach (454+Illumina) will
>>>>>> perform in comparison to the pure Illumina assemblies, which I think I
>>>>>> have pretty much optimized now (and with which I am pretty happy, btw).
>>>>>>
>>>>>> I am afraid your suggestion to set doFragmentCorrection=0 now
>>>>>> will not work. For the next step (the unitigger) I'll need an intact
>>>>>> overlap store. As it is now, I think it is useless, being only
>>>>>> half-updated. I also discovered that just rerunning the previous
>>>>>> overlapStore command (the one before the frg- and ovlcorrection) does not
>>>>>> work as I thought it would.
>>>>>> This seems to be a very unfortunate situation, and I really don't know how
>>>>>> to proceed. It would be fantastic if anyone could give me a tip on what to do!
>>>>>>
>>>>>> Thanks for your help!
>>>>>>
>>>>>> much obliged,
>>>>>> Christoph
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 09.07.2012 13:20, Ole Kristian Tørresen wrote:
>>>>>>> Hi Christoph.
>>>>>>>
>>>>>>> This is not an answer to your question, but a suggestion for a
>>>>>>> work-around. If I remember correctly, you have both Illumina and 454
>>>>>>> reads. Celera runs, as you see below, frgcorrection and overlap based
>>>>>>> trimming to correct 454 reads, and merTrim to correct Illumina reads
>>>>>>> (can also be used on 454 reads). What I've been doing lately is to
>>>>>>> run meryl on a trusted set of Illumina reads, paired-end for example; I
>>>>>>> ran it on some overlapping reads which I had merged with FLASH. Then
>>>>>>> you can use the set of trusted k-mers to correct different datasets.
>>>>>>> For example, I first ran CA to the end of OBT (overlap based trimming)
>>>>>>> for my 454 reads, and then output the result as fastq files. I used
>>>>>>> the trusted k-mer set to correct these 454 reads too. If you do this
>>>>>>> for all your reads, using either merTrim or merTrim/OBT, and
>>>>>>> deduplicate all the datasets too, then you'll end up with reads
>>>>>>> that you can use in assemblies where you skip relatively expensive
>>>>>>> steps such as frgcorrection.
>>>>>>>
>>>>>>> I don't think frgcorrection is that useful for the type of data you're
>>>>>>> using anyway.
>>>>>>>
>>>>>>> If you have a set of corrected reads, you can use these settings for CA:
>>>>>>> doOBT=0
>>>>>>> doFragmentCorrection=0
>>>>>>>
>>>>>>> When I think of it, you might use doFragmentCorrection=0 on this
>>>>>>> assembly now. You might have to clean up your directory tree, like
>>>>>>> removing the 3-overlapcorrection directory and maybe some other steps
>>>>>>> too. Apply with caution.
>>>>>>>
>>>>>>> Most of the stuff I've mentioned I've taken from here:
>>>>>>> http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Preprocessing
>>>>>>> and discussion with Brian.
>>>>>>>
>>>>>>> Ole
>>>>>>>
>>>>>>> On 9 July 2012 12:47, Christoph Hahn<chr...@gm...>  wrote:
>>>>>>>> Dear users and developers,
>>>>>>>>
>>>>>>>> I have the following problem: In my assembly process I have just completed
>>>>>>>> the fragment- and overlap error correction. Unfortunately runCA stopped in
>>>>>>>> the subsequent updating of the overlapStore, because of an incorrectly set
>>>>>>>> time limit.
>>>>>>>> If I am trying to resume the assembly now, I get the following error:
>>>>>>>> ----------------------------------------START Mon Jul  9 11:05:53 2012
>>>>>>>> /xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/overlapStore -u /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore /projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/salaris.erates > /projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/overlapStore-update-erates.err 2>&1
>>>>>>>> ----------------------------------------END Mon Jul  9 11:05:54 2012 (1 seconds)
>>>>>>>> ERROR: Failed with signal HUP (1)
>>>>>>>> ================================================================================
>>>>>>>>
>>>>>>>> runCA failed.
>>>>>>>>
>>>>>>>> ----------------------------------------
>>>>>>>> Stack trace:
>>>>>>>>
>>>>>>>>      at /usit/titan/u1/chrishah/programmes/wgs-7.0/Linux-amd64/bin/./runCA line 1237
>>>>>>>>             main::caFailure('failed to apply the overlap corrections', '/projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/o...') called at /usit/titan/u1/chrishah/programmes/wgs-7.0/Linux-amd64/bin/./runCA line 4077
>>>>>>>>             main::overlapCorrection() called at /usit/titan/u1/chrishah/programmes/wgs-7.0/Linux-amd64/bin/./runCA line 5880
>>>>>>>>
>>>>>>>> ----------------------------------------
>>>>>>>> Last few lines of the relevant log file
>>>>>>>> (/projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/overlapStore-update-erates.err):
>>>>>>>>
>>>>>>>> AS_OVS_openBinaryOverlapFile()-- Failed to open '/projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore/0001~' for reading: No such file or directory
>>>>>>>>
>>>>>>>> ----------------------------------------
>>>>>>>> Failure message:
>>>>>>>>
>>>>>>>> failed to apply the overlap corrections
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> So it can obviously not find the file /salaris.ovlStore/0001~. The reason
>>>>>>>> is, from what I can see, that the /salaris.ovlStore/0001~ file has already
>>>>>>>> been updated to /salaris.ovlStore/0001 before it stopped. In fact it seems
>>>>>>>> to have stopped after updating /salaris.ovlStore/0249 (of 430). Is there a
>>>>>>>> way to tell runCA to continue from /salaris.ovlStore/0250~, instead of from
>>>>>>>> 0001~, which is obviously not there any more?
>>>>>>>> Another solution I was thinking of is to run the previous overlapStore
>>>>>>>> command again manually (the one that was done before starting the frgcorr
>>>>>>>> and ovlcorr:
>>>>>>>> /xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/overlapStore -c /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore.BUILDING -g /projects/nn9201k/Celera/work2/salaris1/salaris.gkpStore -i 0 -M 14000 -L /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore.list > /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore.err 2>&1)
>>>>>>>> to restore the status from before the frgcorr and ovlcorr steps, before
>>>>>>>> resuming runCA. This should restore the 0001~ file, right? The most
>>>>>>>> important thing is that I want to avoid rerunning the frgcorr and ovlcorr
>>>>>>>> steps, because these steps were really resource intensive.
>>>>>>>>
>>>>>>>> I would really appreciate any comments or suggestions to my problem!
>>>>>>>> Thanks in advance for your help!
>>>>>>>>
>>>>>>>> much obliged,
>>>>>>>> Christoph
>>>>>>>>
>>>>>>>> University of Oslo
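Christoph's diagnosis above rests on which numbered files in salaris.ovlStore still carry a trailing "~" (0001 already updated, 0250~ apparently not yet). A minimal Python sketch for taking that inventory follows. The reading of "~" as "not yet updated" is his inference from the thread, not documented overlapStore behaviour, and the function name classify_store_files is hypothetical.

```python
import re
from pathlib import Path

# Numbered store files as they appear in the thread, e.g. "0001" or "0250~".
_NUMBERED = re.compile(r"^(\d+)(~?)$")


def classify_store_files(store_dir):
    """Return (updated, pending) lists of file numbers found in store_dir.

    Assumption from the thread: a trailing "~" marks a file that the
    interrupted update had not yet rewritten. Other files are ignored.
    """
    updated, pending = [], []
    for entry in Path(store_dir).iterdir():
        match = _NUMBERED.match(entry.name)
        if match is None:
            continue
        target = pending if match.group(2) else updated
        target.append(int(match.group(1)))
    return sorted(updated), sorted(pending)
```

If the two lists split cleanly (for example 1 to 249 updated and 250 to 430 pending), that would support the picture of an update interrupted partway through. It does not by itself make the store usable again.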
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> wgs-assembler-users mailing list
>>>>>>>> wgs...@li...
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
>>>>>>>>