From: Christoph H. <chr...@gm...> - 2012-07-10 22:46:20
Hi Brian,

Thanks! Overlaps are being computed now, and the CVS version of CA has compiled successfully. I will try runCA-overlapStoreBuild.pl once the overlapper is finished.

One question there: I understand that memory usage is regulated by the -jobs j parameter; a higher value of j means less memory for every job. How can I specify the number of CPUs to be used in the parallel steps?

Thanks for your help! I appreciate it!

cheers,
Christoph
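To make the -jobs trade-off concrete (the numbers below are purely illustrative, not from this assembly): the overlap data is partitioned across the sort jobs, so memory per job scales inversely with j, while the number of CPUs in use at once is presumably bounded by how many of those grid jobs the cluster runs concurrently.

    overlap data  ~ 400 GB             (illustrative total)
    -jobs 100  ->  ~400 GB / 100 = ~4 GB  of overlaps per sort job
    -jobs  25  ->  ~400 GB / 25  = ~16 GB of overlaps per sort job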
From: Walenz, B. <bw...@jc...> - 2012-07-10 20:18:44
Quick guess is that runCA is finding the old ovlStore and assuming it is complete, then continuing on to frgcorr. runCA tests for the existence of name.ovlStore to determine whether overlaps are finished; it doesn't check that the store is valid. So, delete *ovlStore* too.

Your latest build (from scratch) is suffering from a long-standing dependency issue: it needs kmer checked out and 'make install'ed.

    make[1]: *** No rule to make target `sweatShop.H', needed by `classifyMates.o'.  Stop.
    make[1]: *** Waiting for unfinished jobs....
    make: *** [objs] Error 1

Once kmer is installed, wipe (again) the Linux-amd64 directory and rebuild. The kmer included in CA7 is too old for the CVS version of CA, so you'll need to grab it from subversion:

http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Check_out_and_Compile

b
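A sketch of that recovery sequence, under the assumption that plain make targets suffice; the kmer checkout URL and the exact build steps are placeholders here, so take the authoritative commands from the Check_out_and_Compile wiki page above.

    # In the assembly directory: remove the stale store so runCA recomputes overlaps.
    rm -rf *ovlStore*

    # Build and install kmer first (repository URL is a placeholder).
    svn checkout <kmer-repo-url> kmer
    cd kmer
    make install                      # CA's build expects an installed kmer
    cd ..

    # Then wipe the stale CA build tree and recompile the CVS version.
    rm -rf wgs/Linux-amd64
    cd wgs/src
    make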
From: Christoph H. <chr...@gm...> - 2012-07-10 20:00:55
Hi,

I actually tried to just rerun the overlapper: I moved the 1-overlapper and 3-overlapcorrection directories aside and ran runCA again, and it immediately starts doing frgcorr. Do you mean recompute from the very start? Is there a way to at least avoid recomputing the initial overlaps (they took some 10,000 CPU hours)?

Tried to compile again - not successful. Ran make in the src directory (output in makelog) and also in the AS_RUN directory (output in AS_RUN-makelog).

Thanks,
Christoph
From: Walenz, B. <bw...@jc...> - 2012-07-10 19:04:48
Odd, the *gz should only be deleted after the store is successfully built. runCA might have been confused by the attempt to rerun. The easiest will be to recompute. :-(

I've never seen the 'libCA.a' error before. That particular program is the first to get built; it looks like libCA.a wasn't created. My fix for most strange compile errors is to remove the entire Linux-amd64 directory and recompile. If that fails, send along the complete output of make and I'll take a look.

b
From: Christoph H. <chr...@gm...> - 2012-07-10 18:15:30
Hi Brian,

Thanks for your reply!

I would be happy to try the new parallel overlap store build, but I think I need the *.ovb.gz outputs for that, and unfortunately I don't have them any more. It looks like they were deleted after the ovlStore was built, so I guess I'll need to run the overlapper again first. Am I understanding that correctly?

I have downloaded the CVS version and tried to make, but I get:

    *** No rule to make target `libCA.a', needed by `fragmentDepth'.  Stop.

I really appreciate your help!

cheers,
Christoph
From: Walenz, B. <bw...@jc...> - 2012-07-10 15:09:21
|
Hi, Christoph-

The original overlap store build is difficult to resume. I think it can be done, but it will take code changes that are probably specific to the case you have; I would suggest this only if you do not have the *.ovb.gz outputs from the overlapper.

Option 1 is then to restart. Option 2 is to use a new 'data-parallel' overlap store build (AS_RUN/runCA-overlapStoreBuild.pl). It runs as a series of three grid jobs. The first job is parallel, and transfers the overlapper output into buckets for sorting. The second job, also parallel, sorts each bucket. The final job, sequential, builds an index for the store. Since this compute is just a collection of jobs, it can be restarted/resumed/fixed easily. (The job structure is sketched at the end of this message.)

Its performance can be great -- at JCVI we've seen builds that we estimated would take 2 days with the original sequential build finish in a few (4?) hours with the data-parallel version. But on our development cluster it is slower than the sequential version; it depends on the disk throughput. Our dev cluster is powered by a 6-disk ZFS, while the production side has a big Isilon.

It is only in CVS. I just added command line help and a bit of documentation, so do an update first. Happy to provide help if you want to try it out. More than happy to accept better documentation.
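If it helps to picture the job structure, it is the usual SGE dependency chain -- roughly the sketch below, where the script names, job names, and array sizes are placeholders, not the script's actual options:

  # illustrative only: bucket, sort, index as dependent SGE jobs
  qsub -N ovsBucket -t 1-430 bucket.sh                      # parallel: distribute overlapper output into buckets
  qsub -N ovsSort   -t 1-430 -hold_jid ovsBucket sort.sh    # parallel: sort each bucket
  qsub -N ovsIndex  -hold_jid ovsSort index.sh              # sequential: build the store index

b
|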
From: Christoph H. <chr...@gm...> - 2012-07-10 10:48:26
|
Hi Ole,

Thanks for your reply. I had looked at the preprocessing page you are referring to just recently. Sounds like a good approach you are using! I will definitely consider it to make the assembly more effective in a next try. Thanks for that!

For now, I think I am pretty much past all the trimming and correction steps (once I get this last thing sorted out). As far as I can see, the next step is already building the unitigs, so I'll try to finish this assembly as it is now and improve it afterwards. I am really curious how a first attempt at a hybrid approach (454+Illumina) will perform in comparison to the pure Illumina assemblies, which I have pretty much optimized by now (and with which I am pretty happy, btw).

I am afraid your suggestion to set doFragmentCorrection=0 directly now will not work. For the next step (the unitigger) I'll need an intact overlap store, and as it is now I think it is useless, being only half-updated. I also discovered that just rerunning the previous overlapStore command (the one before the frg- and ovlcorrection) does not work as I thought it would.

It seems to be a very unfortunate situation -- I really don't know how to proceed. It would be fantastic if anyone could give me a tip on what to do!

Thanks for your help!

much obliged,
Christoph
|
From: Ole K. T. <o.k...@bi...> - 2012-07-09 11:20:54
|
Hi Christoph.

This is not an answer to your question, but a suggestion for a work-around. If I remember correctly, you have both Illumina and 454 reads. Celera runs, as you see below, frgcorrection and overlap based trimming to correct 454 reads, and merTrim to correct Illumina reads (it can also be used on 454 reads). What I've been doing lately is to run meryl on a trusted set of Illumina reads, paired-end for example; I ran it on some overlapping reads which I had merged with FLASH. Then you can use the set of trusted k-mers to correct different datasets. For example, I first ran CA to the end of OBT (overlap based trimming) for my 454 reads, and then output the result as fastq files; I used the trusted k-mer set to correct these 454 reads too. If you do this for all your reads, using either merTrim or merTrim/OBT, and do deduplication on all the datasets too, then you'll end up with reads that you can use in assemblies where you skip relatively expensive steps such as frgcorrection. (The k-mer counting step is sketched at the end of this message.)

I don't think frgcorrection is that useful for the type of data you're using anyway.

If you have a set of corrected reads, you can use these settings for CA:

doOBT=0
doFragmentCorrection=0

When I think of it, you might use doFragmentCorrection=0 on this assembly now. You might have to clean up your directory tree, like removing the 3-overlapcorrection directory and maybe some other steps too. Apply with caution.

Most of the stuff I've mentioned I've taken from here:
http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Preprocessing
and discussion with Brian.
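The k-mer counting itself looks roughly like this with the classic meryl from the kmer package (a sketch from memory -- check meryl's usage for your build; the file names here are made up):

  # count canonical 22-mers in the trusted (e.g. FLASH-merged) reads
  meryl -B -C -m 22 -s trusted_merged.fasta -o trusted
  # 'trusted' (trusted.mcidx/trusted.mcdat) is the trusted k-mer set;
  # hand it to merTrim when correcting the other datasets (see merTrim's
  # usage for the exact option to pass it)

Ole
|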
From: Christoph H. <chr...@gm...> - 2012-07-09 10:47:39
|
Dear users and developers,

I have the following problem: in my assembly process I have just completed the fragment and overlap error correction. Unfortunately, runCA stopped in the subsequent updating of the overlapStore because of an incorrectly set time limit. If I try to resume the assembly now, I get the following error:

----------------------------------------START Mon Jul 9 11:05:53 2012
/xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/overlapStore -u /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore /projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/salaris.erates > /projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/overlapStore-update-erates.err 2>&1
----------------------------------------END Mon Jul 9 11:05:54 2012 (1 seconds)
ERROR: Failed with signal HUP (1)
================================================================================

runCA failed.

----------------------------------------
Stack trace:

at /usit/titan/u1/chrishah/programmes/wgs-7.0/Linux-amd64/bin/./runCA line 1237
main::caFailure('failed to apply the overlap corrections', '/projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/o...') called at /usit/titan/u1/chrishah/programmes/wgs-7.0/Linux-amd64/bin/./runCA line 4077
main::overlapCorrection() called at /usit/titan/u1/chrishah/programmes/wgs-7.0/Linux-amd64/bin/./runCA line 5880

----------------------------------------
Last few lines of the relevant log file (/projects/nn9201k/Celera/work2/salaris1/3-overlapcorrection/overlapStore-update-erates.err):

AS_OVS_openBinaryOverlapFile()-- Failed to open '/projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore/0001~' for reading: No such file or directory

----------------------------------------
Failure message:

failed to apply the overlap corrections

So it obviously cannot find the file salaris.ovlStore/0001~. The reason, from what I can see, is that the salaris.ovlStore/0001~ file had already been updated to salaris.ovlStore/0001 before the run stopped. In fact, it seems to have stopped after updating salaris.ovlStore/0249 (of 430; a quick check for this is at the end of this message). Is there a way to tell runCA to continue from salaris.ovlStore/0250~ instead of from 0001~, which is obviously not there any more?

Another solution I was thinking of is to rerun the previous overlapStore command manually (the one that was run before starting the frgcorr and ovlcorr):

/xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/overlapStore -c /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore.BUILDING -g /projects/nn9201k/Celera/work2/salaris1/salaris.gkpStore -i 0 -M 14000 -L /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore.list > /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore.err 2>&1

This would restore the state from before the frgcorr and ovlcorr steps before resuming runCA, and it should restore the 0001~ file, right? The most important thing is that I want to avoid rerunning the frgcorr and ovlcorr steps, because they were really resource intensive.

I would really appreciate any comments or suggestions on my problem! Thanks in advance for your help!
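For reference, a quick way to see where the update stopped is to list which slices still have their '~' backups (the first name listed is the first slice that was not yet updated):

  ls /projects/nn9201k/Celera/work2/salaris1/salaris.ovlStore/ | grep '~' | head -n 1
  # prints 0250~ here, i.e. slices 0001-0249 were already updated

much obliged,
Christoph

University of Oslo
|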
From: Christoph H. <chr...@gm...> - 2012-07-09 09:34:50
|
Hi Arjun,

Thanks for your reply! I am in contact with the system administrator to try to avoid taking down nodes in the future. In fact, I am using a shell script in a particular format to submit jobs to the cluster, where I have to set a memory limit in advance. I think it is likely that this limit is implemented as something like the ulimit command you mentioned, and normally jobs are just killed when the memory limit is exceeded. This runCA job was somehow exceptional, and I think the sysadmins are currently looking into the reasons why the job took down the node instead of just being killed.

Thanks again for your suggestion!

cheers,
Christoph
|
From: Louis L. <lou...@ma...> - 2012-07-06 11:20:50
|
Hello,

I'm trying to assemble pacbioToCA error-corrected PacBio reads. The genome is E. coli. The correction went very well, but when I try to assemble I keep getting this error:

safeWrite()-- EXPECTED 4172, ended up at 4636
buildUnitigs: AS_UTL_fileIO.C:83: void AS_UTL_safeWrite(FILE*, const void*, const char*, size_t, size_t): Assertion `AS_UTL_ftell(file) == expectedposition' failed.

The numbers never change. This is being run on a Linux cluster (single node) which has Lustre as its underlying file system. I tried with another installation on another server, also Linux, but with NFS as the underlying file system; on the NFS server it worked well. I saw the code of safeWrite explaining problems with FreeBSD, but I'm not using that.

Any ideas what's going on? I tried copying my executables from one server to the other and it didn't change anything. I don't understand why fwrite would behave this way unless it is a buffering problem.
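For context, the check that fires is roughly this pattern (a simplified sketch of the idea, not the actual AS_UTL_fileIO.C source):

  /* write a buffer, then assert that the file position advanced to
     where the write should have left it */
  #include <assert.h>
  #include <stdio.h>
  #include <stdlib.h>

  static void safeWriteSketch(FILE *file, const void *buffer, size_t size, size_t nobj) {
      long expectedposition = ftell(file) + (long)(size * nobj);
      if (fwrite(buffer, size, nobj, file) != nobj) {
          perror("fwrite");
          exit(1);
      }
      /* this is the assertion that fails for me: the position reported
         after the write disagrees with the expected one */
      assert(ftell(file) == expectedposition);
  }

  int main(void) {
      FILE *f = tmpfile();
      const char msg[] = "hello";
      safeWriteSketch(f, msg, 1, sizeof(msg));
      fclose(f);
      return 0;
  }

Thanks
Louis
|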
From: Arjun P. <ap...@ma...> - 2012-07-05 18:43:28
|
Hi Christoph,

I know this reply is a bit late, but we have the problem of limited memory and runaway jobs on our cluster also. I haven't had the CA problem you describe, but when I can't watch a job closely I use a wrapper shell script with a ulimit command in it to engage the operating-system-based limits (bash uses ulimit; csh and derivatives, I think, use limit). A minimal version of such a wrapper is sketched at the end of this message.

I'm also not sure using SGE-based memory limits will actually help that much. We have that functionality optionally enabled on our cluster. I haven't tried it, but some other users have complained that it doesn't work well. SGE doesn't seem to do a very good job of keeping track of the actual memory used.

I haven't tried to use a wrapper script called by runCA, so I'm not completely sure how that would work. If you're running things with runCA's SGE tie-ins, you can ask the sysadmin to add a ulimit command to an SGE prolog script for a custom queue.

Something like 'ulimit -d 60000000' should keep you from taking down the node.
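The wrapper can be as simple as this (a sketch; the script name is arbitrary and the limit is in KB):

  #!/bin/bash
  # memcap.sh -- cap the data segment, then run whatever was passed in
  ulimit -d 60000000     # ~60 GB data-segment limit (value is in KB)
  exec "$@"              # replace the shell with the real command

Used as, e.g., './memcap.sh runCA ...', so the limit applies to runCA and any children it spawns, which inherit it.

Arjun
|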
From: Walenz, B. <bw...@jc...> - 2012-07-03 21:13:55
|
Sorry about the trouble with the sysadmins.

Given the mix of reads, I'd just skip the dedupe. Neither of those library types is known to have artificial duplications.

Memory usage depends on a lot of factors (the genome itself, genome size, depth of coverage, read length, number of reads, number of mated reads) and I don't have any good general advice anymore.

Is it possible to submit a job such that the scheduler will kill it if some memory limit is exceeded? That might be generally useful enough that the sysadmins would help to set it up. (I've been arguing for that here for a while, only to have other users object to the idea -- "but I don't know how big it is going to get! you can't just kill it!")
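On SGE, for example, many sites configure a memory resource for exactly this; whether it is enforced, and under what name, depends on the site's setup, but the submission would look something like the sketch below (the script name is a placeholder):

  # ask the scheduler to kill the job at 64 GB of virtual memory
  # instead of letting it take the node down (h_vmem is the usual,
  # but site-configurable, resource name)
  qsub -l h_vmem=64G run_deduplicate.sh

|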
From: Christoph H. <chr...@gm...> - 2012-07-03 20:43:39
|
Dear Brian, Thanks! My last attempt has apparently caused some serious problems on the node it was running on. So, I have to wait for the cluster admins ok before I try again. Will try to run it manually without the obtStore then and keep you posted on the result. The dataset only contains illumina PE and 454 SE. Is there a way to get an idea about the memory requirements beforehand (I have to specify that on the cluster before I start the job and the admin will not be happy if I kill the node again..)? I guess not? Thanks again for your help!!! cheers, Christoph On 07/03/2012 10:24 PM, Walenz, Brian wrote: > Good to know about the restart not working. > > You should be able to run manually without the obtStore by leaving out the > -ovs option for it. > > To find duplicate mate pairs, it needs to save up overlaps until both of the > reads in the mate have been seen. The bug in CVS was to not process mate > pairs until ALL reads were seen. I've not seen this in CA7 but the same can > happen if the mated reads are 'far away' in the input, for example, if all > of the 'left' reads are loaded before the 'right' reads. > > If all else fails, you can skip deduplication. There is little gain in > deduplicating Illumina PE and MP libraries -- PE duplicates don't really > affect scaffolding, and MP duplicates aren't detectable from overlaps. > Hopefully there aren't any 454 mates in this. > > b > > > On 7/3/12 4:02 PM, "Christoph Hahn" <chr...@gm...> wrote: > > > >> Hi Brian, >> >> Thanks for your reply! >> >> I am using CA7. I am afraid updating is not really an option at the >> moment - I am running it on a cluster and updating CVS might be >> complicated because the cluster administrators are always very busy and >> it would thus for sure take a while.. >> >> Therefore, it would be great if you could give me a tip on how to handle >> that in CA7 for now. In my latest attempt I used 64 GB RAM and it killed >> the node after some 2 hours. I ran the following: >> >> CA version 7.0 ($Id: deduplicate.C,v 1.15 2011/12/29 09:26:03 >> brianwalenz Exp $). >> >> Error Rates: >> AS_OVL_ERROR_RATE 0.060000 >> AS_CNS_ERROR_RATE 0.100000 >> AS_CGW_ERROR_RATE 0.100000 >> AS_MAX_ERROR_RATE 0.250000 >> >> Current Working Directory: >> /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim >> >> Command: >> /xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/deduplicate \ >> -gkp /projects/nn9201k/Celera/work2/salaris1/salaris.gkpStore \ >> -ovs >> /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.obtStore \ >> -ovs >> /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.dupStore \ >> -report >> /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.deduplicate.log >> \ >> -summary >> /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.deduplicate.summ >> ary >> >> Here are the first and last few lines of salaris.deduplicate.log (it has >> 384855 lines, *.deduplicate.summary and *.deduplicate.err are empty): >> >> Delete 28 DUPof 3462651 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 >> Delete 76 DUPof 10667558 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 >> Delete 210 DUPof 8142147 a 0,70 b 0,70 hang 0,0 diff 0,0 error 0.000000 >> Delete 216 DUPof 9129559 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 >> Delete 228 DUPof 7781271 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.013200 >> Delete 297 DUPof 11757250 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 >> Delete 319 DUPof 11174680 a 0,73 b 0,73 hang 0,0 diff 0,0 error 0.000000 >> . >> . >> . 
>> Delete 132295695 DUPof 211765973 a 0,76 b 0,76 hang 0,0 diff 0,0 >> error 0.000000 >> Delete 132296968 DUPof 181491499 a 0,76 b 0,76 hang 0,0 diff 0,0 >> error 0.000000 >> Delete 132297966 DUPof 159665067 a 0,76 b 0,76 hang 0,0 diff 0,0 >> error 0.000000 >> Delete 132304543 DUPof 155518568 a 0,76 b 0,76 hang 0,0 diff 0,0 >> error 0.000000 >> Delete 132307934 DUPof 134266938 a 0,76 b 0,76 hang 0,0 diff 0,0 >> error 0.000000 >> Delete 132309546 DUPof 179301753 a 0,76 b 0,76 hang 0,0 diff 0,0 >> error 0.000000 >> Delete 132313400 DUPof 153142824 a 0,76 b 0,76 hang 0,0 diff 0,0 >> error 0.000000 >> Delete 132319681 DUPof 132368976 a 0,76 b 0,76 hang 0,0 diff 0,0 >> error 0.000000 >> Delete 132323752 DUPof 165992623 a 0,76 (this is exactly how it stopped..) >> >> Can I maybe run the deduplicate command manually and only make use of >> the overlaps in the dupStore? When I tried to start CA again it >> continued with finalTrim, so I removed the *.deduplicate.log, etc. files >> before I restarted CA. >> It would be great if you could help me out! Thanks!! >> >> cheers, >> Christoph >> >> >> On 07/03/2012 06:44 PM, Walenz, Brian wrote: >>> Hi, Christoph- >>> >>> Are you using CA7 or CVS? >>> >>> This behavior was introduced to CVS on May 21, and fixed on the 29th. The >>> bug came in with an optimization to overlap loading - only overlaps >>> in the 'dupStore' are needed; the 'obtStore' can be ignored. This >>> eliminated a huge amount of I/O and overhead from the dedupe compute. >>> >>> If updating CVS doesn't fix the problem, can you send some of the logging >>> from deduplicate? >>> >>> b >>> >>> >>> On 7/3/12 6:28 AM, "Christoph Hahn" <chr...@gm...> wrote: >>> >>>> Dear developers and users, >>>> >>>> I am encountering some problems in the deduplicate step. Unfortunately, >>>> the memory usage is steadily increasing until the process dies because >>>> it exceeds the memory limit. So far, I used up to 32 GB. I could of course >>>> just further increase the available memory, but I was wondering if there >>>> is a possibility to fix and/or predict the maximum memory usage for this >>>> step (and maybe also for the next steps) beforehand. >>>> >>>> Thanks for your help! >>>> >>>> much obliged, >>>> Christoph >>>> >>>> University of Oslo, Norway >> |
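No CA-specific memory estimator comes up in this thread; one generic way to gauge a step's memory appetite before booking a whole node is to run the same binary on a small subset of the data under GNU time, which reports the peak resident set size. A minimal sketch, assuming GNU time is available as /usr/bin/time and that a scaled-down gkpStore/dupStore pair has been built first (the "subset" names are placeholders, not files from this thread):

  /usr/bin/time -v \
    deduplicate -gkp subset.gkpStore \
                -ovs subset.dupStore \
                -report subset.deduplicate.log \
                -summary subset.deduplicate.summary \
    2> time.log
  grep 'Maximum resident set size' time.log   # peak memory, reported in kilobytes

Scaling the peak up by the full/subset read ratio gives only a rough guess, since, as Brian notes above, deduplicate buffers overlaps until both reads of a mate have been seen.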
From: Walenz, B. <bw...@jc...> - 2012-07-03 20:24:53
|
Good to know about the restart not working. You should be able to run manually without the obtStore by leaving out the -ovs option for it. To find duplicate mate pairs, it needs to save up overlaps until both of the reads in the mate have been seen. The bug in CVS was to not process mate pairs until ALL reads were seen. I've not seen this in CA7 but the same can happen if the mated reads are 'far away' in the input, for example, if all of the 'left' reads are loaded before the 'right' reads. If all else fails, you can skip deduplication. There is little gain in deduplicating Illumina PE and MP libraries -- PE duplicates don't really affect scaffolding, and MP duplicates aren't detectable from overlaps. Hopefully there aren't any 454 mates in this. b On 7/3/12 4:02 PM, "Christoph Hahn" <chr...@gm...> wrote: > Hi Brian, > > Thanks for your reply! > > I am using CA7. I am afraid updating is not really an option at the > moment - I am running it on a cluster and updating CVS might be > complicated because the cluster administrators are always very busy and > it would thus for sure take a while.. > > Therefore, it would be great if you could give me a tip on how to handle > that in CA7 for now. In my latest attempt I used 64 GB RAM and it killed > the node after some 2 hours. I ran the following: > > CA version 7.0 ($Id: deduplicate.C,v 1.15 2011/12/29 09:26:03 > brianwalenz Exp $). > > Error Rates: > AS_OVL_ERROR_RATE 0.060000 > AS_CNS_ERROR_RATE 0.100000 > AS_CGW_ERROR_RATE 0.100000 > AS_MAX_ERROR_RATE 0.250000 > > Current Working Directory: > /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim > > Command: > /xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/deduplicate \ > -gkp /projects/nn9201k/Celera/work2/salaris1/salaris.gkpStore \ > -ovs > /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.obtStore \ > -ovs > /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.dupStore \ > -report > /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.deduplicate.log > \ > -summary > /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.deduplicate.summary > > Here are the first and last few lines of salaris.deduplicate.log (it has > 384855 lines; *.deduplicate.summary and *.deduplicate.err are empty): > > Delete 28 DUPof 3462651 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 > Delete 76 DUPof 10667558 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 > Delete 210 DUPof 8142147 a 0,70 b 0,70 hang 0,0 diff 0,0 error 0.000000 > Delete 216 DUPof 9129559 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 > Delete 228 DUPof 7781271 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.013200 > Delete 297 DUPof 11757250 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 > Delete 319 DUPof 11174680 a 0,73 b 0,73 hang 0,0 diff 0,0 error 0.000000 > . > . > .
> Delete 132295695 DUPof 211765973 a 0,76 b 0,76 hang 0,0 diff 0,0 > error 0.000000 > Delete 132296968 DUPof 181491499 a 0,76 b 0,76 hang 0,0 diff 0,0 > error 0.000000 > Delete 132297966 DUPof 159665067 a 0,76 b 0,76 hang 0,0 diff 0,0 > error 0.000000 > Delete 132304543 DUPof 155518568 a 0,76 b 0,76 hang 0,0 diff 0,0 > error 0.000000 > Delete 132307934 DUPof 134266938 a 0,76 b 0,76 hang 0,0 diff 0,0 > error 0.000000 > Delete 132309546 DUPof 179301753 a 0,76 b 0,76 hang 0,0 diff 0,0 > error 0.000000 > Delete 132313400 DUPof 153142824 a 0,76 b 0,76 hang 0,0 diff 0,0 > error 0.000000 > Delete 132319681 DUPof 132368976 a 0,76 b 0,76 hang 0,0 diff 0,0 > error 0.000000 > Delete 132323752 DUPof 165992623 a 0,76 (this is exactly how it stopped..) > > Can I maybe run the deduplicate command manually and only make use of > the overlaps in the dupStore? When I tried to start CA again it > continued with finalTrim, so I removed the *.deduplicate.log, etc. files > before I restarted CA. > It would be great if you could help me out! Thanks!! > > cheers, > Christoph > > > On 07/03/2012 06:44 PM, Walenz, Brian wrote: >> Hi, Christoph- >> >> Are you using CA7 or CVS? >> >> This behavior was introduced to CVS on May 21, and fixed on the 29th. The >> bug came in with an optimization to overlap loading - only overlaps >> in the 'dupStore' are needed; the 'obtStore' can be ignored. This >> eliminated a huge amount of I/O and overhead from the dedupe compute. >> >> If updating CVS doesn't fix the problem, can you send some of the logging >> from deduplicate? >> >> b >> >> >> On 7/3/12 6:28 AM, "Christoph Hahn" <chr...@gm...> wrote: >> >>> Dear developers and users, >>> >>> I am encountering some problems in the deduplicate step. Unfortunately, >>> the memory usage is steadily increasing until the process dies because >>> it exceeds the memory limit. So far, I used up to 32 GB. I could of course >>> just further increase the available memory, but I was wondering if there >>> is a possibility to fix and/or predict the maximum memory usage for this >>> step (and maybe also for the next steps) beforehand. >>> >>> Thanks for your help! >>> >>> much obliged, >>> Christoph >>> >>> University of Oslo, Norway > > |
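Concretely, Brian's suggestion amounts to rerunning the command quoted above with the obtStore line dropped, so that deduplicate reads only the dupStore. A sketch built from the paths in Christoph's log (same binary and stores as posted; only the first -ovs is removed):

  /xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/deduplicate \
    -gkp /projects/nn9201k/Celera/work2/salaris1/salaris.gkpStore \
    -ovs /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.dupStore \
    -report /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.deduplicate.log \
    -summary /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.deduplicate.summary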
From: Christoph H. <chr...@gm...> - 2012-07-03 20:02:48
|
Hi Brian, Thanks for your reply! I am using CA7. I am afraid updating is not really an option at the moment - I am running it on a cluster and updating CVS might be complicated because the cluster administrators are always very busy and it would thus for sure take a while.. Therefore, it would be great if you could give me a tip on how to handle that in CA7 for now. In my latest attempt I used 64 GB RAM and it killed the node after some 2 hours. I ran the following: CA version 7.0 ($Id: deduplicate.C,v 1.15 2011/12/29 09:26:03 brianwalenz Exp $). Error Rates: AS_OVL_ERROR_RATE 0.060000 AS_CNS_ERROR_RATE 0.100000 AS_CGW_ERROR_RATE 0.100000 AS_MAX_ERROR_RATE 0.250000 Current Working Directory: /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim Command: /xanadu/home/chrishah/programmes/wgs-7.0/Linux-amd64/bin/deduplicate \ -gkp /projects/nn9201k/Celera/work2/salaris1/salaris.gkpStore \ -ovs /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.obtStore \ -ovs /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.dupStore \ -report /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.deduplicate.log \ -summary /projects/nn9201k/Celera/work2/salaris1/0-overlaptrim/salaris.deduplicate.summary Here are the first and last few lines of salaris.deduplicate.log (it has 384855 lines; *.deduplicate.summary and *.deduplicate.err are empty): Delete 28 DUPof 3462651 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 76 DUPof 10667558 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 210 DUPof 8142147 a 0,70 b 0,70 hang 0,0 diff 0,0 error 0.000000 Delete 216 DUPof 9129559 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 228 DUPof 7781271 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.013200 Delete 297 DUPof 11757250 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 319 DUPof 11174680 a 0,73 b 0,73 hang 0,0 diff 0,0 error 0.000000 . . . Delete 132295695 DUPof 211765973 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 132296968 DUPof 181491499 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 132297966 DUPof 159665067 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 132304543 DUPof 155518568 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 132307934 DUPof 134266938 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 132309546 DUPof 179301753 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 132313400 DUPof 153142824 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 132319681 DUPof 132368976 a 0,76 b 0,76 hang 0,0 diff 0,0 error 0.000000 Delete 132323752 DUPof 165992623 a 0,76 (this is exactly how it stopped..) Can I maybe run the deduplicate command manually and only make use of the overlaps in the dupStore? When I tried to start CA again it continued with finalTrim, so I removed the *.deduplicate.log, etc. files before I restarted CA. It would be great if you could help me out! Thanks!! cheers, Christoph On 07/03/2012 06:44 PM, Walenz, Brian wrote: > Hi, Christoph- > > Are you using CA7 or CVS? > > This behavior was introduced to CVS on May 21, and fixed on the 29th. The > bug came in with an optimization to overlap loading - only overlaps > in the 'dupStore' are needed; the 'obtStore' can be ignored. This > eliminated a huge amount of I/O and overhead from the dedupe compute. > > If updating CVS doesn't fix the problem, can you send some of the logging > from deduplicate? > > b > > > On 7/3/12 6:28 AM, "Christoph Hahn" <chr...@gm...> wrote: > >> Dear developers and users, >> >> I am encountering some problems in the deduplicate step. Unfortunately, >> the memory usage is steadily increasing until the process dies because >> it exceeds the memory limit. So far, I used up to 32 GB. I could of course >> just further increase the available memory, but I was wondering if there >> is a possibility to fix and/or predict the maximum memory usage for this >> step (and maybe also for the next steps) beforehand. >> >> Thanks for your help! >> >> much obliged, >> Christoph >> >> University of Oslo, Norway |
From: Walenz, B. <bw...@jc...> - 2012-07-03 16:44:57
|
Hi, Christoph- Are you using CA7 or CVS? This behavior was introduced to CVS on May 21, and fixed on the 29th. The bug came in with an optimization to overlap loading - only overlaps in the 'dupStore' are needed; the 'obtStore' can be ignored. This eliminated a huge amount of I/O and overhead from the dedupe compute. If updating CVS doesn't fix the problem, can you send some of the logging from deduplicate? b On 7/3/12 6:28 AM, "Christoph Hahn" <chr...@gm...> wrote: > Dear developers and users, > > I am encountering some problems in the deduplicate step. Unfortunately, > the memory usage is steadily increasing until the process dies because > it exceeds the memory limit. So far, I used up to 32 GB. I could of course > just further increase the available memory, but I was wondering if there > is a possibility to fix and/or predict the maximum memory usage for this > step (and maybe also for the next steps) beforehand. > > Thanks for your help! > > much obliged, > Christoph > > University of Oslo, Norway |
From: Christoph H. <chr...@gm...> - 2012-07-03 10:28:50
|
Dear developers and users, I am encountering some problems in the deduplicate step. Unfortunately, the memory usage is steadily increasing until the process dies because it exceeds the memory limit. So far, I used up to 32 GB. I could of course just further increase the available memory, but I was wondering if there is a possibility to fix and/or predict the maximum memory usage for this step (and maybe also for the next steps) beforehand. Thanks for your help! much obliged, Christoph University of Oslo, Norway |
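A generic safeguard against the node-killing problem described elsewhere in this thread (not CA-specific, and not something suggested by the participants): cap the memory of the shell that launches the job, so the kernel kills the process cleanly instead of taking the node down. A bash sketch, with placeholder names for the work directory, prefix, and spec file:

  ulimit -v $((32 * 1024 * 1024))           # cap virtual memory at 32 GB (ulimit counts kilobytes)
  runCA -d run-dir -p salaris salaris.spec  # now fails with an allocation error if it exceeds the cap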
From: Walenz, B. <bw...@jc...> - 2012-06-14 21:55:55
|
Hi, Geoff- In the context of merTrim, no. It uses the mer database to find trusted kmers. It guesses the coverage contained in the input reads, then picks a lower limit on what 'trusted' should be. IIRC, trusted is 1/4 the input coverage -- 15x of input reads will treat any kmer with less than 4 occurrences as erroneous. In the context of overlaps, no. It uses the mer database to avoid seeding an overlap alignment on high-frequency seeds. b On 6/13/12 4:45 PM, "Waldbieser, Geoff" <Geo...@AR...> wrote: > When producing a mer database from a set of paired end Illumina reads, you > recommend a lower limit of 15X genome coverage. Is there an upper limit to the > amount of genome coverage so that one does not oversample sequencing error? > > Geoff Waldbieser |
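As a back-of-envelope illustration of the one-quarter rule Brian describes above (this is only the arithmetic, not the actual merTrim code, and the rounding is a guess):

  coverage=15
  threshold=$(( (coverage + 3) / 4 ))   # integer ceiling of coverage/4
  echo "kmers seen fewer than ${threshold} times are treated as erroneous"
  # 15x input coverage -> threshold 4, matching Brian's example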
From: Waldbieser, G. <Geo...@AR...> - 2012-06-13 20:46:29
|
When producing a mer database from a set of paired end Illumina reads, you recommend a lower limit of 15X genome coverage. Is there an upper limit to the amount of genome coverage, so that one does not oversample sequencing error? Geoff Waldbieser -----Original Message----- From: wgs...@li... Sent: Tuesday, June 12, 2012 12:39 AM Subject: wgs-assembler-users Digest, Vol 3, Issue 1 Message: 1 Date: Mon, 14 May 2012 13:32:19 -0500 From: "Mundy, Michael" <Mun...@ma...> Subject: [wgs-assembler-users] FastqToCA for paired-end reads I'm using WGS 7.0 and I have two synchronized fastq files with paired-end reads. Based on the documentation at http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=FastqToCA, I tried this command: wgs-7.0/Linux-amd64/bin/fastqToCA -libraryname SRR067601.000 -mates SRR067601.000_1_pair.fq,SRR067601.000_2_pair.fq But it returns this error: ERROR: Mated reads (-mates) must have am insert size (-insertsize). The documentation page says that the -insertsize option is optional, so I thought that was the flag to distinguish between paired-end reads and mate-pair reads. How do I generate a FRG file for paired-end reads? Mike Mundy Message: 2 Date: Mon, 14 May 2012 20:46:31 +0200 From: Ole Kristian Tørresen <o.k...@bi...> Subject: Re: [wgs-assembler-users] FastqToCA for paired-end reads On 14 May 2012 20:32, Mundy, Michael <Mun...@ma...> wrote: > I'm using WGS 7.0 and I have two synchronized fastq files with paired-end > reads. Based on the documentation at > http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=FastqToCA, > I tried this command: > > wgs-7.0/Linux-amd64/bin/fastqToCA -libraryname SRR067601.000 -mates > SRR067601.000_1_pair.fq,SRR067601.000_2_pair.fq > > But it returns this error: > > ERROR: Mated reads (-mates) must have am insert size (-insertsize). > > The documentation page says that the -insertsize option is optional, so I > thought that was the flag to distinguish between paired-end reads and > mate-pair reads. How do I generate a FRG file for paired-end reads? I guess the documentation is not up to date, so it is not optional to supply the -insertsize option. Just add -insertsize 300 30 if your reads are from a 300 bp DNA fragment and are paired end, or do something like -insertsize 5000 500 -outtie if they are mate pairs from a 5k library. Ole End of wgs-assembler-users Digest, Vol 3, Issue 1 |
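Putting Ole's answer together with Michael's original command gives an invocation along these lines (fastqToCA writes the FRG to stdout, so redirect it to a file; the 300 bp / 30 bp insert size is Ole's example figure, not a measured value):

  fastqToCA -libraryname SRR067601.000 \
            -insertsize 300 30 \
            -mates SRR067601.000_1_pair.fq,SRR067601.000_2_pair.fq \
            > SRR067601.000.frg

For a 5k mate-pair library the same command would use -insertsize 5000 500 -outtie, per Ole's note above.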
From: Ole K. T. <o.k...@bi...> - 2012-06-12 07:00:05
|
On 12 June 2012 07:38, Walenz, Brian <bw...@jc...> wrote: > Hi, Ole- > > After merging two scaffolds, we do a least squares estimate of the gap sizes > in the new scaffold. If those gap sizes imply two contigs should be merged > (via a negative gap) we try to merge them. This merge failed for reasons I > haven't looked into. Probably bad sequence alignment. > > I think you can safely disable this assert. Gap size estimation might run > through a few more iterations (with the same result) and eventually give up. > The scaffold will have (slightly?) bogus gap sizes. There are, > unfortunately, plenty of other scaffolds that fail to get gap size > estimates, so this isn't a disaster. Great! I commented it out in the source code and am rerunning now. Thank you. Ole > > b > > > On 6/11/12 1:51 PM, "Ole Kristian Tørresen" <o.k...@bi...> wrote: > >> Hi, >> I got an assertion fail while scaffolding today. Error message: >> CreateNewGraphNode()-- Contig 16085537 >> * Create a contig 16085537 in scaffold 131913 >>>>> Fixing up suspicious overlap (15370134,16085537,I) (ahg:-340 bhg:-143) to >>>>> (15370134,16085537,O) (ahg:143 bhg:340) len: 137 >> * FOEXS: SUSPICIOUS Overlap found! Looked for >> (16085537,15370134,I)[20,1044] found (15370134,16085537,O) 137 >> WARNING: InsertChunkOverlap()-- Chunk overlap already exists. >> NEW 15370134,16085537,O - min/max 131/142 0/0 erate 0.100000 flags >> 10000 overlap 137 hang 0,0 qual 0.000000 offset 0,0 >> OLD 15370134,16085537,O - min/max 20/1044 20/1044 erate 0.100000 flags >> 10001 overlap 137 hang 143,340 qual 0.000000 offset 0,0 >> WARNING: CreateChunkOverlapFromEdge()-- Chunk overlap already exists. >> Keeping old overlap. >> NEW 15370134,16085537,O - min/max 131/142 0/0 erate 0.100000 flags >> 10000 overlap 137 hang 0,0 qual 0.000000 offset 0,0 >> OLD 15370134,16085537,O - min/max 20/1044 20/1044 erate 0.100000 flags >> 10001 overlap 137 hang 143,340 qual 0.000000 offset 0,0 >> * Switched right-left, orientation went from I to O >> * CreateAContigInScaffold() failed. >> ContigContainment failed. >> cgw: LeastSquaresGaps_CGW.C:1410: RecomputeOffsetsStatus >> RecomputeOffsetsInScaffold(ScaffoldGraphT*, CDS_CID_t, int, int, int): >> Assertion `0' failed. >> >> >> This is about 30x coverage with Illumina reads; 10x combined reads >> (180 bp insert, 100 nt reads and combined with FLASH), 10x PE 100 nt >> reads 300 insert and 5k mate pair 100 nt reads, all error corrected >> with Quake. It's probably not an optimal combination, but I'm testing >> a bit and interested in the result. >> >> Using bogart and dnc, the 5k library compared against the other two. >> Other options are default. >> >> Is that assertion something that can be fixed? >> >> Thank you. >> >> Ole > |
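For the record, the change Ole describes is a one-line edit to the file named in the assertion message. A sketch of one way to do it, assuming the line at LeastSquaresGaps_CGW.C:1410 is the bare assert(0) and that the file lives under src/AS_CGW/ in the checkout (verify both in your tree), followed by a rebuild:

  # comment out the failing assert, then rebuild cgw
  sed -i '1410s|assert(0)|// assert(0)|' src/AS_CGW/LeastSquaresGaps_CGW.C
  cd src && make

Per Brian's note, the only expected side effect is (slightly) bogus gap sizes in the affected scaffolds.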
From: Walenz, B. <bw...@jc...> - 2012-06-12 05:38:42
|
Hi, Ole- After merging two scaffolds, we do a least squares estimate of the gap sizes in the new scaffold. If those gap sizes imply two contigs should be merged (via a negative gap) we try to merge them. This merge failed for reasons I haven't looked into. Probably bad sequence alignment. I think you can safely disable this assert. Gap size estimation might run through a few more iterations (with the same result) and eventually give up. The scaffold will have (slightly?) bogus gap sizes. There are, unfortunately, plenty of other scaffolds that fail to get gap size estimates, so this isn't a disaster. b On 6/11/12 1:51 PM, "Ole Kristian Tørresen" <o.k...@bi...> wrote: > Hi, > I got an assertion fail while scaffolding today. Error message: > CreateNewGraphNode()-- Contig 16085537 > * Create a contig 16085537 in scaffold 131913 >>>> Fixing up suspicious overlap (15370134,16085537,I) (ahg:-340 bhg:-143) to >>>> (15370134,16085537,O) (ahg:143 bhg:340) len: 137 > * FOEXS: SUSPICIOUS Overlap found! Looked for > (16085537,15370134,I)[20,1044] found (15370134,16085537,O) 137 > WARNING: InsertChunkOverlap()-- Chunk overlap already exists. > NEW 15370134,16085537,O - min/max 131/142 0/0 erate 0.100000 flags > 10000 overlap 137 hang 0,0 qual 0.000000 offset 0,0 > OLD 15370134,16085537,O - min/max 20/1044 20/1044 erate 0.100000 flags > 10001 overlap 137 hang 143,340 qual 0.000000 offset 0,0 > WARNING: CreateChunkOverlapFromEdge()-- Chunk overlap already exists. > Keeping old overlap. > NEW 15370134,16085537,O - min/max 131/142 0/0 erate 0.100000 flags > 10000 overlap 137 hang 0,0 qual 0.000000 offset 0,0 > OLD 15370134,16085537,O - min/max 20/1044 20/1044 erate 0.100000 flags > 10001 overlap 137 hang 143,340 qual 0.000000 offset 0,0 > * Switched right-left, orientation went from I to O > * CreateAContigInScaffold() failed. > ContigContainment failed. > cgw: LeastSquaresGaps_CGW.C:1410: RecomputeOffsetsStatus > RecomputeOffsetsInScaffold(ScaffoldGraphT*, CDS_CID_t, int, int, int): > Assertion `0' failed. > > > This is about 30x coverage with Illumina reads; 10x combined reads > (180 bp insert, 100 nt reads and combined with FLASH), 10x PE 100 nt > reads 300 insert and 5k mate pair 100 nt reads, all error corrected > with Quake. It's probably not an optimal combination, but I'm testing > a bit and interested in the result. > > Using bogart and dnc, the 5k library compared against the other two. > Other options are default. > > Is that assertion something that can be fixed? > > Thank you. > > Ole |
From: Ole K. T. <o.k...@bi...> - 2012-06-11 17:51:38
|
Hi, I got an assertion fail while scaffolding today. Error message: CreateNewGraphNode()-- Contig 16085537 * Create a contig 16085537 in scaffold 131913 >>> Fixing up suspicious overlap (15370134,16085537,I) (ahg:-340 bhg:-143) to (15370134,16085537,O) (ahg:143 bhg:340) len: 137 * FOEXS: SUSPICIOUS Overlap found! Looked for (16085537,15370134,I)[20,1044] found (15370134,16085537,O) 137 WARNING: InsertChunkOverlap()-- Chunk overlap already exists. NEW 15370134,16085537,O - min/max 131/142 0/0 erate 0.100000 flags 10000 overlap 137 hang 0,0 qual 0.000000 offset 0,0 OLD 15370134,16085537,O - min/max 20/1044 20/1044 erate 0.100000 flags 10001 overlap 137 hang 143,340 qual 0.000000 offset 0,0 WARNING: CreateChunkOverlapFromEdge()-- Chunk overlap already exists. Keeping old overlap. NEW 15370134,16085537,O - min/max 131/142 0/0 erate 0.100000 flags 10000 overlap 137 hang 0,0 qual 0.000000 offset 0,0 OLD 15370134,16085537,O - min/max 20/1044 20/1044 erate 0.100000 flags 10001 overlap 137 hang 143,340 qual 0.000000 offset 0,0 * Switched right-left, orientation went from I to O * CreateAContigInScaffold() failed. ContigContainment failed. cgw: LeastSquaresGaps_CGW.C:1410: RecomputeOffsetsStatus RecomputeOffsetsInScaffold(ScaffoldGraphT*, CDS_CID_t, int, int, int): Assertion `0' failed. This is about 30x coverage with Illumina reads; 10x combined reads (180 bp insert, 100 nt reads and combined with FLASH), 10x PE 100 nt reads 300 insert and 5k mate pair 100 nt reads, all error corrected with Quake. It's probably not an optimal combination, but I'm testing a bit and interested in the result. Using bogart and dnc, the 5k library compared against the other two. Other options are default. Is that assertion something that can be fixed? Thank you. Ole |
From: Powers, J. <jp...@ex...> - 2012-05-20 13:22:13
|
Any thoughts about the best settings in fastqToCA for Ion Torrent data? My guess is that it is closest to Illumina, but I thought I would see what you guys think. Thanks Jason |
From: Sajeet H. <sa...@gm...> - 2012-05-18 18:26:37
|
Hello Brian, Since all the CTP mea values are more than -20, can I assume that no overlaps were skipped where contigs overlapped by more than 20 bases (i.e., that all overlaps > 20 bases were successfully merged into scaffolds with no N's)? How does the scaffolder behave when the overlap between contigs is 5 bp or less? For a small fungal genome with relatively few repeats and no allelic variation (it is haploid), are there any parameters that can reduce the number of false gaps? Will increasing cgwErrorRate help? Thank you, Sajeet -----Original Message----- From: Walenz, Brian [mailto:bw...@jc...] Sent: May-16-12 7:35 PM To: Sajeet Haridas Subject: RE: small gaps of fixed length IIRC, that's the marker for "contigs should overlap, but no overlap found". Possibilities here: the overlap is shorter than we can detect, or there is crud on the end of one contig, or the error rate is too high. Are you on the mailing list? This would have been a nice discussion there. ________________________________________ From: Sajeet Haridas Sent: Wednesday, May 16, 2012 8:22 PM To: Walenz, Brian Subject: RE: small gaps of fixed length Thank you Brian. I also notice that the minimum CTP mea is -20. Is this value also capped? Sajeet From: Walenz, Brian Sent: May-16-12 1:12 PM To: Sajeet Haridas Subject: Re: small gaps of fixed length Yes - that's the lower limit on a gap between contigs. Either mate pairs indicate the contigs should overlap but no overlap could be found, or there really is a small positive gap. Only the asm file will distinguish the two. Look under 'mea' here: http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=ASM_Files#SCF_CTP b On 5/16/12 3:29 AM, "Sajeet Haridas" wrote: Hello Brian, My fungal genome assemblies (30-35 Mbp) always seem to have ~2500 small gaps, always represented by 20 N's - using bog, bogart, and various other parameters. Is the assembler trying to tell me something? Thank you, Sajeet
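For reference, the knob Sajeet asks about is a runCA spec-file option; the deduplicate logs earlier in this archive show the scaffolder's default of 0.10 (AS_CGW_ERROR_RATE). A hypothetical spec entry raising it would look like the line below -- the value 0.12 is an arbitrary illustration, not a recommendation from this thread:

  # in the runCA spec file
  cgwErrorRate = 0.12   # scaffolder alignment error tolerance; CA7 default is 0.10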