From: Christian D. <chr...@gm...> - 2015-07-23 15:14:05
|
Hi, I ran wgs-8.2beta until the assembler went idle on one of the overlap correction steps (frgcorr.sh). Apparently one of the early fragments didn't finish: frgcorr.sh for this fragment had been running for 15h and its log file contained only the first few rows, up to "### Using 20 pthreads". The assembler stopped the correction only a few fragments after the idle one. I killed the idle process and executed the frgcorr.sh command for this fragment manually. After that, I ran runCA again with the original command and immediately got the failure message: "gatekeeper failed to add fragments". As this didn't seem to work, I renamed the folder 3-overlapcorrection and ran runCA again, leading to the same error message. I thought that starting with the step before the error correction could work and ran:

/software/wgs-8.2beta/Linux-amd64/bin/overlapStoreBuild -o /cabog/CA/genome.ovlStore.BUILDING -g /cabog/CA/genome.gkpStore -M 8192 -L /cabog/CA/genome.ovlStore.list > /cabog/CA/genome.ovlStore.err 2>&1

The log file genome.ovlStore.err contains the following:

gkStore_open()-- ERROR! Incorrect element sizes; code and store are incompatible.
gkLibrary: store 216 code 216 bytes
gkPackedFragment: store 24 code 24 bytes
gkNormalFragment: store 48 code 48 bytes
gkStrobeFragment: store 48 code 48 bytes
AS_READ_MAX_NORMAL_LEN_BITS: store 16 code 18

Is it possible to restart the assembly at this point? What steps do I have to take to "rescue" the assembly results up to this point (23 days of calculation time)?

Thanks, Chris |
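For readers hitting the same wall: the mismatch reported above (AS_READ_MAX_NORMAL_LEN_BITS: store 16, code 18) means the overlapStoreBuild binary being run was compiled with a different read-length limit than the binary that built the gkpStore, i.e. two different wgs builds are being mixed. A minimal check, as a sketch (paths are the poster's; gatekeeper -dumpinfo appears elsewhere in this archive):

```
# Dump store info with the SAME bin/ directory that created the store;
# if this also reports "Incorrect element sizes", the binaries on PATH
# are from a different build than the one that made the store.
/software/wgs-8.2beta/Linux-amd64/bin/gatekeeper -dumpinfo /cabog/CA/genome.gkpStore
```

If that is the case, pointing runCA back at the original build's bin directory (rather than rebuilding the stores) is the first thing to try.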
From: Serge K. <ser...@gm...> - 2015-06-22 23:19:23
|
Hi, This is a limitation of BLASR/sawriter, which is used for the overlapping in hybrid correction. Due to 32-bit indices they can only support 4 GB of sequence. You have to use a value <4 GB for ovlHashBlockLength (your current spec file uses 6 GB, so reducing it to 4 or 3 will work). There are recommended parameters on the wiki page for large genomes: http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Correcting_Large_.28.3E_100Mbp.29_Genomes_.28Using_high-identity_data_or_CA_8.1.29

However, I would advise against using hybrid correction with a mammalian genome, especially on a single machine. It will be very slow. Instead, I'd recommend using only the PacBio data with the low coverage settings from the wiki page: http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Low_Coverage_Assembly

It will be significantly faster than hybrid correction and we've used as little as 18X to assemble 2GB+ genomes. The assembly will not be as contiguous as it is from 50X+ but should be reasonable.

Sergey

> On Jun 22, 2015, at 2:47 PM, Stephanie D'Souza <sd...@bu...> wrote:
> Hello,
> I have been trying to run PBcR on a mammalian genome with hybrid data (~54x of Illumina HiSeq and ~24x of PacBio) on a 64 core, 512GB machine, and get the following error:
> ERROR: Overlap prep job /projectnb/keplab/sdsouza/PBcR_June2015//tempbat2newPBcR_6-4-15/1-overlapper/long_reads_part 1 FAILED.
> ERROR: Overlap prep job /projectnb/keplab/sdsouza/PBcR_June2015//tempbat2newPBcR_6-4-15/1-overlapper/long_reads_part 2 FAILED.
> ERROR: Overlap prep job /projectnb/keplab/sdsouza/PBcR_June2015//tempbat2newPBcR_6-4-15/1-overlapper/long_reads_part 3 FAILED.
> ERROR: Overlap prep job /projectnb/keplab/sdsouza/PBcR_June2015//tempbat2newPBcR_6-4-15/1-overlapper/long_reads_part 4 FAILED.
> ERROR: Overlap prep job /projectnb/keplab/sdsouza/PBcR_June2015//tempbat2newPBcR_6-4-15/1-overlapper/long_reads_part 5 FAILED.
> ERROR: Overlap prep job /projectnb/keplab/sdsouza/PBcR_June2015//tempbat2newPBcR_6-4-15/1-overlapper/long_reads_part 6 FAILED.
> ERROR: Overlap prep job /projectnb/keplab/sdsouza/PBcR_June2015//tempbat2newPBcR_6-4-15/1-overlapper/long_reads_part 7 FAILED.
> ERROR: Overlap prep job /projectnb/keplab/sdsouza/PBcR_June2015//tempbat2newPBcR_6-4-15/1-overlapper/long_reads_part 8 FAILED.
> 8 overlap partitioning jobs failed.
> In other words, 8/9 partitioning jobs fail. When I go into the 1-overlapper directory, I find .hash.err files that all say the same thing; here is a representative file:
> ERROR! Reading fasta files greater than 4Gbytes is not supported.
> Command exited with non-zero status 1
> 0.00user 0.00system 0:00.01elapsed 0%CPU (0avgtext+0avgdata 2032maxresident)k
> 0inputs+8outputs (0major+153minor)pagefaults 0swaps
> It looks like my PacBio data is being partitioned into 9 files of size 5.7GB each, except for the 9th file which is under 4 GB in size; thus only 8 jobs fail. Why should the size of the file matter? Should I change a .spec file parameter to correct this? (The .spec file I used is attached.) I'd appreciate any help on this.
> Thanks very much,
> Stephanie
> --
> Stephanie D'Souza
> MD/PhD Program, PhD Year 1
> Kepler Lab
> Department of Microbiology
> Boston University School of Medicine
> L519 - 72 E Concord St.
> Boston, MA 02118
> <newPBcR_6-4-15.spec.txt>
> ------------------------------------------------------------------------------
> wgs-assembler-users mailing list
> wgs...@li...
> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users |
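Concretely, Sergey's fix is a one-line spec-file change; the exact value below is a sketch following his suggestion of reducing the block to 3 or 4 GB:

```
# before: 6 GB hash blocks exceed the 4 GB (32-bit index) limit of
# BLASR/sawriter, so 8 of the 9 partitions fail
#ovlHashBlockLength = 6000000000

# after: keep each hash block below 4 GB of sequence
ovlHashBlockLength = 3000000000
```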
From: Stephanie D'S. <sd...@bu...> - 2015-06-22 18:47:54
|
#PBcR options
genomeSize=2100000000
maxCoverage=50
maxGap = 1500
blasr=-noRefineAlign -advanceHalf -noSplitSubreads -minMatch 10 -minPctIdentity 70 -bestn 24 -nCandidates 24
# original asm settings
merSize = 14
#merylMemory = 128000
#merylThreads = 32
# ovlStoreMemory = 8192
# grid info
ovlHashBits = 27
# ovlThreads = 32
ovlHashBlockLength = 6000000000
ovlRefBlockLength = 1000000000
ovlRefBlockSize = 0
# for mer overlapper
merCompression = 1
merOverlapperSeedBatchSize = 500000
merOverlapperExtendBatchSize = 250000
#frgCorrThreads = 2
#frgCorrBatchSize = 100000
#ovlCorrBatchSize = 100000
#akshaya's
ovlStoreMemory=256000
ovlThreads=45
ovlCorrConcurrency=20
ovlCorrBatchSize=3000000
merylMemory=256000
merylThreads=45
frgCorrThreads=48
frgCorrBatchSize=3000000
frgCorrConcurrency=20
/archive/Steph/lane1.frg
/archive/Steph/lane2.frg
/archive/Steph/lane3.frg |
From: Serge K. <ser...@gm...> - 2015-06-19 22:03:33
|
Hi, Fundamentally, it's the same approach as for non-low-coverage data. It computes overlaps with MHAP using more sensitive parameters than the default (same k-mer size but larger sketch size, which decreases your chance of missing an overlap). The PBDAGCON step creates the consensus, but the threshold to keep a base is reduced from the default of 4-fold to 2-fold. Most bases have higher coverage than this, but the lowered threshold is what is responsible for keeping many more sequences at this lower coverage level while still trimming artifacts in the data when only a single sequence supports them. The worst sequences will remain relatively high error, but the median error rate of the sequences is significantly reduced, since the majority of them have more than 2-fold coverage correcting them. They may have some regions of remaining higher error. The assembly step will trim the corrected sequences again to remove any artifacts not filtered by the initial correction.

As far as I know, HGAP performs an all-vs-longest brute force alignment (using BLASR; this is why it's computationally expensive). There is an index built on the longest sequences, but the same is true for pretty much all methods (including PBcR, which uses the min-hash as its index). PBcR will use a similar approach and try to correct the longest 40X of data using all data (i.e. all data mapped to the longest 40X), but since you have less than 40X it's all-vs-all. PBcR will use partial overlaps when doing a correction (that is, part of a read is contained in part of another read, not only fully contained ones like the default in HGAP). There is a global filter which only allows each sequence to map to its best coverage positions, where best is based on size and identity.

Serge

> On Jun 18, 2015, at 10:39 AM, mic...@ip... wrote:
> Dear Serge,
> Could you give me some information about how PBcR does the error correction (especially for low coverage)? This might sound like a bold question but I have to ask, since I could not find any detailed information about it.
> I fed PBcR with 22x PacBio data of a 1.3 Gb genome (low coverage settings) and it returned 15x of error-corrected reads. This result is amazing (even when considering the quality to be "only" 97-98 instead of 99+).
> I know that overlaps are found using your MHAP aligner and that those overlaps are fed to PBDAGCON to create consensus, which then results in high-confidence base information for the whole sequence.
> Does PBcR (like HGAP) use long sequences as initial "references" for the alignments, or is it just brute-force all-against-all alignment, piling the overlaps up to find as many overlaps (coverage) per position as possible?
> Is there a lower coverage threshold to do consensus calling at a given position of the read?
> Those questions relate more to PBDAGCON, for which I could not find much information. Maybe you could point me to some information about PBDAGCON or briefly explain its settings in PBcR.
> Thank you,
> Michel |
From: <mic...@ip...> - 2015-06-18 14:55:58
|
Dear Serge, Could you give me some information about how PBcR does the error correction (especially for low coverage)? This might sound like a bold question but I have to ask, since I could not find any detailed information about it. I fed PBcR with 22x PacBio data of a 1.3 Gb genome (low coverage settings) and it returned 15x of error-corrected reads. This result is amazing (even when considering the quality to be "only" 97-98 instead of 99+). I know that overlaps are found using your MHAP aligner and that those overlaps are fed to PBDAGCON to create consensus, which then results in high-confidence base information for the whole sequence. Does PBcR (like HGAP) use long sequences as initial "references" for the alignments, or is it just brute-force all-against-all alignment, piling the overlaps up to find as many overlaps (coverage) per position as possible? Is there a lower coverage threshold to do consensus calling at a given position of the read? Those questions relate more to PBDAGCON, for which I could not find much information. Maybe you could point me to some information about PBDAGCON or briefly explain its settings in PBcR. Thank you, Michel |
From: Serge K. <ser...@gm...> - 2015-06-18 13:42:53
|
Hi, Sorry for the delay in replying. If you are trying to re-run with the sensitive low-coverage options, you need to start from scratch by either removing the existing results or using a new library name.

Serge

> On Jun 12, 2015, at 5:08 PM, Seth Munholland <mu...@uw...> wrote:
> Hi Serge,
> I double checked BLASR/PBDAGCON, kept the old stage 9 folder changed, upgraded to wgs8.3rc2 and tried to rerun it but still got the same "Will not overwrite" error. At this point the command line output is gone so I can't grep it. Would it be saved to a log file somewhere?
> Seth Munholland, B.Sc.
> Department of Biological Sciences
> Rm. 304 Biology Building
> University of Windsor
> 401 Sunset Ave. N9B 3P4
> T: (519) 253-3000 Ext: 4755
> On Thu, May 28, 2015 at 10:39 AM, Serge Koren <ser...@gm...> wrote:
> Yes, the 25X coverage is for after correction. However, 12X should be sufficient for a reasonable assembly, certainly not a tiny fraction of your genome like you're seeing. If you run
> gatekeeper -dumpinfo PI440795_Self_Assembled/asm.gkpStore
> That should give more info on what reads made it into the assembly.
>> On May 28, 2015, at 10:37 AM, Seth Munholland <mu...@uw...> wrote:
>> After correction I ended up with about 12x coverage. I presume the ~25x coverage suggested on the PBcR page is for after correction? I'll try the low-coverage parameters and double check BLASR/PBDAGCON next, thanks.
>> On Wed, May 27, 2015 at 6:56 PM, Serge Koren <ser...@gm...> wrote:
>> That most likely means you ended up with too little coverage for assembly after correction. You can check the coverage in the PI440795_Self_Assembled*.fastq files. If you're not already, I'd suggest using the low-coverage parameters on the wiki page:
>> http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Low_Coverage_Assembly
>> I'd also double-check that you have BLASR/PBDAGCON available in your path and that it is being used for assembly (in your tempPI440795_Self_Assembled/runPartition.sh file look for the word pbdagcon).
>>> On May 26, 2015, at 12:03 PM, Seth Munholland <mu...@uw...> wrote:
>>> Hi Serge,
>>> I looked into the 9-terminator folder and found the asm.utg.fasta file, but it's only 4.5MB (~0.007x coverage) when I started with ~33x coverage. Any suggestions for where to look for the data loss?
>>> On Tue, May 26, 2015 at 11:18 AM, Serge Koren <ser...@gm...> wrote:
>>> Hi,
>>> This was a bug fixed in CA 8.3rc2 (when the assembly of the corrected data failed, the restart did not work properly). If you grep for runCA in your command line output from your run and re-run the last command (it should have the library name as the -d option). That will re-create the 9-terminator directory and corresponding files. Unless you install the missing perl package, the qc generation will still fail, but it only contains statistics on the assembly; the asm.utg.fasta file should be your complete assembly.
>>> Serge
>>>> On May 26, 2015, at 10:49 AM, Seth Munholland <mu...@uw...> wrote:
>>>> Hello Everyone,
>>>> I was running a PBcR through to assembly with nothing in my spec file except memory options, since I share the server. I got all the way to step 9 (terminator) when I got the following error:
>>>> ----------------------------------------START Tue May 26 02:18:24 2015
>>>> /usr/bin/env perl /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl -euid /lore/bill.crosby.storage/PI440795/PI440795_Self_Assembled/9-terminator/asm.asm
>>>> Can't locate Statistics/Descriptive.pm in @INC (@INC contains: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18.
>>>> BEGIN failed--compilation aborted at /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18.
>>>> ----------------------------------------END Tue May 26 02:18:24 2015 (0 seconds)
>>>> ERROR: Failed with signal INT (2)
>>>> The Cleaner has arrived. Doing 'none'.
>>>> ----------------------------------------END Tue May 26 02:18:24 2015 (1490 seconds)
>>>> A Google search tells me that I can try manually running asmOutputFasta to try and make the missing output fasta (http://sourceforge.net/p/wgs-assembler/mailman/message/33260123/). When I try it the fasta files are only ~3MB. The same link warns that the asm may be incomplete and I might have to repeat step 9 in runCA; this is where I get stuck.
>>>> I've renamed the 9-terminator folder to 9-terminator-old, but what is the command for runCA to pick up a PBcR run? I tried specifying the directory, prefix, and spec file, and after changing to the hash memory options in my spec file I get:
>>>> Failure message:
>>>> no fragment files specified, and stores not already created
>>>> While trying to rerun the PBcR command again gives:
>>>> Error: requested to output PI440795_Self_Assembled.frg but file already exists. Will not overwrite.
>>>> Seth Munholland, B.Sc.
>>>> Department of Biological Sciences
>>>> Rm. 304 Biology Building
>>>> University of Windsor
>>>> 401 Sunset Ave. N9B 3P4
>>>> T: (519) 253-3000 Ext: 4755
>>>> ------------------------------------------------------------------------------
>>>> wgs-assembler-users mailing list
>>>> wgs...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users |
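For anyone else stuck at the same point: the restart command Serge refers to is plain runCA aimed at the existing assembly directory. A sketch using the directory and prefix from this thread (the spec-file name here is hypothetical):

```
# -d: assembly directory, -p: prefix used throughout the run
# (asm.gkpStore, asm.asm, ...); re-creates 9-terminator from the
# existing stores rather than restarting the whole assembly
runCA -d PI440795_Self_Assembled -p asm -s asm.spec
```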
From: Seth M. <mu...@uw...> - 2015-06-12 21:08:38
|
Hi Serge, I double checked BLASR/PBDAGCON, kept the old stage 9 folder changed, upgraded to wgs8.3rc2 and tried to rerun it but still got the same "Will not overwrite" error. At this point the command line output is gone so I can't grep it. Would it be saved to a log file somewhere?

Seth Munholland, B.Sc.
Department of Biological Sciences
Rm. 304 Biology Building
University of Windsor
401 Sunset Ave. N9B 3P4
T: (519) 253-3000 Ext: 4755

On Thu, May 28, 2015 at 10:39 AM, Serge Koren <ser...@gm...> wrote:
> Yes, the 25X coverage is for after correction. However, 12X should be sufficient for a reasonable assembly, certainly not a tiny fraction of your genome like you're seeing. If you run
> gatekeeper -dumpinfo PI440795_Self_Assembled/asm.gkpStore
> That should give more info on what reads made it into the assembly.
> On May 28, 2015, at 10:37 AM, Seth Munholland <mu...@uw...> wrote:
> After correction I ended up with about 12x coverage. I presume the ~25x coverage suggested on the PBcR page is for after correction? I'll try the low-coverage parameters and double check BLASR/PBDAGCON next, thanks.
> On Wed, May 27, 2015 at 6:56 PM, Serge Koren <ser...@gm...> wrote:
>> That most likely means you ended up with too little coverage for assembly after correction. You can check the coverage in the PI440795_Self_Assembled*.fastq files. If you're not already, I'd suggest using the low-coverage parameters on the wiki page:
>> http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Low_Coverage_Assembly
>> I'd also double-check that you have BLASR/PBDAGCON available in your path and that it is being used for assembly (in your tempPI440795_Self_Assembled/runPartition.sh file look for the word pbdagcon).
>> On May 26, 2015, at 12:03 PM, Seth Munholland <mu...@uw...> wrote:
>> Hi Serge,
>> I looked into the 9-terminator folder and found the asm.utg.fasta file, but it's only 4.5MB (~0.007x coverage) when I started with ~33x coverage. Any suggestions for where to look for the data loss?
>> On Tue, May 26, 2015 at 11:18 AM, Serge Koren <ser...@gm...> wrote:
>>> Hi,
>>> This was a bug fixed in CA 8.3rc2 (when the assembly of the corrected data failed, the restart did not work properly). If you grep for runCA in your command line output from your run and re-run the last command (it should have the library name as the -d option). That will re-create the 9-terminator directory and corresponding files. Unless you install the missing perl package, the qc generation will still fail, but it only contains statistics on the assembly; the asm.utg.fasta file should be your complete assembly.
>>> Serge
>>> On May 26, 2015, at 10:49 AM, Seth Munholland <mu...@uw...> wrote:
>>> Hello Everyone,
>>> I was running a PBcR through to assembly with nothing in my spec file except memory options, since I share the server. I got all the way to step 9 (terminator) when I got the following error:
>>> ----------------------------------------START Tue May 26 02:18:24 2015
>>> /usr/bin/env perl /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl -euid /lore/bill.crosby.storage/PI440795/PI440795_Self_Assembled/9-terminator/asm.asm
>>> Can't locate Statistics/Descriptive.pm in @INC (@INC contains: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18.
>>> BEGIN failed--compilation aborted at /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18.
>>> ----------------------------------------END Tue May 26 02:18:24 2015 (0 seconds)
>>> ERROR: Failed with signal INT (2)
>>> The Cleaner has arrived. Doing 'none'.
>>> ----------------------------------------END Tue May 26 02:18:24 2015 (1490 seconds)
>>> A Google search tells me that I can try manually running asmOutputFasta to try and make the missing output fasta (http://sourceforge.net/p/wgs-assembler/mailman/message/33260123/). When I try it the fasta files are only ~3MB. The same link warns that the asm may be incomplete and I might have to repeat step 9 in runCA; this is where I get stuck.
>>> I've renamed the 9-terminator folder to 9-terminator-old, but what is the command for runCA to pick up a PBcR run? I tried specifying the directory, prefix, and spec file, and after changing to the hash memory options in my spec file I get:
>>> Failure message:
>>> no fragment files specified, and stores not already created
>>> While trying to rerun the PBcR command again gives:
>>> Error: requested to output PI440795_Self_Assembled.frg but file already exists. Will not overwrite. |
From: Elton V. <elt...@iq...> - 2015-06-09 18:30:13
|
Thanks very much, Serge! I'll try your suggestions and we'll get back if it is necessary.

Cheers, Elton

2015-06-09 14:47 GMT-03:00 Serge Koren <ser...@gm...>:
> See my replies below inline.
> On Jun 9, 2015, at 1:29 PM, Elton Vasconcelos <elt...@iq...> wrote:
> Hi Brian and Serge,
> I forgot to tell you last week that I am not doing PacBio reads self-correction. Instead I'm doing hybrid assembly (26 SMRTcells plus 3 paired-end Illumina libraries).
> ### Question 1: ###
> Is it still worthwhile doing that on a single multi-thread machine? Because I've seen Sergey's comment that the pipeline is quite a bit slower when considering correction with Illumina reads (http://ehc.ac/p/wgs-assembler/mailman/message/33620582/)
> A single machine will likely take several weeks to run the correction with Illumina data for your size genome; I'd advise against it. Based on your genome size and # smrtcells, I'd guess you have around 30X pacbio, so I'd suggest trying the low coverage options for self-correction instead:
> http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Low_Coverage_Assembly
> It will still be significantly faster than the Illumina based correction. On a recent 20X assembly of a 2.4GB genome, self-correction ran about 50 times faster than illumina-based correction. You can try one of the more recent tools for hybrid correction (LoRDEC, proovread) though I haven't personally run them and can't say how much time they would require.
> I am now running CA on a single multi-thread server that has 80 threads and 1T RAM. It spent about 5 days on the "overlapInCore" step and I had to kill the process because of the server owner's complaint about too many threads being consumed for a long time period.
> ### Question 2: ###
> Could you explain why "overlapInCore" is using all available threads (80) instead of only the user's request (20)?
> My spec file is attached and I ran the following command:
> $ nohup /home/elton/wgs-8.3rc2/Linux-amd64/bin/pacBioToCA -l Test01 -threads 20 -shortReads -genomeSize 380000000 -s pacbio.spec -fastq 26-SMRTcells-filtered_subreads.fastq illumina-NEW.frg &
> Your spec file specifies:
> ovlThreads=20
> ovlConcurrency=20
> This means run 20 jobs each using 20 cores, thus it is really trying to use 400 cores on your system. If you set ovlConcurrency=1 it will use 20 cores.
> Thanks a lot again for your attention and support,
> Best,
> Elton
> 2015-06-03 11:43 GMT-03:00 Serge Koren <ser...@gm...>:
>> Yes, the latest CA 8.3 release can assemble D. melanogaster in < 700 CPU hours. You can see the updated timings here:
>> http://wgs-assembler.sourceforge.net/wiki/index.php/Version_8.3_Release_Notes
>> I've routinely run D. melanogaster on a 16-core, 32GB machine in less than a day (I haven't timed it exactly) so for your genome you're looking at 3-4K cpu hours. You should be able to run it on a single 16-core 32GB machine in a couple of days, so I think it's easiest to run it on a single largish machine you have access to.
>> Sergey
>> On Jun 2, 2015, at 9:12 PM, Brian Walenz <th...@gm...> wrote:
>> That's an old page. The most recent page, linked from http://wgs-assembler.sourceforge.net/wiki/index.php?title=Main_Page, is:
>> http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR
>> (look for 'self correction')
>> I've run drosophila on my 12-core development machine in a few hours to overnight (I haven't timed it). Sergey replaced blasr with a much, much faster algorithm, and that was where most of the time was spent.
>> b
>> On Tue, Jun 2, 2015 at 9:02 PM, Elton Vasconcelos <elt...@iq...> wrote:
>>> Thanks for the hints, Brian!
>>> We'll try everything you suggested tomorrow, back in the lab. Then I'll tell you what we got. For now, I only wanna say that our main concern, instead of running runCA itself, is gonna be with the pre-assembly (correction) step, running PacBiotoCA and the PBcR pipeline that are embedded in the wgs package. Please take a look at the following strategy to assemble the Drosophila genome sequenced by PacBio technology (which presents a high error rate on the base calling, ~15%) at CBCB in Maryland: http://cbcb.umd.edu/software/PBcR/dmel.html
>>> They mentioned 621K CPU hours to correct that genome of ~122 Mb. Our organism's genome is something like 380 Mb long, three times Drosophila's.
>>> Well, just to let you know again! ;-)
>>> Talk to you later, thanks again. Good night! Elton
>>> 2015-06-02 20:19 GMT-03:00 Brian Walenz <th...@gm...>:
>>>> For the link problems - all those symbols come out of the kmer package. Check that the flags and compilers and whatnot are compatible with those in wgs-assembler.
>>>> The kmer configuration is a bit awkward. A shell script (configure.sh) dumps a config to Make.compilers, which is read by the main Makefile. 'gmake real-clean' will remove the previous build AND the Make.compilers file. 'gmake' by itself will first build a Make.compilers by calling configure.sh, then continue on with the build. The proper way to modify this is:
>>>> edit configure.sh
>>>> gmake real-clean
>>>> gmake install
>>>> repeat until it works
>>>> In configure.sh, there is a block of flags for Linux-amd64. I think it'll be easy to apply the same changes made for wgs-assembler.
>>>> After rebuilding kmer, the wgs-assembler build should need to just link -- in other words, remove just wgs-assembler/Linux-amd64/bin -- don't do 'gmake clean' here! You might need to remove the dependency directory 'dep' too.
>>>> For running - the assembler will emit an SGE submit command to run a single shell script on tens-to-hundreds-to-thousands of jobs. Each job will be 8-32gb (tunable) and 1-32 cores (nothing special here: more is faster, fewer is slower). If you can figure out how to run jobs of the form "command.sh 1", "command.sh 2", "command.sh 3", ..., "command.sh N" on BG/Q, you're most of the way to running CA. To make it output such a submit command, supply "useGrid=1 scriptOnGrid=0" to runCA.
>>>> The other half of the assembler will be either large I/O or large memory. If you've got access to a machine with 256gb and 32 cores you should be fine. I don't know what a minimum usable machine size would be.
>>>> So, the flow of the computer will be:
>>>> On the 256gb machine: runCA useGrid=1 scriptOnGrid=0 ....
>>>> Wait for it to emit a submit command
>>>> Launch those jobs on BG/Q
>>>> Wait for those to finish
>>>> Relaunch runCA on the 256gb machine. It'll check that the job outputs are complete, and continue processing, probably emitting another submit command, so repeat.
>>>> Historical note: back when runCA was first developed, we had a DEC Alpha Tru64 machine with 4 CPUs and 32gb of RAM, and a grid of a few hundred 2 CPU, 2gb, 32-bit Linux machines. The Alpha wasn't in the grid, and a different architecture anyway, so we had to run CA this way. It was a real chore. We're all spoiled with our 4 core 8gb laptops now...
>>>> b
>>>> On Tue, Jun 2, 2015 at 5:49 PM, Elton Vasconcelos <elt...@iq...> wrote:
>>>>> Thanks Brian, Serge and Huang,
>>>>> We've gone through fixing several error messages during the compilation within the src/ dir from the latest wgs-8.3rc2.tar.bz2 package. At the end of the day we stopped on "undefined reference" errors on static libraries (mainly libseq.a; please see the make_progs.log file).
>>>>> The 'gmake install' command within the kmer/ dir ran just fine.
>>>>> The following indicates the BGQ OS type:
>>>>> [erv3@bgq-fn src]$ uname -a
>>>>> Linux bgq-fn.rcsg.rice.edu 2.6.32-431.el6.ppc64 #1 SMP Sun Nov 10 22:17:43 EST 2013 ppc64 ppc64 ppc64 GNU/Linux
>>>>> We also had to edit the c_make.as file, adding some -I options (to indicate paths to libraries) on the CFLAGS fields from the "OSTYPE, Linux" section.
>>>>> Running "make objs" and "make libs" separately, everything appeared to work fine (see attached files make_objs.log and make_libs.log). The above-mentioned trouble came up on the "make progs" final command we ran (make_progs.log file).
>>>>> Well, just to let you guys know and to see whether some light can be shed.
>>>>> Thanks a lot, Cheers, Elton
>>>>> PS: I also noticed the MPI cluster system on BGQ, Brian. So, do you think it isn't worthwhile keeping the attempt to install CA on BGQ?
>>> --
>>> Elton Vasconcelos, DVM, PhD
>>> Post-doc at Verjovski-Almeida Lab
>>> Department of Biochemistry - Institute of Chemistry
>>> University of Sao Paulo, Brazil
> <pacbio.spec>

--
Elton Vasconcelos, DVM, PhD
Post-doc at Verjovski-Almeida Lab
Department of Biochemistry - Institute of Chemistry
University of Sao Paulo, Brazil |
From: Serge K. <ser...@gm...> - 2015-06-09 17:48:09
|
See my replies below inline. > On Jun 9, 2015, at 1:29 PM, Elton Vasconcelos <elt...@iq...> wrote: > > Hi Brian and Serge, > > I forgot to tell you last week that I am not doing PacBio reads self-correction. Instead I'm doing a hybrid assembly (26 SMRTcells plus 3 paired-end Illumina libraries). > ### Question 1: ### > Is it still worthwhile doing that on a single multi-threaded machine? > Because I've seen Sergey's comment that the pipeline is considerably slower when correcting with Illumina reads (http://ehc.ac/p/wgs-assembler/mailman/message/33620582/) A single machine will likely take several weeks to run the correction with Illumina data for a genome of your size; I'd advise against it. Based on your genome size and # of SMRTcells, I'd guess you have around 30X PacBio coverage, so I'd suggest trying the low-coverage options for self-correction instead: http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Low_Coverage_Assembly It will still be significantly faster than the Illumina-based correction. On a recent 20X assembly of a 2.4GB genome, self-correction ran about 50 times faster than Illumina-based correction. You can try one of the more recent tools for hybrid correction (LoRDEC, proovread), though I haven't personally run them and can't say how much time they would require. > > I am now running CA on a single multi-threaded server that has 80 threads and 1 TB of RAM. > It spent about 5 days on the "overlapInCore" step, and I had to kill the process because of the server owner's complaint about too many threads being consumed for a long time period. > > ### Question 2: ### > Could you explain why "overlapInCore" is using all available threads (80) instead of only the 20 that were requested? 
> My spec file is attached and I ran the following command: > $ nohup /home/elton/wgs-8.3rc2/Linux-amd64/bin/pacBioToCA -l Test01 -threads 20 -shortReads -genomeSize 380000000 -s pacbio.spec -fastq 26-SMRTcells-filtered_subreads.fastq illumina-NEW.frg & Your spec file specifies: ovlThreads=20 ovlConcurrency=20 This means run 20 jobs each using 20 cores, thus it is really trying to use 400 cores on your system. If you set ovlConcurrency=1 it will use 20 cores. > > Thanks a lot again for your attention and support, > Best, > Elton > > > 2015-06-03 11:43 GMT-03:00 Serge Koren <ser...@gm... <mailto:ser...@gm...>>: > Yes, the latest CA 8.3 release can assemble D. melanogaster in < 700CPU hours. You can see the updated timings here: > http://wgs-assembler.sourceforge.net/wiki/index.php/Version_8.3_Release_Notes <http://wgs-assembler.sourceforge.net/wiki/index.php/Version_8.3_Release_Notes> > > I’ve routinely run D. melanogaster on a 16-core, 32GB machine in less than a day (I haven’t timed it exactly) so for your genome you’re looking at 3-4K cpu hours. You should be able to run it on a single 16-core 32GB machine in a couple of days so I think it’s easiest to run it on a single largish machine you have access to. > > Sergey > >> On Jun 2, 2015, at 9:12 PM, Brian Walenz <th...@gm... <mailto:th...@gm...>> wrote: >> >> That's an old page. The most recent page, linked from http://wgs-assembler.sourceforge.net/wiki/index.php?title=Main_Page <http://wgs-assembler.sourceforge.net/wiki/index.php?title=Main_Page>, is: >> >> http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR <http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR> >> >> (look for 'self correction') >> >> I've run drosophilia on my 12-core development machine in a few hours to overnight (I haven't timed it). Sergey replaced blasr with a much much faster algorithm, and that was where most of the time was spent. >> >> b >> >> >> On Tue, Jun 2, 2015 at 9:02 PM, Elton Vasconcelos <elt...@iq... 
<mailto:elt...@iq...>> wrote: >> Thanks for the hints, Brian! >> >> We'll try everything you suggested tomorrow, back in the lab. >> Then I'll tell you what we got. >> For now, I only wanna say that our main concern, instead of running runCA itself, is gonna be with the pre-assembly (correction) step, running PacBiotoCA and PBcR pipeline that are embedded in the wgs package. >> Please take a look at the following strategy to assemble the Drosophila genome sequenced by PacBio technology (which presents a high error rate on the base calling, ~15%) at CBCB in Maryland : >> http://cbcb.umd.edu/software/PBcR/dmel.html <http://cbcb.umd.edu/software/PBcR/dmel.html> >> They mentioned 621K CPU hours to correct that genome of ~122 Mb. >> Our organism genome is something like 380 Mb long. Three times Drosophila's one. >> Well, just to let you know again! ;-) >> >> Talk to you later, >> Thanks again. >> Good night! >> Elton >> >> 2015-06-02 20:19 GMT-03:00 Brian Walenz <th...@gm... <mailto:th...@gm...>>: >> For the link problems - all those symbols come out of the kmer package. Check that the flags and compilers and whatnot are compatible with those in wgs-assembler. >> >> The kmer configuration is a bit awkward. A shell script (configure.sh) dumps a config to Make.compilers, which is read by the main Makefile. 'gmake real-clean' will remove the previous build AND the Make.compilers file. 'gmake' by itself will first build a Make.compilers by calling configure.sh, then continue on with the build. The proper way to modify this is: >> >> edit configure.sh >> gmake real-clean >> gmake install >> repeat until it works >> >> In configure.sh, there is a block of flags for Linux-amd64. I think it'll be easy to apply the same changes made for wgs-assembler. >> >> After rebuilding kmer, the wgs-assembler build should need to just link -- in other words, remove just wgs-assembler/Linux-amd64/bin -- don't do 'gmake clean' here! You might need to remove the dependency directory 'dep' too. 
>> >> >> For running - the assembler will emit an SGE submit command to run a single shell script on tens-to-hundreds-to-thousands of jobs. Each job will be 8-32gb (tunable) and 1-32 cores (nothing special here: more is faster, fewer is slower). If you can figure out how to run jobs of the form "command.sh 1", "command.sh 2", "command.sh 3", ..., "command.sh N" on on BG/Q you're most of the way to running CA. To make it output such a submit command, supply "useGrid=1 scriptOnGrid=0" to runCA. >> >> The other half of the assembler will be either large I/O or large memory. If you've got access to a machine with 256gb and 32 cores you should be fine. I don't know what a minimum usable machine size would be. >> >> So, the flow of the computer will be: >> >> On the 256gb machine: runCA useGrid=1 scriptOnGrid=0 .... >> Wait for it to emit a submit command >> Launch those jobs on BG/Q >> Wait for those to finish >> Relaunch runCA on the 256gb machine. It'll check that the job outputs are complete, and continue processing, probably emitting another submit command, so repeat. >> >> Historical note: back when runCA was first developed, we had a DEC Alpha Tru64 machine with 4 CPUs and 32gb of RAM, and a grid of a few hundred 2 CPU, 2gb, 32-bit Linux machines. The Alpha wasn't in the grid, and a different architecture anyway, so we had to run CA this way. It was a real chore. We're all spoiled with our 4 core 8gb laptops now... >> >> b >> >> >> >> >> >> >> On Tue, Jun 2, 2015 at 5:49 PM, Elton Vasconcelos <elt...@iq... <mailto:elt...@iq...>> wrote: >> Thanks Brian, Serge and Huang, >> >> We've gone through fixing several error messages during the compilation within the src/ dir from the latest wgs-8.3rc2.tar.bz2 package. >> At the end of the day we stopped on "undefined reference" errors on static libraries (mainly libseq.a, please see make_progs.log file). >> >> The 'gmake install' command within the kmer/ dir ran just fine. 
>> >> The following indicates BGQ OS type: >> [erv3@bgq-fn src]$ uname -a >> Linux bgq-fn.rcsg.rice.edu <http://bgq-fn.rcsg.rice.edu/> 2.6.32-431.el6.ppc64 #1 SMP Sun Nov 10 22:17:43 EST 2013 ppc64 ppc64 ppc64 GNU/Linux >> >> We also had to edit c_make.as <http://c_make.as/> file, adding some -I options (to indicate paths to libraries) on the CFLAGS fields from the "OSTYPE, Linux" section. >> >> Running "make objs" and "make libs" separately, everything appeared to work fine (see attached files make_objs.log and make_libs.log). >> The above mentioned trouble came up on the "make progs" final command we ran (make_progs.log file). >> >> Well, just to let you guys know and to see whether some light can be shed. >> >> Thanks a lot, >> Cheers, >> Elton >> >> PS: I also noticed about the MPI cluster system on BGQ, Brian. So, do you think it isn't worthwhile keeping the attempt to install CA on BGQ? >> >> >> >> >> >> -- >> Elton Vasconcelos, DVM, PhD >> Post-doc at Verjovski-Almeida Lab >> Department of Biochemistry - Institute of Chemistry >> University of Sao Paulo, Brazil >> >> > > > > > -- > Elton Vasconcelos, DVM, PhD > Post-doc at Verjovski-Almeida Lab > Department of Biochemistry - Institute of Chemistry > University of Sao Paulo, Brazil > > <pacbio.spec> |
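The jobs-times-threads arithmetic in the answer above can be sketched as a quick sanity check. The two ovl values mirror the spec lines quoted in the reply; everything else here is illustrative. If I'm reading the spec options right, the same product governs the other Threads/Concurrency pairs runCA accepts (e.g. frgCorrThreads/frgCorrConcurrency).

```shell
#!/bin/sh
# runCA starts ovlConcurrency overlap jobs at once, each using
# ovlThreads pthreads, so peak core demand is the product of the two.
ovlThreads=20
ovlConcurrency=20
peak=$((ovlThreads * ovlConcurrency))
echo "peak cores: $peak"          # 400 -- far beyond an 80-thread host

# Dropping concurrency to 1 keeps usage at the intended 20 threads.
ovlConcurrency=1
fixed=$((ovlThreads * ovlConcurrency))
echo "peak cores with ovlConcurrency=1: $fixed"
```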
From: Elton V. <elt...@iq...> - 2015-06-09 17:29:33
|
Hi Brian and Serge, I forgot to tell you last week that I am not doing PacBio reads self-correction. Instead I'm doing a hybrid assembly (26 SMRTcells plus 3 paired-end Illumina libraries). ### Question 1: ### Is it still worthwhile doing that on a single multi-threaded machine? Because I've seen Sergey's comment that the pipeline is considerably slower when correcting with Illumina reads (http://ehc.ac/p/wgs-assembler/mailman/message/33620582/) I am now running CA on a single multi-threaded server that has 80 threads and 1 TB of RAM. It spent about 5 days on the "overlapInCore" step, and I had to kill the process because of the server owner's complaint about too many threads being consumed for a long time period. ### Question 2: ### Could you explain why "overlapInCore" is using all available threads (80) instead of only the 20 that were requested? My spec file is attached and I ran the following command: $ nohup /home/elton/wgs-8.3rc2/Linux-amd64/bin/pacBioToCA -l Test01 -threads 20 -shortReads -genomeSize 380000000 -s pacbio.spec -fastq 26-SMRTcells-filtered_subreads.fastq illumina-NEW.frg & Thanks a lot again for your attention and support, Best, Elton 2015-06-03 11:43 GMT-03:00 Serge Koren <ser...@gm...>: > Yes, the latest CA 8.3 release can assemble D. melanogaster in < 700 CPU > hours. You can see the updated timings here: > > http://wgs-assembler.sourceforge.net/wiki/index.php/Version_8.3_Release_Notes > > I’ve routinely run D. melanogaster on a 16-core, 32GB machine in less than > a day (I haven’t timed it exactly) so for your genome you’re looking at > 3-4K cpu hours. You should be able to run it on a single 16-core 32GB > machine in a couple of days so I think it’s easiest to run it on a single > largish machine you have access to. > > Sergey > > On Jun 2, 2015, at 9:12 PM, Brian Walenz <th...@gm...> wrote: > > That's an old page. 
The most recent page, linked from > http://wgs-assembler.sourceforge.net/wiki/index.php?title=Main_Page, is: > > http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR > > (look for 'self correction') > > I've run drosophilia on my 12-core development machine in a few hours to > overnight (I haven't timed it). Sergey replaced blasr with a much much > faster algorithm, and that was where most of the time was spent. > > b > > > On Tue, Jun 2, 2015 at 9:02 PM, Elton Vasconcelos <elt...@iq...> > wrote: > >> Thanks for the hints, Brian! >> >> We'll try everything you suggested tomorrow, back in the lab. >> Then I'll tell you what we got. >> For now, I only wanna say that our main concern, instead of running runCA >> itself, is gonna be with the pre-assembly (correction) step, running >> PacBiotoCA and PBcR pipeline that are embedded in the wgs package. >> Please take a look at the following strategy to assemble the Drosophila >> genome sequenced by PacBio technology (which presents a high error rate on >> the base calling, ~15%) at CBCB in Maryland : >> http://cbcb.umd.edu/software/PBcR/dmel.html >> They mentioned 621K CPU hours to correct that genome of ~122 Mb. >> Our organism genome is something like 380 Mb long. Three times >> Drosophila's one. >> Well, just to let you know again! ;-) >> >> Talk to you later, >> Thanks again. >> Good night! >> Elton >> >> 2015-06-02 20:19 GMT-03:00 Brian Walenz <th...@gm...>: >> >>> For the link problems - all those symbols come out of the kmer package. >>> Check that the flags and compilers and whatnot are compatible with those in >>> wgs-assembler. >>> >>> The kmer configuration is a bit awkward. A shell script (configure.sh) >>> dumps a config to Make.compilers, which is read by the main Makefile. >>> 'gmake real-clean' will remove the previous build AND the Make.compilers >>> file. 'gmake' by itself will first build a Make.compilers by calling >>> configure.sh, then continue on with the build. 
The proper way to modify >>> this is: >>> >>> edit configure.sh >>> gmake real-clean >>> gmake install >>> repeat until it works >>> >>> In configure.sh, there is a block of flags for Linux-amd64. I think >>> it'll be easy to apply the same changes made for wgs-assembler. >>> >>> After rebuilding kmer, the wgs-assembler build should need to just link >>> -- in other words, remove just wgs-assembler/Linux-amd64/bin -- don't do >>> 'gmake clean' here! You might need to remove the dependency directory >>> 'dep' too. >>> >>> >>> For running - the assembler will emit an SGE submit command to run a >>> single shell script on tens-to-hundreds-to-thousands of jobs. Each job >>> will be 8-32gb (tunable) and 1-32 cores (nothing special here: more is >>> faster, fewer is slower). If you can figure out how to run jobs of the >>> form "command.sh 1", "command.sh 2", "command.sh 3", ..., "command.sh N" on >>> on BG/Q you're most of the way to running CA. To make it output such a >>> submit command, supply "useGrid=1 scriptOnGrid=0" to runCA. >>> >>> The other half of the assembler will be either large I/O or large >>> memory. If you've got access to a machine with 256gb and 32 cores you >>> should be fine. I don't know what a minimum usable machine size would be. >>> >>> So, the flow of the computer will be: >>> >>> On the 256gb machine: runCA useGrid=1 scriptOnGrid=0 .... >>> Wait for it to emit a submit command >>> Launch those jobs on BG/Q >>> Wait for those to finish >>> Relaunch runCA on the 256gb machine. It'll check that the job outputs >>> are complete, and continue processing, probably emitting another submit >>> command, so repeat. >>> >>> Historical note: back when runCA was first developed, we had a DEC Alpha >>> Tru64 machine with 4 CPUs and 32gb of RAM, and a grid of a few hundred 2 >>> CPU, 2gb, 32-bit Linux machines. The Alpha wasn't in the grid, and a >>> different architecture anyway, so we had to run CA this way. It was a real >>> chore. 
We're all spoiled with our 4 core 8gb laptops now... >>> >>> b >>> >>> >>> >>> >>> >>> >>> On Tue, Jun 2, 2015 at 5:49 PM, Elton Vasconcelos <elt...@iq...> >>> wrote: >>> >>>> Thanks Brian, Serge and Huang, >>>> >>>> We've gone through fixing several error messages during the compilation >>>> within the src/ dir from the latest wgs-8.3rc2.tar.bz2 package. >>>> At the end of the day we stopped on "undefined reference" errors on >>>> static libraries (mainly libseq.a, please see make_progs.log file). >>>> >>>> The 'gmake install' command within the kmer/ dir ran just fine. >>>> >>>> The following indicates BGQ OS type: >>>> [erv3@bgq-fn src]$ uname -a >>>> Linux bgq-fn.rcsg.rice.edu 2.6.32-431.el6.ppc64 #1 SMP Sun Nov 10 >>>> 22:17:43 EST 2013 ppc64 ppc64 ppc64 GNU/Linux >>>> >>>> We also had to edit c_make.as file, adding some -I options (to >>>> indicate paths to libraries) on the CFLAGS fields from the "OSTYPE, Linux" >>>> section. >>>> >>>> Running "make objs" and "make libs" separately, everything appeared to >>>> work fine (see attached files make_objs.log and make_libs.log). >>>> The above mentioned trouble came up on the "make progs" final command >>>> we ran (make_progs.log file). >>>> >>>> Well, just to let you guys know and to see whether some light can be >>>> shed. >>>> >>>> Thanks a lot, >>>> Cheers, >>>> Elton >>>> >>>> PS: I also noticed about the MPI cluster system on BGQ, Brian. So, do >>>> you think it isn't worthwhile keeping the attempt to install CA on BGQ? >>>> >>>> >>>> >> >> >> -- >> Elton Vasconcelos, DVM, PhD >> Post-doc at Verjovski-Almeida Lab >> Department of Biochemistry - Institute of Chemistry >> University of Sao Paulo, Brazil >> >> > > -- Elton Vasconcelos, DVM, PhD Post-doc at Verjovski-Almeida Lab Department of Biochemistry - Institute of Chemistry University of Sao Paulo, Brazil |
From: Serge K. <ser...@gm...> - 2015-06-03 14:43:15
|
Yes, the latest CA 8.3 release can assemble D. melanogaster in < 700CPU hours. You can see the updated timings here: http://wgs-assembler.sourceforge.net/wiki/index.php/Version_8.3_Release_Notes <http://wgs-assembler.sourceforge.net/wiki/index.php/Version_8.3_Release_Notes> I’ve routinely run D. melanogaster on a 16-core, 32GB machine in less than a day (I haven’t timed it exactly) so for your genome you’re looking at 3-4K cpu hours. You should be able to run it on a single 16-core 32GB machine in a couple of days so I think it’s easiest to run it on a single largish machine you have access to. Sergey > On Jun 2, 2015, at 9:12 PM, Brian Walenz <th...@gm...> wrote: > > That's an old page. The most recent page, linked from http://wgs-assembler.sourceforge.net/wiki/index.php?title=Main_Page <http://wgs-assembler.sourceforge.net/wiki/index.php?title=Main_Page>, is: > > http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR <http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR> > > (look for 'self correction') > > I've run drosophilia on my 12-core development machine in a few hours to overnight (I haven't timed it). Sergey replaced blasr with a much much faster algorithm, and that was where most of the time was spent. > > b > > > On Tue, Jun 2, 2015 at 9:02 PM, Elton Vasconcelos <elt...@iq... <mailto:elt...@iq...>> wrote: > Thanks for the hints, Brian! > > We'll try everything you suggested tomorrow, back in the lab. > Then I'll tell you what we got. > For now, I only wanna say that our main concern, instead of running runCA itself, is gonna be with the pre-assembly (correction) step, running PacBiotoCA and PBcR pipeline that are embedded in the wgs package. 
> Please take a look at the following strategy to assemble the Drosophila genome sequenced by PacBio technology (which presents a high error rate on the base calling, ~15%) at CBCB in Maryland : > http://cbcb.umd.edu/software/PBcR/dmel.html <http://cbcb.umd.edu/software/PBcR/dmel.html> > They mentioned 621K CPU hours to correct that genome of ~122 Mb. > Our organism genome is something like 380 Mb long. Three times Drosophila's one. > Well, just to let you know again! ;-) > > Talk to you later, > Thanks again. > Good night! > Elton > > 2015-06-02 20:19 GMT-03:00 Brian Walenz <th...@gm... <mailto:th...@gm...>>: > For the link problems - all those symbols come out of the kmer package. Check that the flags and compilers and whatnot are compatible with those in wgs-assembler. > > The kmer configuration is a bit awkward. A shell script (configure.sh) dumps a config to Make.compilers, which is read by the main Makefile. 'gmake real-clean' will remove the previous build AND the Make.compilers file. 'gmake' by itself will first build a Make.compilers by calling configure.sh, then continue on with the build. The proper way to modify this is: > > edit configure.sh > gmake real-clean > gmake install > repeat until it works > > In configure.sh, there is a block of flags for Linux-amd64. I think it'll be easy to apply the same changes made for wgs-assembler. > > After rebuilding kmer, the wgs-assembler build should need to just link -- in other words, remove just wgs-assembler/Linux-amd64/bin -- don't do 'gmake clean' here! You might need to remove the dependency directory 'dep' too. > > > For running - the assembler will emit an SGE submit command to run a single shell script on tens-to-hundreds-to-thousands of jobs. Each job will be 8-32gb (tunable) and 1-32 cores (nothing special here: more is faster, fewer is slower). 
If you can figure out how to run jobs of the form "command.sh 1", "command.sh 2", "command.sh 3", ..., "command.sh N" on on BG/Q you're most of the way to running CA. To make it output such a submit command, supply "useGrid=1 scriptOnGrid=0" to runCA. > > The other half of the assembler will be either large I/O or large memory. If you've got access to a machine with 256gb and 32 cores you should be fine. I don't know what a minimum usable machine size would be. > > So, the flow of the computer will be: > > On the 256gb machine: runCA useGrid=1 scriptOnGrid=0 .... > Wait for it to emit a submit command > Launch those jobs on BG/Q > Wait for those to finish > Relaunch runCA on the 256gb machine. It'll check that the job outputs are complete, and continue processing, probably emitting another submit command, so repeat. > > Historical note: back when runCA was first developed, we had a DEC Alpha Tru64 machine with 4 CPUs and 32gb of RAM, and a grid of a few hundred 2 CPU, 2gb, 32-bit Linux machines. The Alpha wasn't in the grid, and a different architecture anyway, so we had to run CA this way. It was a real chore. We're all spoiled with our 4 core 8gb laptops now... > > b > > > > > > > On Tue, Jun 2, 2015 at 5:49 PM, Elton Vasconcelos <elt...@iq... <mailto:elt...@iq...>> wrote: > Thanks Brian, Serge and Huang, > > We've gone through fixing several error messages during the compilation within the src/ dir from the latest wgs-8.3rc2.tar.bz2 package. > At the end of the day we stopped on "undefined reference" errors on static libraries (mainly libseq.a, please see make_progs.log file). > > The 'gmake install' command within the kmer/ dir ran just fine. 
> > The following indicates BGQ OS type: > [erv3@bgq-fn src]$ uname -a > Linux bgq-fn.rcsg.rice.edu <http://bgq-fn.rcsg.rice.edu/> 2.6.32-431.el6.ppc64 #1 SMP Sun Nov 10 22:17:43 EST 2013 ppc64 ppc64 ppc64 GNU/Linux > > We also had to edit c_make.as <http://c_make.as/> file, adding some -I options (to indicate paths to libraries) on the CFLAGS fields from the "OSTYPE, Linux" section. > > Running "make objs" and "make libs" separately, everything appeared to work fine (see attached files make_objs.log and make_libs.log). > The above mentioned trouble came up on the "make progs" final command we ran (make_progs.log file). > > Well, just to let you guys know and to see whether some light can be shed. > > Thanks a lot, > Cheers, > Elton > > PS: I also noticed about the MPI cluster system on BGQ, Brian. So, do you think it isn't worthwhile keeping the attempt to install CA on BGQ? > > > > > > -- > Elton Vasconcelos, DVM, PhD > Post-doc at Verjovski-Almeida Lab > Department of Biochemistry - Institute of Chemistry > University of Sao Paulo, Brazil > > |
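Serge's 3-4K CPU-hour figure above follows from a simple scaling of the D. melanogaster numbers. The linear-in-genome-size assumption below is mine, for illustration only; it gives a ballpark floor, and the quoted 3-4K range adds headroom on top:

```shell
#!/bin/sh
# Naive extrapolation: assume correction/assembly CPU hours scale
# linearly with genome size at similar coverage (an assumption for
# illustration, not a benchmark).
dmel_mb=122       # D. melanogaster genome size in Mb
dmel_cpuh=700     # CA 8.3 upper-bound timing cited in the thread
target_mb=380     # genome size discussed in this thread

estimate=$(awk -v s="$dmel_mb" -v h="$dmel_cpuh" -v t="$target_mb" \
  'BEGIN { printf "%d", t / s * h }')
echo "scaled estimate: ${estimate} CPU hours"   # ~2180
```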
From: Brian W. <th...@gm...> - 2015-06-03 01:13:06
|
That's an old page. The most recent page, linked from http://wgs-assembler.sourceforge.net/wiki/index.php?title=Main_Page, is: http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR (look for 'self correction') I've run drosophilia on my 12-core development machine in a few hours to overnight (I haven't timed it). Sergey replaced blasr with a much much faster algorithm, and that was where most of the time was spent. b On Tue, Jun 2, 2015 at 9:02 PM, Elton Vasconcelos <elt...@iq...> wrote: > Thanks for the hints, Brian! > > We'll try everything you suggested tomorrow, back in the lab. > Then I'll tell you what we got. > For now, I only wanna say that our main concern, instead of running runCA > itself, is gonna be with the pre-assembly (correction) step, running > PacBiotoCA and PBcR pipeline that are embedded in the wgs package. > Please take a look at the following strategy to assemble the Drosophila > genome sequenced by PacBio technology (which presents a high error rate on > the base calling, ~15%) at CBCB in Maryland : > http://cbcb.umd.edu/software/PBcR/dmel.html > They mentioned 621K CPU hours to correct that genome of ~122 Mb. > Our organism genome is something like 380 Mb long. Three times > Drosophila's one. > Well, just to let you know again! ;-) > > Talk to you later, > Thanks again. > Good night! > Elton > > 2015-06-02 20:19 GMT-03:00 Brian Walenz <th...@gm...>: > >> For the link problems - all those symbols come out of the kmer package. >> Check that the flags and compilers and whatnot are compatible with those in >> wgs-assembler. >> >> The kmer configuration is a bit awkward. A shell script (configure.sh) >> dumps a config to Make.compilers, which is read by the main Makefile. >> 'gmake real-clean' will remove the previous build AND the Make.compilers >> file. 'gmake' by itself will first build a Make.compilers by calling >> configure.sh, then continue on with the build. 
The proper way to modify >> this is: >> >> edit configure.sh >> gmake real-clean >> gmake install >> repeat until it works >> >> In configure.sh, there is a block of flags for Linux-amd64. I think >> it'll be easy to apply the same changes made for wgs-assembler. >> >> After rebuilding kmer, the wgs-assembler build should need to just link >> -- in other words, remove just wgs-assembler/Linux-amd64/bin -- don't do >> 'gmake clean' here! You might need to remove the dependency directory >> 'dep' too. >> >> >> For running - the assembler will emit an SGE submit command to run a >> single shell script on tens-to-hundreds-to-thousands of jobs. Each job >> will be 8-32gb (tunable) and 1-32 cores (nothing special here: more is >> faster, fewer is slower). If you can figure out how to run jobs of the >> form "command.sh 1", "command.sh 2", "command.sh 3", ..., "command.sh N" on >> on BG/Q you're most of the way to running CA. To make it output such a >> submit command, supply "useGrid=1 scriptOnGrid=0" to runCA. >> >> The other half of the assembler will be either large I/O or large >> memory. If you've got access to a machine with 256gb and 32 cores you >> should be fine. I don't know what a minimum usable machine size would be. >> >> So, the flow of the computer will be: >> >> On the 256gb machine: runCA useGrid=1 scriptOnGrid=0 .... >> Wait for it to emit a submit command >> Launch those jobs on BG/Q >> Wait for those to finish >> Relaunch runCA on the 256gb machine. It'll check that the job outputs >> are complete, and continue processing, probably emitting another submit >> command, so repeat. >> >> Historical note: back when runCA was first developed, we had a DEC Alpha >> Tru64 machine with 4 CPUs and 32gb of RAM, and a grid of a few hundred 2 >> CPU, 2gb, 32-bit Linux machines. The Alpha wasn't in the grid, and a >> different architecture anyway, so we had to run CA this way. It was a real >> chore. We're all spoiled with our 4 core 8gb laptops now... 
>> >> b >> >> >> >> >> >> >> On Tue, Jun 2, 2015 at 5:49 PM, Elton Vasconcelos <elt...@iq...> >> wrote: >> >>> Thanks Brian, Serge and Huang, >>> >>> We've gone through fixing several error messages during the compilation >>> within the src/ dir from the latest wgs-8.3rc2.tar.bz2 package. >>> At the end of the day we stopped on "undefined reference" errors on >>> static libraries (mainly libseq.a, please see make_progs.log file). >>> >>> The 'gmake install' command within the kmer/ dir ran just fine. >>> >>> The following indicates BGQ OS type: >>> [erv3@bgq-fn src]$ uname -a >>> Linux bgq-fn.rcsg.rice.edu 2.6.32-431.el6.ppc64 #1 SMP Sun Nov 10 >>> 22:17:43 EST 2013 ppc64 ppc64 ppc64 GNU/Linux >>> >>> We also had to edit c_make.as file, adding some -I options (to indicate >>> paths to libraries) on the CFLAGS fields from the "OSTYPE, Linux" section. >>> >>> Running "make objs" and "make libs" separately, everything appeared to >>> work fine (see attached files make_objs.log and make_libs.log). >>> The above mentioned trouble came up on the "make progs" final command we >>> ran (make_progs.log file). >>> >>> Well, just to let you guys know and to see whether some light can be >>> shed. >>> >>> Thanks a lot, >>> Cheers, >>> Elton >>> >>> PS: I also noticed about the MPI cluster system on BGQ, Brian. So, do >>> you think it isn't worthwhile keeping the attempt to install CA on BGQ? >>> >>> >>> > > > -- > Elton Vasconcelos, DVM, PhD > Post-doc at Verjovski-Almeida Lab > Department of Biochemistry - Institute of Chemistry > University of Sao Paulo, Brazil > > |
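Brian's manual-grid workflow above can be condensed into a sketch. The directory, script name, and job count are placeholders (runCA prints the real ones when it stops), and the echo stands in for whatever job-submission mechanism BG/Q provides:

```shell
#!/bin/sh
# Manual-grid loop: runCA halts when array jobs are needed; you run
# "command.sh 1" .. "command.sh N" yourself, then relaunch runCA so it
# verifies the outputs and continues (or emits the next batch).
set -e
ASM=/path/to/assembly        # placeholder working directory
N_JOBS=4                     # placeholder; runCA reports the real count

# 1) On the big-memory machine (commented out -- needs a real install):
#      runCA -d "$ASM" -p asm useGrid=1 scriptOnGrid=0 reads.frg

# 2) Launch the emitted jobs; a plain loop stands in for the scheduler.
i=1
while [ "$i" -le "$N_JOBS" ]; do
  echo "would submit: $ASM/overlap.sh $i"
  i=$((i + 1))
done

# 3) Re-run the same runCA command and repeat until no jobs are emitted.
```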
From: Elton V. <elt...@iq...> - 2015-06-03 01:02:35
|
Thanks for the hints, Brian! We'll try everything you suggested tomorrow, back in the lab. Then I'll tell you what we got. For now, I only wanna say that our main concern, instead of running runCA itself, is gonna be with the pre-assembly (correction) step, running PacBiotoCA and PBcR pipeline that are embedded in the wgs package. Please take a look at the following strategy to assemble the Drosophila genome sequenced by PacBio technology (which presents a high error rate on the base calling, ~15%) at CBCB in Maryland : http://cbcb.umd.edu/software/PBcR/dmel.html They mentioned 621K CPU hours to correct that genome of ~122 Mb. Our organism genome is something like 380 Mb long. Three times Drosophila's one. Well, just to let you know again! ;-) Talk to you later, Thanks again. Good night! Elton 2015-06-02 20:19 GMT-03:00 Brian Walenz <th...@gm...>: > For the link problems - all those symbols come out of the kmer package. > Check that the flags and compilers and whatnot are compatible with those in > wgs-assembler. > > The kmer configuration is a bit awkward. A shell script (configure.sh) > dumps a config to Make.compilers, which is read by the main Makefile. > 'gmake real-clean' will remove the previous build AND the Make.compilers > file. 'gmake' by itself will first build a Make.compilers by calling > configure.sh, then continue on with the build. The proper way to modify > this is: > > edit configure.sh > gmake real-clean > gmake install > repeat until it works > > In configure.sh, there is a block of flags for Linux-amd64. I think it'll > be easy to apply the same changes made for wgs-assembler. > > After rebuilding kmer, the wgs-assembler build should need to just link -- > in other words, remove just wgs-assembler/Linux-amd64/bin -- don't do > 'gmake clean' here! You might need to remove the dependency directory > 'dep' too. 
> > > For running - the assembler will emit an SGE submit command to run a > single shell script on tens-to-hundreds-to-thousands of jobs. Each job > will be 8-32gb (tunable) and 1-32 cores (nothing special here: more is > faster, fewer is slower). If you can figure out how to run jobs of the > form "command.sh 1", "command.sh 2", "command.sh 3", ..., "command.sh N" on > on BG/Q you're most of the way to running CA. To make it output such a > submit command, supply "useGrid=1 scriptOnGrid=0" to runCA. > > The other half of the assembler will be either large I/O or large memory. > If you've got access to a machine with 256gb and 32 cores you should be > fine. I don't know what a minimum usable machine size would be. > > So, the flow of the computer will be: > > On the 256gb machine: runCA useGrid=1 scriptOnGrid=0 .... > Wait for it to emit a submit command > Launch those jobs on BG/Q > Wait for those to finish > Relaunch runCA on the 256gb machine. It'll check that the job outputs are > complete, and continue processing, probably emitting another submit > command, so repeat. > > Historical note: back when runCA was first developed, we had a DEC Alpha > Tru64 machine with 4 CPUs and 32gb of RAM, and a grid of a few hundred 2 > CPU, 2gb, 32-bit Linux machines. The Alpha wasn't in the grid, and a > different architecture anyway, so we had to run CA this way. It was a real > chore. We're all spoiled with our 4 core 8gb laptops now... > > b > > > > > > > On Tue, Jun 2, 2015 at 5:49 PM, Elton Vasconcelos <elt...@iq...> > wrote: > >> Thanks Brian, Serge and Huang, >> >> We've gone through fixing several error messages during the compilation >> within the src/ dir from the latest wgs-8.3rc2.tar.bz2 package. >> At the end of the day we stopped on "undefined reference" errors on >> static libraries (mainly libseq.a, please see make_progs.log file). >> >> The 'gmake install' command within the kmer/ dir ran just fine. 
>> >> The following indicates BGQ OS type: >> [erv3@bgq-fn src]$ uname -a >> Linux bgq-fn.rcsg.rice.edu 2.6.32-431.el6.ppc64 #1 SMP Sun Nov 10 >> 22:17:43 EST 2013 ppc64 ppc64 ppc64 GNU/Linux >> >> We also had to edit c_make.as file, adding some -I options (to indicate >> paths to libraries) on the CFLAGS fields from the "OSTYPE, Linux" section. >> >> Running "make objs" and "make libs" separately, everything appeared to >> work fine (see attached files make_objs.log and make_libs.log). >> The above mentioned trouble came up on the "make progs" final command we >> ran (make_progs.log file). >> >> Well, just to let you guys know and to see whether some light can be shed. >> >> Thanks a lot, >> Cheers, >> Elton >> >> PS: I also noticed about the MPI cluster system on BGQ, Brian. So, do you >> think it isn't worthwhile keeping the attempt to install CA on BGQ? >> >> >> -- Elton Vasconcelos, DVM, PhD Post-doc at Verjovski-Almeida Lab Department of Biochemistry - Institute of Chemistry University of Sao Paulo, Brazil |
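Those CBCB numbers can be put in perspective with a back-of-envelope calculation. The scaling below is a naive assumption (correction cost linear in genome size alone), not something from the PBcR documentation — coverage, read length, and repeat content all shift it in practice:

```shell
# Naive linear scaling of the CBCB Drosophila correction cost to a
# 380 Mb genome. Assumption: CPU hours scale with genome size only.
dmel_cpu_hours=621000   # reported for the ~122 Mb D. melanogaster run
dmel_mb=122
target_mb=380

est=$(awk -v h="$dmel_cpu_hours" -v d="$dmel_mb" -v t="$target_mb" \
      'BEGIN { printf "%.0f", h * t / d }')
echo "scaled correction estimate: ${est} CPU hours"
echo "on 800 cores: $(awk -v e="$est" 'BEGIN { printf "%.0f", e / 800 / 24 }') days (ideal)"
```

This is what motivates the concern above: the old Drosophila figures scale to roughly 1.9M CPU hours, far beyond a 30-day budget on 800 cores (later messages in this thread report much lower costs with current versions).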
From: Brian W. <th...@gm...> - 2015-06-02 23:19:08
|
For the link problems - all those symbols come out of the kmer package. Check that the flags and compilers and whatnot are compatible with those in wgs-assembler. The kmer configuration is a bit awkward. A shell script (configure.sh) dumps a config to Make.compilers, which is read by the main Makefile. 'gmake real-clean' will remove the previous build AND the Make.compilers file. 'gmake' by itself will first build a Make.compilers by calling configure.sh, then continue on with the build. The proper way to modify this is: edit configure.sh gmake real-clean gmake install repeat until it works In configure.sh, there is a block of flags for Linux-amd64. I think it'll be easy to apply the same changes made for wgs-assembler. After rebuilding kmer, the wgs-assembler build should need to just link -- in other words, remove just wgs-assembler/Linux-amd64/bin -- don't do 'gmake clean' here! You might need to remove the dependency directory 'dep' too. For running - the assembler will emit an SGE submit command to run a single shell script on tens-to-hundreds-to-thousands of jobs. Each job will be 8-32gb (tunable) and 1-32 cores (nothing special here: more is faster, fewer is slower). If you can figure out how to run jobs of the form "command.sh 1", "command.sh 2", "command.sh 3", ..., "command.sh N" on BG/Q you're most of the way to running CA. To make it output such a submit command, supply "useGrid=1 scriptOnGrid=0" to runCA. The other half of the assembler will be either large I/O or large memory. If you've got access to a machine with 256gb and 32 cores you should be fine. I don't know what a minimum usable machine size would be. So, the flow of the computation will be: On the 256gb machine: runCA useGrid=1 scriptOnGrid=0 .... Wait for it to emit a submit command Launch those jobs on BG/Q Wait for those to finish Relaunch runCA on the 256gb machine. 
It'll check that the job outputs are complete, and continue processing, probably emitting another submit command, so repeat. Historical note: back when runCA was first developed, we had a DEC Alpha Tru64 machine with 4 CPUs and 32gb of RAM, and a grid of a few hundred 2 CPU, 2gb, 32-bit Linux machines. The Alpha wasn't in the grid, and a different architecture anyway, so we had to run CA this way. It was a real chore. We're all spoiled with our 4 core 8gb laptops now... b On Tue, Jun 2, 2015 at 5:49 PM, Elton Vasconcelos <elt...@iq...> wrote: > Thanks Brian, Serge and Huang, > > We've gone through fixing several error messages during the compilation > within the src/ dir from the latest wgs-8.3rc2.tar.bz2 package. > At the end of the day we stopped on "undefined reference" errors on static > libraries (mainly libseq.a, please see make_progs.log file). > > The 'gmake install' command within the kmer/ dir ran just fine. > > The following indicates BGQ OS type: > [erv3@bgq-fn src]$ uname -a > Linux bgq-fn.rcsg.rice.edu 2.6.32-431.el6.ppc64 #1 SMP Sun Nov 10 > 22:17:43 EST 2013 ppc64 ppc64 ppc64 GNU/Linux > > We also had to edit c_make.as file, adding some -I options (to indicate > paths to libraries) on the CFLAGS fields from the "OSTYPE, Linux" section. > > Running "make objs" and "make libs" separately, everything appeared to > work fine (see attached files make_objs.log and make_libs.log). > The above mentioned trouble came up on the "make progs" final command we > ran (make_progs.log file). > > Well, just to let you guys know and to see whether some light can be shed. > > Thanks a lot, > Cheers, > Elton > > PS: I also noticed about the MPI cluster system on BGQ, Brian. So, do you > think it isn't worthwhile keeping the attempt to install CA on BGQ? > > > |
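The submit-and-relaunch loop Brian describes can be sketched as a small driver. Here `command.sh` is a stand-in for the script runCA emits (a dummy is created so the sketch runs anywhere); on BG/Q each invocation would be handed to the native batch launcher instead of run directly:

```shell
# Dummy stand-in for the job script runCA writes when given
# "useGrid=1 scriptOnGrid=0"; the real one takes the job index as $1.
cat > command.sh <<'EOF'
#!/bin/sh
echo "job $1 done" >> joblog.txt
EOF
chmod +x command.sh
rm -f joblog.txt

N=4   # runCA prints the real job count in the submit command it emits
for i in $(seq 1 "$N"); do
    ./command.sh "$i"     # on BG/Q: submit to the batch system instead
done

# Once all N jobs have output, rerun the identical runCA command; it
# verifies the outputs and either continues or emits the next batch.
echo "$(wc -l < joblog.txt) of $N jobs complete"
```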
From: Elton V. <elt...@iq...> - 2015-06-02 21:49:39
|
Thanks Brian, Serge and Huang, We've gone through fixing several error messages during the compilation within the src/ dir from the latest wgs-8.3rc2.tar.bz2 package. At the end of the day we stopped on "undefined reference" errors on static libraries (mainly libseq.a, please see make_progs.log file). The 'gmake install' command within the kmer/ dir ran just fine. The following indicates BGQ OS type: [erv3@bgq-fn src]$ uname -a Linux bgq-fn.rcsg.rice.edu 2.6.32-431.el6.ppc64 #1 SMP Sun Nov 10 22:17:43 EST 2013 ppc64 ppc64 ppc64 GNU/Linux We also had to edit c_make.as file, adding some -I options (to indicate paths to libraries) on the CFLAGS fields from the "OSTYPE, Linux" section. Running "make objs" and "make libs" separately, everything appeared to work fine (see attached files make_objs.log and make_libs.log). The above mentioned trouble came up on the "make progs" final command we ran (make_progs.log file). Well, just to let you guys know and to see whether some light can be shed. Thanks a lot, Cheers, Elton PS: I also noticed about the MPI cluster system on BGQ, Brian. So, do you think it isn't worthwhile keeping the attempt to install CA on BGQ? 2015-06-02 17:15 GMT-03:00 Walenz, Brian <wa...@nb...>: > Poking around a bit too, it looks like BlueGene/P only supports MPI, which > the assembler doesn't. The assembler needs SGE (or PBS or LSF) to run > independent threaded jobs. > > BlueGene/Q has 16 threads and 16 GB per node. This is a better match to > assembler workloads. It'll still need some kind of batch scheduler. > > b > > > > -----Original Message----- > From: Serge Koren [mailto:ser...@gm...] 
> Sent: Tuesday, June 02, 2015 2:48 PM > To: Brian Walenz > Cc: wgs...@li...; Elton Vasconcelos > Subject: Re: [wgs-assembler-users] CA on BlueGene server at Rice University > > Looking at the system description, assuming this is the system: > http://www.rcsg.rice.edu/sharecore/bluegenep-bgp/ < > http://www.rcsg.rice.edu/sharecore/bluegenep-bgp/> > > The cores are 32-bit which would limit your processes to 4GB and wouldn’t > work well for assembly. Plus, we haven’t compiled/tested the assembler on > 32-bit platforms in years so I don’t think it’s worth your time to try to > compile it on there. How big is your genome? A 70X human correction takes > about 8K cpu hours to generate corrected reads (is that what you mean by > pre-assemble?) and total runtime (including correct + assemble) is about > 50K cpu so with 2GHz CPUs you should be closer to a week for a full run not > 30 days with almost 1000 cores (there is some overhead so not all steps > would use all your cores). The assembler supports multiple grid systems > (SGE, LSF, PBS) with a shared filesystem and I see Rice University has some > clusters available so I’d recommend using one of those rather than > recompiling. > > > > On Jun 2, 2015, at 11:11 AM, Brian Walenz <th...@gm...> wrote: > > > > Nope, not on our mind. A lack of access is the primary problem, > followed closely by a lack of time and a lack of demand. > > > > There isn't anything fancy or gcc-specific in the code though. It does > compile with clang (without thread support, but that's clang's fault). > Mucking with c_make.as <http://c_make.as/> might be all that is needed. > To start, try copying the 'Darwin' section, and changing the OSTYPE test to > whatever 'uname' reports, and the compiler to icc. Icc will probably > complain about ARCH_CFLAGS, so might as well get rid of all of those. -O > (optimize) is pretty generic. > > > > b > > > > > > > > On Tue, Jun 2, 2015 at 9:41 AM, Elton Vasconcelos <elt...@iq... 
> <mailto:elt...@iq...>> wrote: > > Hello folks, > > > > I wonder whether it is on CA developers mind to generate a wgs-assembler > version that is compatible with IBM compilers, so we could run it on BG/P > and/or BG/Q supercomputers at Rice University. > > I am trying to pre-assemble my target genome sequenced by PacBio > technology. By my calculations, I am gonna need 800 CPUs to run it in 30 > days. > > > > Thanks in advance for your attention, > > Cheers, > > Elton > > > > -- > > Elton Vasconcelos, DVM, PhD > > Post-doc at Verjovski-Almeida Lab > > Department of Biochemistry - Institute of Chemistry University of Sao > > Paulo, Brazil > > > > > > ---------------------------------------------------------------------- > > -------- > > > > _______________________________________________ > > wgs-assembler-users mailing list > > wgs...@li... > > <mailto:wgs...@li...> > > https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users > > <https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users> > > > > > > ---------------------------------------------------------------------- > > -------- _______________________________________________ > > wgs-assembler-users mailing list > > wgs...@li... > > https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users > > -- Elton Vasconcelos, DVM, PhD Post-doc at Verjovski-Almeida Lab Department of Biochemistry - Institute of Chemistry University of Sao Paulo, Brazil |
From: Walenz, B. <wa...@nb...> - 2015-06-02 20:51:07
|
Poking around a bit too, it looks like BlueGene/P only supports MPI, which the assembler doesn't. The assembler needs SGE (or PBS or LSF) to run independent threaded jobs. BlueGene/Q has 16 threads and 16 GB per node. This is a better match to assembler workloads. It'll still need some kind of batch scheduler. b -----Original Message----- From: Serge Koren [mailto:ser...@gm...] Sent: Tuesday, June 02, 2015 2:48 PM To: Brian Walenz Cc: wgs...@li...; Elton Vasconcelos Subject: Re: [wgs-assembler-users] CA on BlueGene server at Rice University Looking at the system description, assuming this is the system: http://www.rcsg.rice.edu/sharecore/bluegenep-bgp/ <http://www.rcsg.rice.edu/sharecore/bluegenep-bgp/> The cores are 32-bit which would limit your processes to 4GB and wouldn’t work well for assembly. Plus, we haven’t compiled/tested the assembler on 32-bit platforms in years so I don’t think it’s worth your time to try to compile it on there. How big is your genome? A 70X human correction takes about 8K cpu hours to generate corrected reads (is that what you mean by pre-assemble?) and total runtime (including correct + assemble) is about 50K cpu so with 2GHz CPUs you should be closer to a week for a full run not 30 days with almost 1000 cores (there is some overhead so not all steps would use all your cores). The assembler supports multiple grid systems (SGE, LSF, PBS) with a shared filesystem and I see Rice University has some clusters available so I’d recommend using one of those rather than recompiling. > On Jun 2, 2015, at 11:11 AM, Brian Walenz <th...@gm...> wrote: > > Nope, not on our mind. A lack of access is the primary problem, followed closely by a lack of time and a lack of demand. > > There isn't anything fancy or gcc-specific in the code though. It does compile with clang (without thread support, but that's clang's fault). Mucking with c_make.as <http://c_make.as/> might be all that is needed. 
To start, try copying the 'Darwin' section, and changing the OSTYPE test to whatever 'uname' reports, and the compiler to icc. Icc will probably complain about ARCH_CFLAGS, so might as well get rid of all of those. -O (optimize) is pretty generic. > > b > > > > On Tue, Jun 2, 2015 at 9:41 AM, Elton Vasconcelos <elt...@iq... <mailto:elt...@iq...>> wrote: > Hello folks, > > I wonder whether it is on CA developers mind to generate a wgs-assembler version that is compatible with IBM compilers, so we could run it on BG/P and/or BG/Q supercomputers at Rice University. > I am trying to pre-assemble my target genome sequenced by PacBio technology. By my calculations, I am gonna need 800 CPUs to run it in 30 days. > > Thanks in advance for your attention, > Cheers, > Elton > > -- > Elton Vasconcelos, DVM, PhD > Post-doc at Verjovski-Almeida Lab > Department of Biochemistry - Institute of Chemistry University of Sao > Paulo, Brazil > > > ---------------------------------------------------------------------- > -------- > > _______________________________________________ > wgs-assembler-users mailing list > wgs...@li... > <mailto:wgs...@li...> > https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users > <https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users> > > > ---------------------------------------------------------------------- > -------- _______________________________________________ > wgs-assembler-users mailing list > wgs...@li... > https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users |
From: Serge K. <ser...@gm...> - 2015-06-02 18:47:45
|
Looking at the system description, assuming this is the system: http://www.rcsg.rice.edu/sharecore/bluegenep-bgp/ <http://www.rcsg.rice.edu/sharecore/bluegenep-bgp/> The cores are 32-bit which would limit your processes to 4GB and wouldn’t work well for assembly. Plus, we haven’t compiled/tested the assembler on 32-bit platforms in years so I don’t think it’s worth your time to try to compile it on there. How big is your genome? A 70X human correction takes about 8K cpu hours to generate corrected reads (is that what you mean by pre-assemble?) and total runtime (including correct + assemble) is about 50K cpu so with 2GHz CPUs you should be closer to a week for a full run not 30 days with almost 1000 cores (there is some overhead so not all steps would use all your cores). The assembler supports multiple grid systems (SGE, LSF, PBS) with a shared filesystem and I see Rice University has some clusters available so I’d recommend using one of those rather than recompiling. > On Jun 2, 2015, at 11:11 AM, Brian Walenz <th...@gm...> wrote: > > Nope, not on our mind. A lack of access is the primary problem, followed closely by a lack of time and a lack of demand. > > There isn't anything fancy or gcc-specific in the code though. It does compile with clang (without thread support, but that's clang's fault). Mucking with c_make.as <http://c_make.as/> might be all that is needed. To start, try copying the 'Darwin' section, and changing the OSTYPE test to whatever 'uname' reports, and the compiler to icc. Icc will probably complain about ARCH_CFLAGS, so might as well get rid of all of those. -O (optimize) is pretty generic. > > b > > > > On Tue, Jun 2, 2015 at 9:41 AM, Elton Vasconcelos <elt...@iq... <mailto:elt...@iq...>> wrote: > Hello folks, > > I wonder whether it is on CA developers mind to generate a wgs-assembler version that is compatible with IBM compilers, so we could run it on BG/P and/or BG/Q supercomputers at Rice University. 
> I am trying to pre-assemble my target genome sequenced by PacBio technology. By my calculations, I am gonna need 800 CPUs to run it in 30 days. > > Thanks in advance for your attention, > Cheers, > Elton > > -- > Elton Vasconcelos, DVM, PhD > Post-doc at Verjovski-Almeida Lab > Department of Biochemistry - Institute of Chemistry > University of Sao Paulo, Brazil > > > ------------------------------------------------------------------------------ > > _______________________________________________ > wgs-assembler-users mailing list > wgs...@li... <mailto:wgs...@li...> > https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users <https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users> > > > ------------------------------------------------------------------------------ > _______________________________________________ > wgs-assembler-users mailing list > wgs...@li... > https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users |
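Serge's figures translate to wall-clock time by simple division. The core count below is Elton's proposed 800, and the result ignores serial stages and scheduler overhead, which is why "closer to a week" is the realistic answer:

```shell
cpu_hours=50000   # Serge's estimate for a full correct + assemble run
cores=800         # Elton's proposed allocation

days=$(awk -v h="$cpu_hours" -v c="$cores" 'BEGIN { printf "%.1f", h / c / 24 }')
echo "ideal wall clock: ${days} days"   # overhead pushes this toward a week
```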
From: Brian W. <th...@gm...> - 2015-06-02 15:12:01
|
Nope, not on our mind. A lack of access is the primary problem, followed closely by a lack of time and a lack of demand. There isn't anything fancy or gcc-specific in the code though. It does compile with clang (without thread support, but that's clang's fault). Mucking with c_make.as might be all that is needed. To start, try copying the 'Darwin' section, and changing the OSTYPE test to whatever 'uname' reports, and the compiler to icc. Icc will probably complain about ARCH_CFLAGS, so might as well get rid of all of those. -O (optimize) is pretty generic. b On Tue, Jun 2, 2015 at 9:41 AM, Elton Vasconcelos <elt...@iq...> wrote: > Hello folks, > > I wonder whether it is on CA developers mind to generate a wgs-assembler > version that is compatible with IBM compilers, so we could run it on BG/P > and/or BG/Q supercomputers at Rice University. > I am trying to pre-assemble my target genome sequenced by PacBio > technology. By my calculations, I am gonna need 800 CPUs to run it in 30 > days. > > Thanks in advance for your attention, > Cheers, > Elton > > -- > Elton Vasconcelos, DVM, PhD > Post-doc at Verjovski-Almeida Lab > Department of Biochemistry - Institute of Chemistry > University of Sao Paulo, Brazil > > > > ------------------------------------------------------------------------------ > > _______________________________________________ > wgs-assembler-users mailing list > wgs...@li... > https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users > > |
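Brian's porting recipe boils down to teaching the makefile's OSTYPE switch about the new platform. The sketch below only illustrates that idea; the names mirror his description (OSTYPE from `uname`, compiler choice, ARCH_CFLAGS) and are not the literal contents of c_make.as:

```shell
# c_make.as keys its flag blocks on what `uname` reports; on a BG/Q
# front-end node that is plain 'Linux', so that block is the one to
# adapt (or copy from the Darwin section, per Brian's suggestion).
ostype=$(uname -s)

# Illustrative per-OS compiler selection, not verbatim makefile syntax:
case "$ostype" in
    Darwin) cc=clang ;;
    Linux)  cc=icc   ;;   # assumption: Intel compiler on the front end
    *)      cc=gcc   ;;
esac
echo "OSTYPE=${ostype}, would set CC=${cc} and drop the x86 ARCH_CFLAGS"
```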
From: Elton V. <elt...@iq...> - 2015-06-02 14:48:13
|
Hello folks, I wonder whether it is on CA developers mind to generate a wgs-assembler version that is compatible with IBM compilers, so we could run it on BG/P and/or BG/Q supercomputers at Rice University. I am trying to pre-assemble my target genome sequenced by PacBio technology. By my calculations, I am gonna need 800 CPUs to run it in 30 days. Thanks in advance for your attention, Cheers, Elton -- Elton Vasconcelos, DVM, PhD Post-doc at Verjovski-Almeida Lab Department of Biochemistry - Institute of Chemistry University of Sao Paulo, Brazil |
From: Serge K. <ser...@gm...> - 2015-05-28 14:39:51
|
Yes, the 25X coverage is for after correction. However, 12X should be sufficient for a reasonable assembly, certainly not a tiny fraction of your genome like you’re seeing. If you run gatekeeper -dumpinfo PI440795_Self_Assembled/asm.gkpStore That should give more info on what reads made it into the assembly. > On May 28, 2015, at 10:37 AM, Seth Munholland <mu...@uw...> wrote: > > After correction I ended up with about 12x coverage. I presume the ~25x coverage suggested on the PBcR page is for after correction? I'll try the low-coverage parameters and double check BLASR/PBDAGCON next, thanks. > > Seth Munholland, B.Sc. > Department of Biological Sciences > Rm. 304 Biology Building > University of Windsor > 401 Sunset Ave. N9B 3P4 > T: (519) 253-3000 Ext: 4755 <> > On Wed, May 27, 2015 at 6:56 PM, Serge Koren <ser...@gm... <mailto:ser...@gm...>> wrote: > That most likely means you ended up with too little coverage for assembly after correction. You can check the coverage in the PI440795_Self_Assembled*.fastq files. If you’re not already, I’d suggest using the low-coverage parameters on the wiki page: > http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Low_Coverage_Assembly <http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Low_Coverage_Assembly> > > I’d also double-check that you have BLASR/PBDAGCON available in your path and that it is being used for assembly (in your tempPI440795_Self_Assembled/runPartition.sh file look for the word pbdagcon). > >> On May 26, 2015, at 12:03 PM, Seth Munholland <mu...@uw... <mailto:mu...@uw...>> wrote: >> >> Hi Serge, >> >> I looked into the 9-terminator folder and found the asm.utg.fasta file, but it's only 4.5MB (~0.007x coverage) when I started wth ~33x coverage. Any suggestions for where to look for the data loss? >> >> Seth Munholland, B.Sc. >> Department of Biological Sciences >> Rm. 304 Biology Building >> University of Windsor >> 401 Sunset Ave. 
N9B 3P4 >> T: (519) 253-3000 Ext: 4755 <> >> On Tue, May 26, 2015 at 11:18 AM, Serge Koren <ser...@gm... <mailto:ser...@gm...>> wrote: >> Hi, >> >> This was a bug fixed in CA 8.3rc2 (when the assembly of the corrected data failed, the restart did not work properly). If you grep for runCA in your command line output from your run and re-run the last command (it should have the library name as the -d option). That will re-create the 9-terminator directory and corresponding files. Unless you install the missing perl package, the qc generation will still fail, but it only contains statistics on the assembly, the asm.utg.fasta file should be your complete assembly. >> >> Serge >> >> >>> On May 26, 2015, at 10:49 AM, Seth Munholland <mu...@uw... <mailto:mu...@uw...>> wrote: >>> >>> Hello Everyone, >>> >>> I was running a PBcR through to assembly with nothing in my spec file excet memory options since I share the server. I got all the way to step 9 (terminator) when I got the following error: >>> >>> ----------------------------------------START Tue May 26 02:18:24 2015 >>> /usr/bin/env perl /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl <http://caqc.pl/> -euid /lore/bill.crosby.storage/PI440795/PI440795_Self_Assembled/9-terminator/asm.asm >>> Can't locate Statistics/Descriptive.pm in @INC (@INC contains: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl <http://caqc.pl/> line 18. >>> BEGIN failed--compilation aborted at /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl <http://caqc.pl/> line 18. 
>>> ----------------------------------------END Tue May 26 02:18:24 2015 (0 seconds) >>> ERROR: Failed with signal INT (2) >>> The Cleaner has arrived. Doing 'none'. >>> ----------------------------------------END Tue May 26 02:18:24 2015 (1490 seconds) >>> >>> I google search tells me that I can try manually running asmOutputFasta to try and make the missing output fasta (http://sourceforge.net/p/wgs-assembler/mailman/message/33260123/ <http://sourceforge.net/p/wgs-assembler/mailman/message/33260123/>). When I try it the fasta files are only ~3MB. The same link warns that the asm may be incomplete and I might have to repeat step 9 in runCA, this is where I get stuck. >>> >>> I've renamed the 9-terminator folder to 9-terminator-old, but what is the command for runCA to pickup a PBcR run? I tried specifying the directory, prefix, and spec file and after changing to the hash memory options in my spec file i get: >>> >>> Failure message: >>> >>> no fragment files specified, and stores not already created >>> >>> While trying to rerun the PBcR command again gives: >>> >>> Error: requested to output PI440795_Self_Assembled.frg but file already exists. Will not overwrite. >>> >>> >>> Seth Munholland, B.Sc. >>> Department of Biological Sciences >>> Rm. 304 Biology Building >>> University of Windsor >>> 401 Sunset Ave. N9B 3P4 >>> T: (519) 253-3000 Ext: 4755 <>------------------------------------------------------------------------------ >>> One dashboard for servers and applications across Physical-Virtual-Cloud >>> Widest out-of-the-box monitoring support with 50+ applications >>> Performance metrics, stats and reports that give you Actionable Insights >>> Deep dive visibility with transaction tracing using APM Insight. 
>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y_______________________________________________ <http://ad.doubleclick.net/ddm/clk/290420510;117567292;y_______________________________________________> >>> wgs-assembler-users mailing list >>> wgs...@li... <mailto:wgs...@li...> >>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users <https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users> >> >> > > |
From: Seth M. <mu...@uw...> - 2015-05-28 14:37:16
|
After correction I ended up with about 12x coverage. I presume the ~25x coverage suggested on the PBcR page is for after correction? I'll try the low-coverage parameters and double check BLASR/PBDAGCON next, thanks. Seth Munholland, B.Sc. Department of Biological Sciences Rm. 304 Biology Building University of Windsor 401 Sunset Ave. N9B 3P4 T: (519) 253-3000 Ext: 4755 On Wed, May 27, 2015 at 6:56 PM, Serge Koren <ser...@gm...> wrote: > That most likely means you ended up with too little coverage for assembly > after correction. You can check the coverage in the > PI440795_Self_Assembled*.fastq files. If you’re not already, I’d suggest > using the low-coverage parameters on the wiki page: > > http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Low_Coverage_Assembly > > I’d also double-check that you have BLASR/PBDAGCON available in your path > and that it is being used for assembly (in your > tempPI440795_Self_Assembled/runPartition.sh file look for the word > pbdagcon). > > On May 26, 2015, at 12:03 PM, Seth Munholland <mu...@uw...> wrote: > > Hi Serge, > > I looked into the 9-terminator folder and found the asm.utg.fasta file, > but it's only 4.5MB (~0.007x coverage) when I started wth ~33x coverage. > Any suggestions for where to look for the data loss? > > Seth Munholland, B.Sc. > Department of Biological Sciences > Rm. 304 Biology Building > University of Windsor > 401 Sunset Ave. N9B 3P4 > T: (519) 253-3000 Ext: 4755 > > On Tue, May 26, 2015 at 11:18 AM, Serge Koren <ser...@gm...> > wrote: > >> Hi, >> >> This was a bug fixed in CA 8.3rc2 (when the assembly of the corrected >> data failed, the restart did not work properly). If you grep for runCA in >> your command line output from your run and re-run the last command (it >> should have the library name as the -d option). That will re-create the >> 9-terminator directory and corresponding files. 
Unless you install the >> missing perl package, the qc generation will still fail, but it only >> contains statistics on the assembly, the asm.utg.fasta file should be your >> complete assembly. >> >> Serge >> >> >> On May 26, 2015, at 10:49 AM, Seth Munholland <mu...@uw...> >> wrote: >> >> Hello Everyone, >> >> I was running a PBcR through to assembly with nothing in my spec file >> excet memory options since I share the server. I got all the way to step 9 >> (terminator) when I got the following error: >> >> ----------------------------------------START Tue May 26 02:18:24 2015 >> /usr/bin/env perl /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/ >> caqc.pl -euid >> /lore/bill.crosby.storage/PI440795/PI440795_Self_Assembled/9-terminator/asm.asm >> Can't locate Statistics/Descriptive.pm in @INC (@INC contains: >> /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi >> /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl >> /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi >> /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl >> /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at >> /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18. >> BEGIN failed--compilation aborted at >> /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18. >> ----------------------------------------END Tue May 26 02:18:24 2015 (0 >> seconds) >> ERROR: Failed with signal INT (2) >> The Cleaner has arrived. Doing 'none'. >> ----------------------------------------END Tue May 26 02:18:24 2015 >> (1490 seconds) >> >> I google search tells me that I can try manually running asmOutputFasta >> to try and make the missing output fasta ( >> http://sourceforge.net/p/wgs-assembler/mailman/message/33260123/). When >> I try it the fasta files are only ~3MB. The same link warns that the asm >> may be incomplete and I might have to repeat step 9 in runCA, this is where >> I get stuck. 
>> >> I've renamed the 9-terminator folder to 9-terminator-old, but what is the >> command for runCA to pick up a PBcR run? I tried specifying the directory, >> prefix, and spec file and after changing to the hash memory options in my >> spec file I get: >> >> Failure message: >> >> no fragment files specified, and stores not already created >> >> Trying to rerun the PBcR command again gives: >> >> Error: requested to output PI440795_Self_Assembled.frg but file already >> exists. Will not overwrite. >> >> >> Seth Munholland, B.Sc. >> Department of Biological Sciences >> Rm. 304 Biology Building >> University of Windsor >> 401 Sunset Ave. N9B 3P4 >> T: (519) 253-3000 Ext: 4755 >> >> ------------------------------------------------------------------------------ >> One dashboard for servers and applications across Physical-Virtual-Cloud >> Widest out-of-the-box monitoring support with 50+ applications >> Performance metrics, stats and reports that give you Actionable Insights >> Deep dive visibility with transaction tracing using APM Insight. >> >> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y_______________________________________________ >> wgs-assembler-users mailing list >> wgs...@li... >> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users >> >> >> > > |
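The coverage check Serge suggests (total bases in the corrected PI440795_Self_Assembled*.fastq files divided by genome size) can be sketched in shell. The helper name, the demo file, and the toy genome size below are illustrative stand-ins, not part of the thread:

```shell
# estimate_coverage: sum the sequence-line lengths of FASTQ files
# (line 2 of each 4-line record) and divide by an assumed genome size.
estimate_coverage() {
    genome_size=$1; shift
    awk -v size="$genome_size" \
        'NR % 4 == 2 { total += length($0) } END { printf "%.1f\n", total / size }' "$@"
}

# Demo on a tiny stand-in FASTQ: two reads, 12 bases total, "genome size" 4.
printf '@r1\nACGT\n+\nIIII\n@r2\nACGTACGT\n+\nIIIIIIII\n' > demo.fastq
estimate_coverage 4 demo.fastq   # prints 3.0
```

On real data this would be `estimate_coverage <genome_size_bp> PI440795_Self_Assembled*.fastq`; a result far below ~25x is what the low-coverage parameters are meant for.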
From: Serge K. <ser...@gm...> - 2015-05-27 22:56:16
|
That most likely means you ended up with too little coverage for assembly after correction. You can check the coverage in the PI440795_Self_Assembled*.fastq files. If you’re not already, I’d suggest using the low-coverage parameters on the wiki page: http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Low_Coverage_Assembly I’d also double-check that you have BLASR/PBDAGCON available in your path and that it is being used for assembly (in your tempPI440795_Self_Assembled/runPartition.sh file look for the word pbdagcon). > On May 26, 2015, at 12:03 PM, Seth Munholland <mu...@uw...> wrote: > > Hi Serge, > > I looked into the 9-terminator folder and found the asm.utg.fasta file, but it's only 4.5MB (~0.007x coverage) when I started with ~33x coverage. Any suggestions for where to look for the data loss? > > Seth Munholland, B.Sc. > Department of Biological Sciences > Rm. 304 Biology Building > University of Windsor > 401 Sunset Ave. N9B 3P4 > T: (519) 253-3000 Ext: 4755 > On Tue, May 26, 2015 at 11:18 AM, Serge Koren <ser...@gm...> wrote: > Hi, > > This was a bug fixed in CA 8.3rc2 (when the assembly of the corrected data failed, the restart did not work properly). Grep for runCA in your command line output from your run and re-run the last command (it should have the library name as the -d option). That will re-create the 9-terminator directory and corresponding files. Unless you install the missing perl package, the qc generation will still fail, but it only contains statistics on the assembly; the asm.utg.fasta file should be your complete assembly. > > Serge > > >> On May 26, 2015, at 10:49 AM, Seth Munholland <mu...@uw...> wrote: >> >> Hello Everyone, >> >> I was running a PBcR through to assembly with nothing in my spec file except memory options since I share the server. 
I got all the way to step 9 (terminator) when I got the following error: >> >> ----------------------------------------START Tue May 26 02:18:24 2015 >> /usr/bin/env perl /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl -euid /lore/bill.crosby.storage/PI440795/PI440795_Self_Assembled/9-terminator/asm.asm >> Can't locate Statistics/Descriptive.pm in @INC (@INC contains: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18. >> BEGIN failed--compilation aborted at /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18. >> ----------------------------------------END Tue May 26 02:18:24 2015 (0 seconds) >> ERROR: Failed with signal INT (2) >> The Cleaner has arrived. Doing 'none'. >> ----------------------------------------END Tue May 26 02:18:24 2015 (1490 seconds) >> >> A Google search tells me that I can try manually running asmOutputFasta to try and make the missing output fasta (http://sourceforge.net/p/wgs-assembler/mailman/message/33260123/). When I try it the fasta files are only ~3MB. The same link warns that the asm may be incomplete and I might have to repeat step 9 in runCA; this is where I get stuck. >> >> I've renamed the 9-terminator folder to 9-terminator-old, but what is the command for runCA to pick up a PBcR run? 
I tried specifying the directory, prefix, and spec file and after changing to the hash memory options in my spec file I get: >> >> Failure message: >> >> no fragment files specified, and stores not already created >> >> Trying to rerun the PBcR command again gives: >> >> Error: requested to output PI440795_Self_Assembled.frg but file already exists. Will not overwrite. >> >> >> Seth Munholland, B.Sc. >> Department of Biological Sciences >> Rm. 304 Biology Building >> University of Windsor >> 401 Sunset Ave. N9B 3P4 >> T: (519) 253-3000 Ext: 4755 > > |
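The pbdagcon check above boils down to a grep of the partition script. The block below runs against a stand-in file so it is self-contained; on a real run you would point grep at tempPI440795_Self_Assembled/runPartition.sh instead, and the pbdagcon command line in the stand-in is hypothetical:

```shell
# Stand-in for runPartition.sh; on a real run, grep the actual script.
printf '#!/bin/sh\npbdagcon -c 2 aligned.blasr > corrected.fasta\n' > demo_runPartition.sh

if grep -q pbdagcon demo_runPartition.sh; then
    echo "pbdagcon is used for consensus"
else
    echo "pbdagcon not found in the partition script"
fi

# Also worth confirming the binaries are reachable, e.g.:
#   command -v blasr pbdagcon
```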
From: Seth M. <mu...@uw...> - 2015-05-26 16:03:36
|
Hi Serge, I looked into the 9-terminator folder and found the asm.utg.fasta file, but it's only 4.5MB (~0.007x coverage) when I started with ~33x coverage. Any suggestions for where to look for the data loss? Seth Munholland, B.Sc. Department of Biological Sciences Rm. 304 Biology Building University of Windsor 401 Sunset Ave. N9B 3P4 T: (519) 253-3000 Ext: 4755 On Tue, May 26, 2015 at 11:18 AM, Serge Koren <ser...@gm...> wrote: > Hi, > > This was a bug fixed in CA 8.3rc2 (when the assembly of the corrected data > failed, the restart did not work properly). Grep for runCA in your > command line output from your run and re-run the last command (it should > have the library name as the -d option). That will re-create the > 9-terminator directory and corresponding files. Unless you install the > missing perl package, the qc generation will still fail, but it only > contains statistics on the assembly; the asm.utg.fasta file should be your > complete assembly. > > Serge > > > On May 26, 2015, at 10:49 AM, Seth Munholland <mu...@uw...> wrote: > > Hello Everyone, > > I was running a PBcR through to assembly with nothing in my spec file > except memory options since I share the server. I got all the way to step 9 > (terminator) when I got the following error: > > ----------------------------------------START Tue May 26 02:18:24 2015 > /usr/bin/env perl /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/ > caqc.pl -euid > /lore/bill.crosby.storage/PI440795/PI440795_Self_Assembled/9-terminator/asm.asm > Can't locate Statistics/Descriptive.pm in @INC (@INC contains: > /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi > /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl > /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi > /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl > /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at > /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18. 
> BEGIN failed--compilation aborted at > /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18. > ----------------------------------------END Tue May 26 02:18:24 2015 (0 > seconds) > ERROR: Failed with signal INT (2) > The Cleaner has arrived. Doing 'none'. > ----------------------------------------END Tue May 26 02:18:24 2015 (1490 > seconds) > > A Google search tells me that I can try manually running asmOutputFasta to > try and make the missing output fasta ( > http://sourceforge.net/p/wgs-assembler/mailman/message/33260123/). When > I try it the fasta files are only ~3MB. The same link warns that the asm > may be incomplete and I might have to repeat step 9 in runCA; this is where > I get stuck. > > I've renamed the 9-terminator folder to 9-terminator-old, but what is the > command for runCA to pick up a PBcR run? I tried specifying the directory, > prefix, and spec file and after changing to the hash memory options in my > spec file I get: > > Failure message: > > no fragment files specified, and stores not already created > > Trying to rerun the PBcR command again gives: > > Error: requested to output PI440795_Self_Assembled.frg but file already > exists. Will not overwrite. > > > Seth Munholland, B.Sc. > Department of Biological Sciences > Rm. 304 Biology Building > University of Windsor > 401 Sunset Ave. N9B 3P4 > T: (519) 253-3000 Ext: 4755 > > > |
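The restart Serge later describes, pulling the last runCA invocation out of the saved PBcR output and re-running it, can be sketched as below. The log name and the runCA command line written into it are hypothetical placeholders for whatever your own run printed:

```shell
# Stand-in for the captured PBcR command-line output.
printf 'correction done\nrunCA -s pacbio.spec -p asm -d PI440795_Self_Assembled asm.frg\n' > demo_pbcr.log

# Take the last runCA line; it should carry the library name as -d.
last_cmd=$(grep 'runCA' demo_pbcr.log | tail -n 1)
echo "$last_cmd"
# After inspecting it, execute it to re-create 9-terminator, e.g.:
#   eval "$last_cmd"
```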
From: Serge K. <ser...@gm...> - 2015-05-26 15:18:44
|
Hi, This was a bug fixed in CA 8.3rc2 (when the assembly of the corrected data failed, the restart did not work properly). Grep for runCA in your command line output from your run and re-run the last command (it should have the library name as the -d option). That will re-create the 9-terminator directory and corresponding files. Unless you install the missing perl package, the qc generation will still fail, but it only contains statistics on the assembly; the asm.utg.fasta file should be your complete assembly. Serge > On May 26, 2015, at 10:49 AM, Seth Munholland <mu...@uw...> wrote: > > Hello Everyone, > > I was running a PBcR through to assembly with nothing in my spec file except memory options since I share the server. I got all the way to step 9 (terminator) when I got the following error: > > ----------------------------------------START Tue May 26 02:18:24 2015 > /usr/bin/env perl /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl -euid /lore/bill.crosby.storage/PI440795/PI440795_Self_Assembled/9-terminator/asm.asm > Can't locate Statistics/Descriptive.pm in @INC (@INC contains: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18. > BEGIN failed--compilation aborted at /data/bill.crosby/apps/wgs-8.3rc1/Linux-amd64/bin/caqc.pl line 18. > ----------------------------------------END Tue May 26 02:18:24 2015 (0 seconds) > ERROR: Failed with signal INT (2) > The Cleaner has arrived. Doing 'none'. 
> ----------------------------------------END Tue May 26 02:18:24 2015 (1490 seconds) > > A Google search tells me that I can try manually running asmOutputFasta to try and make the missing output fasta (http://sourceforge.net/p/wgs-assembler/mailman/message/33260123/). When I try it the fasta files are only ~3MB. The same link warns that the asm may be incomplete and I might have to repeat step 9 in runCA; this is where I get stuck. > > I've renamed the 9-terminator folder to 9-terminator-old, but what is the command for runCA to pick up a PBcR run? I tried specifying the directory, prefix, and spec file and after changing to the hash memory options in my spec file I get: > > Failure message: > > no fragment files specified, and stores not already created > > Trying to rerun the PBcR command again gives: > > Error: requested to output PI440795_Self_Assembled.frg but file already exists. Will not overwrite. > > > Seth Munholland, B.Sc. > Department of Biological Sciences > Rm. 304 Biology Building > University of Windsor > 401 Sunset Ave. N9B 3P4 > T: (519) 253-3000 Ext: 4755 |
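The "missing perl package" behind the caqc.pl failure is Statistics::Descriptive from CPAN. A quick probe for it is sketched below; the `cpan -i` line in the comment is one common install route (it may need sudo or a local::lib setup) and only fixes the qc report, not the assembly itself:

```shell
# Probe for the module caqc.pl dies on; if absent, install it, e.g.:
#   cpan -i Statistics::Descriptive
if perl -MStatistics::Descriptive -e 1 2>/dev/null; then
    echo "Statistics::Descriptive present"
else
    echo "Statistics::Descriptive missing"
fi
```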