Whole-Genome Shotgun Assembler / Bugs / #302 PBcR self-corrected assembly

Sergey Koren - 2015-04-13

The most likely reason is that the corrected sequences ended up with too little coverage to produce a good assembly. Have you checked how much coverage the corrected sequences contain and how their average length compares with that of your input sequences?
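One way to run that check is sketched below in Python. It assumes the corrected reads are in a FASTA or a four-line-per-record FASTQ file; the function names are hypothetical, not part of wgs-assembler. Coverage is taken as total bases divided by genome size.

```python
import io


def read_lengths(handle):
    """Yield the length of every sequence in a FASTA or FASTQ text stream."""
    first = handle.readline()
    if not first:
        return
    if first.startswith(">"):
        # FASTA: a sequence may be wrapped over several lines.
        length = 0
        for line in handle:
            if line.startswith(">"):
                yield length
                length = 0
            else:
                length += len(line.strip())
        yield length
    elif first.startswith("@"):
        # FASTQ: assumes exactly four lines per record (no wrapped sequences).
        while True:
            sequence = handle.readline()
            handle.readline()  # '+' separator line
            handle.readline()  # quality line
            yield len(sequence.strip())
            header = handle.readline()
            if not header.strip():
                break
    else:
        raise ValueError("input does not look like FASTA or FASTQ")


def summarize(lengths, genome_size):
    """Return read count, total bases, mean read length and fold coverage."""
    lengths = list(lengths)
    total = sum(lengths)
    return {
        "reads": len(lengths),
        "total_bases": total,
        "mean_length": total / len(lengths) if lengths else 0.0,
        "coverage": total / genome_size,
    }


# Tiny in-memory example; the two reads are made up for illustration.
demo = io.StringIO(">read1\nACGTAC\n>read2\nACGTACGT\n")
print(summarize(read_lengths(demo), genome_size=7))
```

For real data, pass an open file handle instead of the in-memory example and the expected genome size in bases (39000000 for the genome in this ticket). Running it on both the input reads and the corrected reads gives the comparison asked about above.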

There are a couple of things to check. First, in runPartition.sh in your temporary folder, look for text that says either falcon_sense or pbdagcon. For coverage as low as yours you want pbdagcon, so if runPartition.sh doesn’t say pbdagcon, make sure SMRTportal is installed and in your path, then re-run the pipeline specifying -sensitive on the PBcR command line (see http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR#Low_Coverage_Assembly). Make sure you also specify the genome size on the PBcR command line. If that doesn’t improve your assembly, or it was already using pbdagcon, the linked page has some guidance on adjusting parameters for low-coverage datasets that you can try.
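The runPartition.sh check can also be scripted. The Python sketch below only searches the script text for the two tool names; the helper name and the sample text are hypothetical, and the real contents of runPartition.sh will differ.

```python
def consensus_tools(script_text):
    """Return the consensus tool names that occur in the given script text."""
    names = ("falcon_sense", "pbdagcon")
    lowered = script_text.lower()
    return [name for name in names if name in lowered]


# Made-up stand-in for the script contents, for illustration only.
sample = "echo running consensus\n$bin/falcon_sense --placeholder\n"
print(consensus_tools(sample))
```

To use it on a real run, read the contents of runPartition.sh from the temporary folder and pass them in. If the result lists falcon_sense but not pbdagcon, the run did not use the slower, more sensitive consensus step described above.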

On Apr 12, 2015, at 6:08 AM, Mark markcharder@users.sf.net wrote:

[bugs:#302] http://sourceforge.net/p/wgs-assembler/bugs/302 PBcR self-corrected assembly

Status: open
Group: scaffolder
Labels: PBcR MHAP self-correct PacBio
Created: Sun Apr 12, 2015 10:08 AM UTC by Mark
Last Updated: Sun Apr 12, 2015 10:08 AM UTC
Owner: nobody

Hi all,

I have just used PBcR in wgs-8.3 to self-correct and assemble some PacBio reads.
The reads I assembled were at 36x coverage. The genome I am assembling is a haploid genome which is known to be approx 39 Mb and has around 7% repetitive DNA.

The .spec file I used contained the following:

ovlHashBits = 25
ovlHashBlockLength = 180000000
ovlMemory = 20

The program ran successfully without producing any errors. The problem is that the assembly is approximately 9 Mb long and made up of approx 350 scaffolds - much smaller than the expected 39 Mb. I was just wondering if there are any parameters that can be altered to improve the assembly length?

Kind regards,
Mark Derbyshire


Mark - 2015-04-16

Thank you very much for the advice. I am now running with altered parameters, i.e. specifying -sensitive and providing the genome size.
However, this time the process has been running for around 60 hours, as compared to the 20 hours it took before.
wgs has created a lot more files in the temporary folder, probably close to 200.

Some examples include:
1.err
1.lay.err
asm.100.log

Is this correct? I am running on an 8 core node with 20 GB available RAM. Is this enough?
Is it possible to run this without SMRT portal in my environment? It is not installed on the server I am running on. I have a lot of SMRT modules and have been able to locally install individual modules that are necessary for other things.

Sorry for all the questions, I don't have a lot of expertise myself and am not in contact with anyone who uses this software extensively.

Regards,
Mark

- Sergey Koren - 2015-04-16
  
  It should create those files either with or without -sensitive (the number is controlled by the partitions setting on your command line). However, with -sensitive it uses PBDAGCON instead of falcon_sense. PBDAGCON is significantly slower, so it’s likely that falcon_sense ran so fast you didn’t notice the temporary files being created and removed. That would also account for the longer runtime compared with the previous run.
  
  Serge
  
