#5 New Blasr

1.0
closed
nobody
None
2017-01-12
2016-11-08
No

Need to support new blasr's different args.

Discussion

  • Afif Elghraoui

    Afif Elghraoui - 2016-12-12

    We're expecting to ship blasr 5 in the next Debian release, so I've prepared a patch to PB suite (attached) so that they can be co-installed. I haven't been able to test this yet since I first need to build blasr with pbbam (which is required for the --sam option you're using in some cases here), but I'm sending you the patch now so that the work doesn't get redone inadvertently. Maybe you can test it before I do.

     
  • Afif Elghraoui

    Afif Elghraoui - 2016-12-12

    I should note that this patch is against the current release (15.8.24).

     
  • Afif Elghraoui

    Afif Elghraoui - 2016-12-23

    I haven't been able to test this properly since blasr (5.3) can't generate sam/bam output if input is fasta or fastq. See https://github.com/PacificBiosciences/blasr/issues/312

    In my testing with the example data, I'm currently getting:

    2016-12-23 08:09:49,660 [INFO] Running Blasr
    2016-12-23 08:09:50,161 [ERROR] blasr mapping failed!
    2016-12-23 08:09:50,162 [ERROR] RETCODE 255
    2016-12-23 08:09:50,162 [ERROR] STDOUT [INFO] 2016-12-23T08:09:49 [blasr] started.
    ERROR, can not convert non-pacbio reads to pbbam record.
    
    2016-12-23 08:09:50,162 [ERROR] STDERR None
    2016-12-23 08:09:50,163 [ERROR] Exiting
    
     
  • Afif Elghraoui

    Afif Elghraoui - 2016-12-24

    It turns out that the error is triggered when the intermediate fastq files are mapped, because their headers are not in a PacBio-compatible format. See https://github.com/PacificBiosciences/blasr/issues/312#issuecomment-269055488

    The problem in pbsuite/honey/bampie.py is here:

                loc = mateplace + ":" + str(pos)
                fout.write("@%s_%d%s%d%s\n%s\n+\n%s\n" % (read.qname, \
                           maq, tai, strand, loc, seq, qal))
    

    Appending read mapping information to read.qname causes blasr to treat these reads as not PacBio-compatible. If the underscore here is changed to a space, we can get past this step, but then the later steps fail because the resulting SAM file doesn't have the information that pbhoney expects to find in the sequence IDs.
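
    To illustrate the trade-off described above, here is a minimal sketch. It is not the pbhoney code or the eventual patch. The helper names (`make_header`, `sam_qname`) and the sample values are hypothetical, and it assumes the aligner keeps only the text before the first whitespace as the read name, which is the common convention for SAM output.

```python
def make_header(qname, maq, tai, strand, loc, sep="_"):
    """Build a FASTQ header in the style of the bampie.py snippet above.

    `sep` is the character placed between the original read name and the
    appended mapping information ("_" in the original code).
    """
    return "@%s%s%d%s%d%s" % (qname, sep, maq, tai, strand, loc)


def sam_qname(header):
    """Read name as it would appear in SAM output.

    Assumption: the aligner truncates the header at the first whitespace.
    """
    return header[1:].split()[0]


# Hypothetical values, chosen only for illustration.
qname = "m130101_000000_42141_c1_s1_p0/7/0_1000"
fields = dict(maq=200, tai="p", strand=1, loc="chr1:12345")

underscored = make_header(qname, sep="_", **fields)
spaced = make_header(qname, sep=" ", **fields)

# With "_" the mapping info becomes part of the read name itself,
# so the name no longer looks like a plain PacBio read ID.
print(sam_qname(underscored))

# With " " the read name stays intact, but the mapping info is dropped
# from the name that ends up in the SAM file.
print(sam_qname(spaced))
```

    With the underscore, the appended fields travel into the SAM read name; with the space, the read name is unchanged but the appended fields are no longer part of it.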

    I don't think it's good that blasr expects the read IDs in a very specific format. On the other hand, rewriting a subset of the fastq file and changing the sequence IDs to hold mapping information seems to be a roundabout approach.

    I think blasr has just been stabilized for the next smrtanalysis release, while the necessary changes to pbhoney will be more intrusive than expected. If we don't get this fixed to work with blasr 5 within the next week or so, pbsuite will be excluded from the next Debian release. If this happens, we still have sniffles as an alternative, but I'd like both to be there. I think pbjelly is also affected.

     
  • Afif Elghraoui

    Afif Elghraoui - 2016-12-29

    To continue the monologue, a simple solution occurred to me and I was able to fix this. Here are two patches that will fix the blasr 5 incompatibility for both pbhoney and pbjelly. blasr must be built with pbbam in order to be used with pbhoney because sam/bam output is required. These patches are applied in the Debian/Ubuntu package.

     

    Last edit: Afif Elghraoui 2016-12-29
  • Adam English

    Adam English - 2017-01-12

    Thanks for your work on this!
    I've incorporated the patches and a few other spots where the double-dash was needed.
    I've got a number of changes I'm hoping to develop in about a month, so I'll have full testing and a new version up in late Q1 2017.
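
    As a rough illustration of the double-dash change mentioned above, the following sketch rewrites single-dash long options into double-dash form. It is not the patch that was applied. The function name `to_double_dash` is hypothetical, and apart from `--sam` (mentioned earlier in this thread) the option names in the example are assumptions that should be checked against `blasr --help` for the installed version.

```python
def to_double_dash(args):
    """Rewrite single-dash long options (e.g. -sam) as double-dash (--sam).

    Tokens that already start with "--", one-letter options such as "-h",
    and negative numbers such as "-10" are left untouched.
    """
    out = []
    for tok in args:
        is_single_dash_long = (
            tok.startswith("-")
            and not tok.startswith("--")
            and len(tok) > 2
            and tok[1].isalpha()
        )
        out.append("-" + tok if is_single_dash_long else tok)
    return out


# Illustrative command line; file names and most option names are assumptions.
old_cmd = ["blasr", "reads.fastq", "ref.fasta",
           "-sam", "-nproc", "4", "-bestn", "1", "-out", "aln.sam"]
new_cmd = to_double_dash(old_cmd)
print(" ".join(new_cmd))
```

    A blanket rewrite like this only covers the dash style; any options that were renamed or removed between blasr versions would still need to be handled case by case.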

     
  • Adam English

    Adam English - 2017-01-12
    • status: open --> closed
     
