#90 gap5: quality values for gap inserts on contig join is 0

closed-fixed
Gap5 (15)
5
2011-04-18
2011-04-01
Bastien Chevreux
No

Hi James,

I just came across something I would call a bug: assume two contigs get joined and the join introduces gaps in the reads of one contig. Gap5 then gives the inserted gaps a quality value of 0, which is extremely dangerous, if not outright wrong, when the consensus is afterwards calculated by taking quality values into account.

Here's what happened to me: in a join, the end of a contig with two erroneous bases got joined to the middle of another contig with hundreds of correct Solexa bases, like this:

...ccgttAcgtg
...ccgttAcgtg
...ccgtt*cgtgtgactgac
...ccgtt*cgtgtgactgac...
...ccgtt*cgtgtgactgac...
(hundreds more)

When calculating the consensus again in MIRA (which takes quality values into account), the two "A" bases had a quality so much larger (upper 30s) than the gap bases (which had 0) that the consensus algorithm took both the "A" bases and the gap bases into account when building the consensus. Under normal circumstances my algorithms would have given an IUPAC base as consensus, but as there is no IUPAC code for "base or gap", the base wins out and things go awfully wrong.
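To make the failure mode concrete, here is a deliberately simplified toy in Python. It is not MIRA's consensus algorithm; it only sums qualities per symbol in one alignment column and picks the largest total. The function name `weighted_call`, the read count of 300 and the pad quality of 30 are assumptions chosen for illustration.

```python
from collections import defaultdict


def weighted_call(column):
    """Toy rule (an assumption, not MIRA's method): return the symbol
    whose summed quality is largest in one alignment column.

    column is a list of (symbol, quality) pairs; '*' denotes a gap/pad.
    """
    totals = defaultdict(int)
    for symbol, quality in column:
        totals[symbol] += quality
    return max(totals, key=totals.get)


# Two erroneous 'A' bases with quality 38, plus 300 reads carrying a pad.
pads_at_zero = [("A", 38)] * 2 + [("*", 0)] * 300
pads_at_thirty = [("A", 38)] * 2 + [("*", 30)] * 300

print(weighted_call(pads_at_zero))
print(weighted_call(pads_at_thirty))
```

With pads at quality 0, the pads contribute nothing to the sum, so the two "A" bases carry the column no matter how many reads disagree. Once the pads carry a realistic quality, the majority decides.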

Is there anything that speaks against inserting gap bases with a quality calculated as the average of the qualities of the neighbouring bases?
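As a minimal sketch of the proposed rule (not Gap5 code), the Python function below inserts a pad into a read and assigns it the integer average of the two flanking qualities. The name `insert_pad_average`, the list-based read representation and the handling of read ends (use the single available neighbour, or 0 if the read is empty) are assumptions made for the example.

```python
def insert_pad_average(bases, quals, pos, pad="*"):
    """Insert a pad before index pos and give it the average quality
    of its neighbours (hypothetical helper, not part of Gap5).

    bases: string of base calls; quals: list of integer qualities.
    Returns a new (bases, quals) pair; the inputs are not modified.
    """
    # Collect the qualities to the left and right of the insertion point.
    # At a read end only one neighbour exists, so only that one is used.
    neighbours = quals[max(pos - 1, 0):pos] + quals[pos:pos + 1]
    pad_q = sum(neighbours) // len(neighbours) if neighbours else 0
    new_bases = bases[:pos] + pad + bases[pos:]
    new_quals = quals[:pos] + [pad_q] + quals[pos:]
    return new_bases, new_quals


bases, quals = insert_pad_average(
    "ccgttcgtg", [30, 32, 34, 36, 38, 20, 35, 33, 31], 5
)
print(bases)
print(quals)
```

In this example the pad lands between qualities 38 and 20 and therefore receives 29 instead of 0.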

Best,
Bastien

Discussion

  • James Bonfield
    2011-04-05

    Agreed this sounds like a bug. I'll investigate.

     
  • James Bonfield
    2011-04-18

    See http://staden.svn.sourceforge.net/viewvc/staden?revision=2470&view=revision

    Instead of the average I implemented the minimum of the two surrounding bases. It's something I've been wanting to do for years, but I feared the subtle changes it would cause for our local project-checking team. MIN seems like a better function than the average, though.
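The difference between the two rules can be sketched in a few lines of Python. This is an illustration only, not the code from the revision linked above; the function names `pad_quality_min` and `pad_quality_mean` and the example quality values are assumptions.

```python
def pad_quality_min(left_q, right_q):
    """Pad quality as the minimum of the two surrounding base qualities."""
    return min(left_q, right_q)


def pad_quality_mean(left_q, right_q):
    """Pad quality as the integer average of the two surrounding qualities."""
    return (left_q + right_q) // 2


# Compare both rules for a well-supported site and for a site where
# one neighbour is of low quality (values are made up for illustration).
for left_q, right_q in [(38, 36), (38, 4)]:
    print(left_q, right_q,
          pad_quality_min(left_q, right_q),
          pad_quality_mean(left_q, right_q))
```

When both neighbours are good, the two rules give nearly the same value. When one neighbour is poor, the minimum keeps the pad no more trusted than its weakest neighbour, whereas the average lifts it well above that. This is one possible reading of why MIN may be the more conservative choice; the thread itself does not spell out the reasoning.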

     
  • James Bonfield
    2011-04-18

    • status: open --> closed-fixed