
Quality filtering on minimum average quality

Jake
2022-09-16
2022-09-21
  • Jake

    Jake - 2022-09-16

    I've been using bbduk for quality filtering and pretty much took its behavior for granted. I took a closer look recently: I usually run with minavequality=15, which removes about 15-20% of our reads, while minavequality=30 removes almost 100% of them.

    However, when I compute the average quality by hand, almost all of my reads come out above 30. I take the ASCII value of each quality character, subtract 33, sum the results, and divide by the read length.

    How is bbduk computing the average quality?

     
  • Jake

    Jake - 2022-09-21

    It seems the average is computed as

    -10*math.log(sum([10**(-(ord(c)-33)/10) for c in line4])/len(line4),10)
    

    rather than just

    sum([ord(c)-33 for c in line4])/len(line4)
    

    So it's the quality score of the average error probability, not the average of the quality scores.
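For anyone who wants to check this on their own reads, here is a minimal sketch comparing the two averages. It assumes Phred+33 encoding and that the formula above really is what bbduk applies (I have not confirmed that against the source). The function names and the example quality string are made up for illustration.

```python
import math

PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding


def mean_of_scores(qual):
    """Arithmetic mean of the per-base Phred scores."""
    scores = [ord(c) - PHRED_OFFSET for c in qual]
    return sum(scores) / len(scores)


def score_of_mean_error(qual):
    """Phred score of the mean per-base error probability."""
    probs = [10 ** (-(ord(c) - PHRED_OFFSET) / 10) for c in qual]
    return -10 * math.log10(sum(probs) / len(probs))


# Hypothetical read: 90 bases at Q40 ('I') followed by 10 bases at Q2 ('#').
qual = "I" * 90 + "#" * 10

print("mean of scores:     ", round(mean_of_scores(qual), 2))
print("score of mean error:", round(score_of_mean_error(qual), 2))
```

For this made-up read the arithmetic mean is 36.2, while the probability-based value comes out at about 12, because the ten low-quality bases dominate the mean error probability. That would explain how a read can look like it averages above 30 by hand and still fail a threshold of 15.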

     
