shane hogle - 2019-03-13

I spent about the last hour dealing with nearly the same cryptic error message, and I think it comes down to how the fastq headers are parsed when removing optical duplicates.

From the manual:

optical=f If true, mark or remove optical duplicates only. This means they are Illumina reads within a certain distance on the flowcell. Normal Illumina names needed. Also for tile-edge and well duplicates.

I was trying to remove duplicates from an NCBI SRA download that didn't have the original Illumina header names/formatting. Once I ran the same step on raw reads with Illumina-formatted headers that I'd gotten directly from the sequencing center, the error message went away.

I didn't do any other testing, so I could be wrong here, but the explanation makes sense: Illumina headers carry the lane, tile, and x/y cluster coordinates, which is the flowcell position information that optical duplicate removal relies on. Hopefully this helps someone else.
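
If you want to check your headers before running with optical=t, something like the following might help. It is only a rough sketch: it assumes Casava 1.8+ style names (`@instrument:run:flowcell:lane:tile:x:y ...`), and the regex and the function name `illumina_coords` are my own, not how BBTools actually parses headers. The example read names are made up.

```python
import re

# Assumed layout (Casava 1.8+): @instrument:run:flowcell:lane:tile:x:y [read:filtered:control:index]
# This pattern is an illustration only, not the parser BBTools uses.
ILLUMINA_HEADER = re.compile(
    r"^@?[^:\s]+:\d+:[^:\s]+:(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+)"
)

def illumina_coords(header):
    """Return (lane, tile, x, y) if the header looks like Illumina format, else None."""
    m = ILLUMINA_HEADER.match(header.strip())
    if m is None:
        return None
    return tuple(int(m.group(k)) for k in ("lane", "tile", "x", "y"))

# Made-up header in Illumina format: coordinates are recoverable.
print(illumina_coords("@M00123:45:000000000-ABCDE:1:1101:15589:1332 1:N:0:1"))

# Made-up header in the renamed SRA style: no tile or x/y information.
print(illumina_coords("@SRR0000000.1 1 length=100"))
```

If the function returns None for the first few reads of your file, the names probably don't carry flowcell coordinates in this layout, which would fit the error I was seeing.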

see also - https://github.com/BioInfoTools/BBMap/issues/15#issuecomment-472574978