#47 Chastity filter and flags

Milestone: 1.0
Status: open
Owner: nobody
Labels: None
Updated: 2021-05-04
Created: 2021-05-04
Creator: Jordi Camps
Private: No

Reads from Illumina 1.8 or later carry a field in the header that marks whether the read was filtered out or not. This is known as the chastity filter. Usually, filtered reads are removed and not used, but sometimes they are still present in the FastQ files.

When using the reformat.sh tool to convert FastQ files to SAM files, there is a parameter that discards reads whose header contains ' 1:Y:' or ' 2:Y:'. But when the reads are not discarded, they are written to the SAM file and the chastity information is lost. This is a bug: the SAM format has a place to keep this information, and with the current implementation the information written there is wrong.

All reads whose chastity filter is 'Y' should have SAM flag 512 (0x200) set, which means "read fails platform/vendor quality checks". All other reads should have this flag cleared. This should also work in the opposite direction: a read with this flag set should be written to FastQ with a 'Y' in the header.
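To make the expected behaviour concrete, here is a minimal sketch of the mapping in both directions. It is written in Python only for illustration and is not taken from the reformat.sh source; the function names are hypothetical.

```python
# SAM FLAG bit for "not passing filters, such as platform/vendor quality controls".
QC_FAIL = 0x200  # decimal 512


def apply_chastity_to_flag(flag: int, is_filtered: str) -> int:
    """Return the SAM flag with bit 0x200 set for 'Y' and cleared otherwise.

    `is_filtered` is the second field of the FastQ header comment ('Y' or 'N').
    All other bits of `flag` are left untouched.
    """
    if is_filtered == "Y":
        return flag | QC_FAIL
    return flag & ~QC_FAIL


def chastity_from_flag(flag: int) -> str:
    """Return 'Y' if bit 0x200 is set in the SAM flag, else 'N' (SAM to FastQ)."""
    return "Y" if flag & QC_FAIL else "N"
```

For an unmapped first-in-pair read with flag 77, a 'Y' read would become 589 (77 + 512), and converting 589 back to FastQ would yield 'Y' again.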

Related to this bug, I have another comment. The documentation for the chastityfilter parameter says that it discards all reads with ' 1:Y:' or ' 2:Y:'. That is fine, but what happens to reads with other numbers, such as ' 3:Y:'? I have files that use this nomenclature, so it would be better to actually parse the fields and discard reads with a 'Y' in the second field, regardless of the value of the first field.
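A sketch of what parsing the fields could look like, again in Python for illustration only. It assumes the usual Illumina 1.8+ header layout, where the part after the first whitespace is `read:is_filtered:control:index`; the function name and the example headers are made up.

```python
from typing import Optional


def is_chastity_filtered(header: str) -> Optional[bool]:
    """Return True for 'Y', False for 'N', None if the field cannot be found.

    The read number (first field of the comment) is not inspected, so
    ' 1:Y:', ' 2:Y:' and ' 3:Y:' are all treated the same way.
    """
    parts = header.split(None, 1)  # read ID, then the comment after whitespace
    if len(parts) < 2:
        return None  # no comment, e.g. a non-Illumina header
    fields = parts[1].split(":")
    if len(fields) < 2 or fields[1] not in ("Y", "N"):
        return None  # comment does not follow the expected layout
    return fields[1] == "Y"


# Hypothetical headers, for illustration:
print(is_chastity_filtered("@M00001:12:000000000-A1B2C:1:1101:15589:1332 3:Y:0:ATCACG"))
print(is_chastity_filtered("@M00001:12:000000000-A1B2C:1:1101:15589:1332 1:N:0:ATCACG"))
```

Returning None for headers that do not match lets the caller decide whether to keep such reads unchanged, which seems safer than guessing.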
