description

Anonymous
Attachments
test_tile14_base2.png (80885 bytes)

A Python script that produces color-coded quality plots from the read quality scores in a FASTQ file. Scores are averaged over bins of the reads' coordinates within each tile, which makes spatial quality patterns easy to spot.
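The binning step can be illustrated with a minimal sketch. It is not taken from the script itself: the function name binned_mean is hypothetical, and it assumes the x/y coordinates and one score per read have already been extracted from the FASTQ records.

```python
import numpy as np

def binned_mean(x, y, scores, ngrid=500):
    """Average per-read scores over an ngrid x ngrid grid of (x, y) bins.

    Hypothetical helper: x, y and scores are equal-length sequences,
    one entry per read.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # Sum of scores per bin (weights) and number of reads per bin,
    # both computed over the same bin edges.
    total, xedges, yedges = np.histogram2d(x, y, bins=ngrid, weights=scores)
    count, _, _ = np.histogram2d(x, y, bins=[xedges, yedges])
    # Empty bins give 0/0; let them become NaN so they can be masked later.
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count

# Four reads in two opposite corners of a tiny 2 x 2 grid.
grid = binned_mean([0, 0, 10, 10], [0, 0, 10, 10], [10, 20, 30, 40], ngrid=2)
```

An array like this could then be passed to matplotlib (for example imshow) with a color map to obtain a plot similar in spirit to the one the script writes.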

Use:

python tile_visual_quality.py fastq_file_name image_output_name [--ngrid int --tile int --base int --encoding str]

  1. Arguments in [] are optional. --ngrid controls the resolution, --tile selects the tile to be plotted, and --base selects the base position for which the score is calculated. Base positions range from 0 to k-1, where k is the read length.

  2. The default ngrid is 500. By default, all tiles are plotted.

  3. If no base position is specified, the median score across all bases of the read (e.g. 36) is used for every read.

  4. Encoding type: 'sanger', 'illumina1.0' or 'illumina1.3'. The default is 'illumina1.3'.

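  5. Requires the Python packages matplotlib and numpy, plus the modules argparse and sys (sys is part of the standard library, as is argparse from Python 2.7 onward).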
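As a rough illustration of items 3 and 4, the sketch below turns one FASTQ quality string into a single score per read. The names OFFSETS and read_score are hypothetical and not taken from the script. The ASCII offsets are the standard ones (33 for Sanger, 64 for both Illumina variants); note that 'illumina1.0' strings carry Solexa-scaled scores, which this sketch does not convert to Phred scores.

```python
import statistics

# ASCII offset per encoding name (names as accepted by --encoding).
OFFSETS = {"sanger": 33, "illumina1.0": 64, "illumina1.3": 64}

def read_score(quality, encoding="illumina1.3", base=None):
    """Return one score for a read from its FASTQ quality string.

    base=None -> median over all bases of the read;
    base=i    -> score at 0-based position i (0 .. k-1).
    """
    offset = OFFSETS[encoding]
    scores = [ord(ch) - offset for ch in quality]
    if base is None:
        return statistics.median(scores)
    return scores[base]

median_score = read_score("abc")                 # median over the three bases
third_base = read_score("abc", base=2)           # score of the last base only
sanger_score = read_score("IIII", encoding="sanger")
```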

Example:

python tile_visual_quality.py data.fastq tile.png --ngrid 500 --tile 10 --base 5 --encoding 'illumina1.3'

python tile_visual_quality.py example_fastq_file.txt test --base 2 --ngrid 200

The second command should produce a file named "test_tile14_base2.png" that looks like the attachment.
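For reference, the command-line interface described under "Use" could be declared with argparse roughly as follows. This is only a sketch under stated assumptions: the function name build_parser is hypothetical, and using None to mean "all tiles" and "median over all bases" is an illustrative choice, not necessarily how the script represents its defaults.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(prog="tile_visual_quality.py")
    parser.add_argument("fastq_file_name")
    parser.add_argument("image_output_name")
    parser.add_argument("--ngrid", type=int, default=500)
    # Assumption: None stands for "plot all tiles".
    parser.add_argument("--tile", type=int, default=None)
    # Assumption: None stands for "use the median across all bases".
    parser.add_argument("--base", type=int, default=None)
    parser.add_argument(
        "--encoding",
        choices=["sanger", "illumina1.0", "illumina1.3"],
        default="illumina1.3",
    )
    return parser

# Parse the second example command from above.
args = build_parser().parse_args(
    ["example_fastq_file.txt", "test", "--base", "2", "--ngrid", "200"]
)
```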

Links:

FASTQ format
Score encoding

