Check success of illumina2srf

  • Hi Rob,

    Thanks for the new release. I really like the support for RTA data and indexing reads.
    I have been playing around with creating SRF files including the raw information (-b) and wanted to check if I was successful, especially with paired and indexed runs.

    I've tried to use the srf2fastq tool provided with the io_lib package, however I get the following error message:

    "Zero or greater than one CNF chunks found."

    I also get this message with the SRF files generated with the GA-Pipeline v.1.4 SRF_ARCHIVE_REQUIRED option. Don't know if I'm doing something wrong.

    Do you know the best way to test if the generated SRF files are valid?


  • Rob Davies

    Hello, Deniz.

    This is srf2fastq not being very friendly.  You need to use the '-c' option to tell it to read the 'calibrated' confidence values from the SRF file (i.e. the ones originally stored in the qseq.txt files).

    Run 'srf2fastq -h' to get a list of the other options that it understands.
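
    As a rough way to check a generated SRF file, one could convert it with the '-c' option and confirm that the output parses as well-formed fastq. The sketch below is only an illustration: the function names are hypothetical, and it assumes srf2fastq is on the PATH and writes fastq records to standard output. It only checks the record structure, not whether the reads themselves are what you expect.

```python
import subprocess
from typing import Iterable


def count_valid_fastq_records(lines: Iterable[str]) -> int:
    """Count 4-line fastq records; raise ValueError on the first malformed one."""
    count = 0
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        try:
            seq = next(it).rstrip("\n")
            plus = next(it).rstrip("\n")
            qual = next(it).rstrip("\n")
        except StopIteration:
            raise ValueError(f"truncated record after {count} records") from None
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"bad record markers near record {count + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        count += 1
    return count


def check_srf(path: str) -> int:
    """Convert an SRF file with 'srf2fastq -c' and count the records it yields."""
    # Assumption: srf2fastq is installed and prints fastq to stdout.
    proc = subprocess.run(
        ["srf2fastq", "-c", path],
        capture_output=True, text=True, check=True,
    )
    return count_valid_fastq_records(proc.stdout.splitlines())
```

    Comparing the returned count with the number of reads expected from the run (and from each read of a paired or indexed run) gives a basic sanity check.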


  • Hi Rob,

    Thanks for the reply. I had missed the -c option; it's working for me now.



  • Anonymous

    Hi Denizk,
    Do you get fastq with proper qualities using srf2fastq?
    Also, how do you produce srf files?
    Apparently srf2fastq is trying to convert scores in a weird way… I believe it's converting to Sanger format while assuming the input is in Solexa format, or worse (i.e. scores like 'b' become 'D').

  • Rob Davies

    The quality values produced by srf2fastq are correct.

    SRF is a binary format, so it stores the actual quality values as integers, along with a flag to say whether they are on the (now very obsolete) log-odds scale or on the phred scale.  srf2fastq takes this data and outputs it as a Sanger-formatted fastq file (i.e. phred+33).
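
    To illustrate what the phred+33 encoding means, here is a minimal sketch of turning integer quality values into Sanger fastq characters. It is not srf2fastq's actual code; the function names are made up for this example, and the log-odds conversion uses the standard formula Q_phred = 10 * log10(10^(Q_solexa/10) + 1).

```python
import math
from typing import Iterable


def solexa_to_phred(q_solexa: int) -> int:
    """Convert a log-odds (Solexa) quality to the nearest phred quality."""
    return round(10 * math.log10(10 ** (q_solexa / 10) + 1))


def to_sanger(quals: Iterable[int], log_odds: bool = False) -> str:
    """Encode integer qualities as a Sanger (phred+33) fastq quality string.

    Set log_odds=True if the integers are on the old log-odds scale.
    """
    if log_odds:
        quals = [solexa_to_phred(q) for q in quals]
    return "".join(chr(q + 33) for q in quals)


print(to_sanger([0, 10, 40]))  # prints !+I
```

    Under this scheme a phred quality of 34 is written as 'C' (34 + 33 = 67), whereas the older Illumina phred+64 text encoding would write the same value as 'b' (34 + 64 = 98). The characters differ, but the underlying quality value is the same.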

    For information on fastq encodings, see the Wikipedia fastq page, or the Nucleic Acids Research article on the fastq format (doi:10.1093/nar/gkp1137).