Bug: Windows line endings

2013-05-23
  • It appears that v4.5 does not correctly handle Windows line endings, which contain carriage returns. After compression and decompression, an 'N' is added to the sequence and an '!' to the quality string. A carriage return is kept at the end of the sequence ID line but is absent from the optional read ID line that precedes the quality string.

    Cheers,
    Nathan

     
  • James Bonfield
    2013-05-23

    Oops. Thanks for the bug report Nathan.

    I'll investigate the issue.

     
  • James Bonfield
    2013-05-23

    What I'm thinking is to auto-detect whether there are '\r' characters in the file and to set a flag in the header so they are reproduced on output. This means you get out exactly what you put in, but you can't encode on Windows and extract in Unix format, or vice versa.

    Is that sufficient?

     
  • James Bonfield
    2013-05-23

    I reverted that decision in the end after experimenting with it in code; it was not very clean.

    Now it simply reads \r\n or \n line endings and works regardless (provided the file doesn't change format midway). Extraction is always in Unix-style \n format, though, with no option for Windows format.

    I also fixed a checksum error that occurred when read names are present after the + line.

    See the uploaded 4.6 tarball. This is binary compatible with the 4.5 output.
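
For readers who want to see the idea of accepting either \r\n or \n on input and always writing \n on output, here is a minimal Python sketch. It is not the 4.6 code; the function names `read_fastq` and `write_fastq` are hypothetical, and it assumes simple four-line FASTQ records read from a binary stream.

```python
import io


def read_fastq(handle):
    """Yield (name, seq, plus, qual) tuples from a binary FASTQ stream.

    Trailing "\r\n" or "\n" is removed from every line, so Windows and
    Unix files give identical records. Assumes four-line records.
    """
    while True:
        lines = [handle.readline() for _ in range(4)]
        if not lines[0]:
            return  # clean end of file
        if not lines[3]:
            raise ValueError("truncated FASTQ record")
        # rstrip drops any trailing carriage return and/or newline bytes
        yield tuple(line.rstrip(b"\r\n") for line in lines)


def write_fastq(records, handle):
    """Write records back out, always with Unix "\n" line endings."""
    for record in records:
        for field in record:
            handle.write(field + b"\n")


if __name__ == "__main__":
    windows = b"@r1\r\nACGT\r\n+r1\r\nIIII\r\n"
    out = io.BytesIO()
    write_fastq(read_fastq(io.BytesIO(windows)), out)
    print(out.getvalue())
```

Opening the stream in binary mode matters here: Python's text mode would translate line endings itself and hide the carriage returns the sketch is meant to handle.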