fqzcomp / Discussion / General Discussion: Bug: Windows line ending

Bug: Windows line ending

Forum: General Discussion

Creator: Nathan S. Watson-Haigh

Created: 2013-05-23

Updated: 2013-05-23

Nathan S. Watson-Haigh - 2013-05-23

It appears that v4.5 does not correctly handle Windows line endings, which contain carriage returns. After compression and decompression, an 'N' is added to the sequence and an '!' to the quality string. A carriage return is kept at the end of the sequence ID line but is absent from the optional read ID line that precedes the quality string.

Cheers,
Nathan

James Bonfield - 2013-05-23

Oops. Thanks for the bug report Nathan.

I'll investigate the issue.

James Bonfield - 2013-05-23

What I'm thinking is to auto-detect whether there are '\r' characters in the file and to set a flag in the header so that they are reproduced on output. This means you get out exactly what you put in, but you can't encode a Windows-format file and extract it in Unix format, or vice versa.

Is that sufficient?
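
To make the proposal concrete, here is a minimal Python sketch of the detect-and-flag idea. It is not fqzcomp's code: the function names `detect_crlf` and `restore_line_ends` are hypothetical, and it assumes the whole file uses the same line ending as its first line.

```python
def detect_crlf(data: bytes) -> bool:
    # Hypothetical helper: report whether the first line ends in CR LF.
    # Assumes the rest of the file uses the same convention.
    newline = data.find(b"\n")
    return newline > 0 and data[newline - 1:newline] == b"\r"


def restore_line_ends(unix_text: bytes, crlf_flag: bool) -> bytes:
    # Hypothetical helper: put the CR back before every LF on output
    # when the flag recorded at encode time says the input had them.
    return unix_text.replace(b"\n", b"\r\n") if crlf_flag else unix_text


record = b"@read1\r\nACGT\r\n+\r\nIIII\r\n"
flag = detect_crlf(record)                 # would be stored in the header
stripped = record.replace(b"\r\n", b"\n")  # what the codec would then compress
assert restore_line_ends(stripped, flag) == record
```

With a flag like this the output matches the input byte for byte, which is also why converting between formats is not possible: the decoder only ever reproduces what the encoder saw.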

James Bonfield - 2013-05-23

I reverted that decision in the end after experimenting with it in code - not very clean.

Now it simply reads \r\n or \n line ends and works regardless (provided the file doesn't change format midway). Extraction is always in \n Unix-style format, though, with no option for Windows format.

I also fixed a checksum error that occurred when read names exist after the + line.

See the uploaded 4.6 tarball. This is binary compatible with the 4.5 output.
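
For readers who want to see the behaviour described above in code, here is a rough Python sketch of a reader that accepts either line ending and always writes Unix-style output. It is an illustration only, not the 4.6 implementation; `read_lines_any_ending` and `normalise_to_unix` are hypothetical names.

```python
import io


def read_lines_any_ending(stream):
    # Yield each line without its terminator, accepting LF or CR LF.
    for raw in stream:
        line = raw.rstrip(b"\n")
        if line.endswith(b"\r"):
            line = line[:-1]  # drop the carriage return so it is not treated as data
        yield line


def normalise_to_unix(data: bytes) -> bytes:
    # Re-emit every line with a plain LF, whatever the input used.
    # Note: a final line with no terminator gains one in this sketch.
    lines = read_lines_any_ending(io.BytesIO(data))
    return b"".join(line + b"\n" for line in lines)


windows_record = b"@read1\r\nACGT\r\n+\r\nIIII\r\n"
unix_record = b"@read1\nACGT\n+\nIIII\n"
assert normalise_to_unix(windows_record) == unix_record
assert normalise_to_unix(unix_record) == unix_record
```

Stripping the carriage return at read time means it never reaches the sequence or quality data, at the cost of not being able to reproduce Windows line endings on extraction.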
