From: Thorsten B. <tho...@do...> - 2011-05-27 20:14:09
|
Hi Alan! >>> [problem with different line endings in *nix/Windows] >> >> I'm no expert in C, but the problem to me seems to be somewhere in the >> line-reading code of cmap0_palette_read. After checking the input file >> for existence, the contents of that file are read. First, a line with >> the number of colors to be defined is expected. That line (although also >> ending with "incorrect" CRLF) is read without a problem - otherwise we >> would have seen the error message "Unrecognized cmap0 header". >> >> A loop over the number of colors follows. Within this loop each line is >> analyzed - however, a line is read here using fgets instead of fscanf as >> before (and reading at most 29 characters). The last character of the >> data read is then replaced by NUL which C needs to terminate strings >> correctly. >> If the line ending is just 0x0A this will work correctly, as it also >> will, if the line has no ending (because 29 characters have been read) - >> however, if the line ending is 0x0D0A the 0x0D will remain in the string >> read. I'd suggest this to be checked by someone with deeper C experience - especially with experience regarding the caveats of C. Had it come to Ada, I would have been able to give advice, but not for C. >> This string is then checked for either being seven characters long or >> being longer than 9 characters - everything else results in the error we >> saw. This explains, why the column added by me to the cmap0_default.pal >> file seemed to fix the problem. >> >> To me, there seem to be several ways to correct the line-ending problem >> here: >> >> 1) According to >> http://pubs.opengroup.org/onlinepubs/009695399/functions/fgets.html >> fgets, if returning without an error, adds a NUL character already to >> the string. So, to me it seems unnecessary to replace the last >> character by NUL. OTOH, it might not hurt, though. >> >> 2) Wouldn't it be more safe to check, whether the last characters of >> the string are 0x0A, 0x0D or any combination of them and replace each >> of these characters by a Null-byte? That should not hurt in any case >> and would involve just changing line 1300 of file plctrl.c >> >> from >> >> <code> >> color_info[strlen( color_info ) - 1] = '\0'; /* remove return character >> */ >> </code> >> >> to something like (may not be correct C, but the idea should be >> understandable) >> >> <code> >> for (my_local_i = 0 ; my_local_i < 30; my_local_i++) >> { >> /* remove return character(s) */ >> if (color_info[my_local_i] == 0x0A) || >> (color_info[my_local_i] == 0x0D) >> { >> color_info[my_local_i] = 0x00; >> } >> } >> </code> >> >> 3) Usage of fscanf instead of fgets and sscanf. However, this might >> bring up new problems. > > I believe we need the fgets/sscanf combination since that allows the > code to reread the line for the case of alternate formats. I doubt this, because there is no re-reading of a line in case of an error. If something is wrong with the current line, an error flag is set and the "line reading loop" is immediately left. At least within the routine reading cmap0 (cmap0_palette_read) files this is the case. > So I think idea (2) is the way to go to fix this issue. But nonetheless I do also think that (2) is the best way to solve this issue. > Since it turns out that indeed Windows tarball unpackers can convert > Unix line endings to Windows line endings, this is an accident that > has obviously happened before (the historical posting about this issue > which has been referred to previously in this thread), which has > happened now to Thorsten, and which is waiting to happen to others. > Therefore, it is important to fix the file-reading code so that *.pal > files can have either Unix or Windows line endings. I have spent more time looking at this source file and while it seems easy to fix this issue for cmap0 files, it might be more difficult to fix this for cmap1 files. These are read in routine c_plspal1, which as fas as may understanding of C goes, has no re-reading loop, too (this needs to be verified!). In this routine, a NUL character is written to the string read only in a single case ("Old tk file format") - to assure that strings longer than 160 characters are correctly terminated. I wonder why this isn't done for newer file formats. Anyway: The fix I suggested (i.e. checking the string character by character for 0x0A or 0x0D) should work here as well. > Does one of our developers with access to Windows want to take > responsibility for fixing this (or evaluating Thorsten's patch if he > goes further with his idea 2?). An obvious test of the updated code > (and something we should do anyway once the issue is fixed) would be > the change the svn properties of the *.pal files to native so they > will be checked out on Windows with the Windows line endings. I could patch my sources here (on Monday, when I'm back in the office, that is) and test this on my windows system using the automatically "corrected" line endings by WinZIP, if that helps. > Thanks, Thorsten, for being persistent about this issue so we finally > discovered the exact source of this nagging problem. You're welcome. -- Gruß, Thorsten |