From: Earnie B. <ear...@ya...> - 2003-01-31 19:01:33
|
I've just uploaded a snapshot of my port of the newest version of GNU Bison to the MinGW project for download. You can get it at http://prdownloads.sf.net/mingw/bison-1.875.0-2003.01.31-1.exe for evaluation. Have Fun, Earnie. |
From: Earnie B. <ear...@ya...> - 2003-02-10 22:29:54
|
I've just uploaded a snapshot of my port of the newest version of GNU Bison to the MinGW project for download. You can get it at http://prdownloads.sf.net/mingw/bison-1.875.0-2003.02.10-1.exe for evaluation. This attempts to resolve a reported issue where the installation prefix isn't c:\mingw. Have Fun, Earnie. |
From: Luke D. <cod...@ho...> - 2003-02-01 09:07:57
|
I tried building this to examine the text vs. binary issue you mentioned in the release notes, but most of the testsuite (e.g. beginning with test 3) fails because it can't find m4. The problem is that config.h contains: #define M4 "/bin/m4" so of course _spawnvp() fails. I haven't downloaded your binary but I was just wondering what configure option (or whatever) you used to get around this? Ideally bison should build for MinGW without requiring any configure options but I don't know if this is your goal. Luke ----- Original Message ----- From: "Earnie Boyd" <ear...@ya...> To: "MinGW Users" <min...@li...> Sent: Saturday, February 01, 2003 3:01 AM Subject: [Mingw-users] Snapshot: bison-1.875.0-2003.01.31-1.exe > I've just uploaded a snapshot of my port of the newest version of GNU > Bison to the MinGW project for download. You can get it at > http://prdownloads.sf.net/mingw/bison-1.875.0-2003.01.31-1.exe for > evaluation. > > Have Fun, > Earnie. |
From: Earnie B. <ear...@ya...> - 2003-02-01 10:23:15
|
Luke Dunstan wrote: > I tried building this to examine the text vs. binary issue you mentioned in > the release notes, but most of the testsuite (e.g. beginning with test 3) > fails because it can't find m4. The problem is that config.h contains: > > #define M4 "/bin/m4" > > so of course _spawnvp() fails. I haven't downloaded your binary but I was > just wondering what configure option (or whatever) you used to get around > this? Ideally bison should build for MinGW without requiring any configure > options but I don't know if this is your goal. > I cheated and modified the config.h by hand. You can also set M4 environment variable to point to c:/msys/1.0/bin/m4 or perhaps even just m4 (Hmm, maybe the define shouldn't be absolute). I agree that it should build without requiring modification. I'm not sure what to modify though, perhaps autoconf, perhaps bash? BTW, if you add a -w to the diff command found in the testsuite script in the tests directory, you'll see two failing test cases, otherwise there are more. The MSYS diff treats \r as white space, that's how I knew I had missed some output modes to binary. Earnie. |
From: Luke D. <cod...@ho...> - 2003-02-01 15:26:04
|
----- Original Message ----- From: "Earnie Boyd" <ear...@ya...> To: "Luke Dunstan" <cod...@ho...> Cc: "MinGW Users" <min...@li...> Sent: Saturday, February 01, 2003 6:23 PM Subject: Re: [Mingw-users] Snapshot: bison-1.875.0-2003.01.31-1.exe > Luke Dunstan wrote: > > I tried building this to examine the text vs. binary issue you mentioned in > > the release notes, but most of the testsuite (e.g. beginning with test 3) > > fails because it can't find m4. The problem is that config.h contains: > > > > #define M4 "/bin/m4" > > > > so of course _spawnvp() fails. I haven't downloaded your binary but I was > > just wondering what configure option (or whatever) you used to get around > > this? Ideally bison should build for MinGW without requiring any configure > > options but I don't know if this is your goal. > > > > I cheated and modified the config.h by hand. You can also set M4 > environment variable to point to c:/msys/1.0/bin/m4 or perhaps even just > m4 (Hmm, maybe the define shouldn't be absolute). > I agree that it > should build without requiring modification. I'm not sure what to > modify though, perhaps autoconf, perhaps bash? I'm not sure either. The configure script checks the value of $M4, but it only uses it if it is an absolute path (I don't know why, but I guess it doesn't matter here). The way that configure searches for m4 in PATH is a bit redundant when execvp/_spawnvp do it anyway, but I think it really comes down to the difference between the standard way of finding executables on Unix vs. Windows (hardcoding vs. relative). However, in the case of MSYS we can't find m4 relative to the location of bison anyway, and a hardcoded path is no good so the only option is to search for m4 in PATH. Possibly just something like this: #ifdef __MINGW32__ #undef M4 #define M4 "m4" #endif It could go in system.h just after config.h is included. I don't know if something like that would be acceptable in the official m4 sources though. 
> > BTW, if you add a -w to the diff command found in the testsuite script > in the tests directory, you'll see two failing test cases, otherwise > there are more. The MSYS diff treats \r as white space, that's how I > knew I had missed some output modes to binary. > > Earnie. > These tests are affected by line ending differences: 4 39 40 41 42 104 Actually you didn't miss any because the carriage returns are in the output of the bison-generated parser executables, not the output of bison itself. Note that if the bison source package was in DOS format (e.g. if you checked it out using a Windows CVS client), it would match the output of the programs, and you wouldn't need to change any of the fopen() calls to use binary. It would be better, though, if we used a version of "diff" that treated CRLF the same as LF instead of treating CR as whitespace. If MSYS diff can be modified it would be okay, but maybe we could use a Win32 native diff instead (like Manu's port)? Of course other scripts and MSYS tools may depend on the output of diff being in Unix format. These tests have a problem with sed: 21 23 The input text is echoed as you would expect when not passing the -n option to sed, but the sed script also has a special comment "#n" on its first line, which is supposed to have the same effect as -n according to the sed manual. I can confirm that this is a bug in MSYS sed, and strangely the bug is also in the Gnuwin32 port, but Cygwin's sed works. See sed/compile.c:408 in the sed source. When I run the full set of tests, it initially reports that these tests have failed for some reason: 85 91 92 95-103 However, later when the tests are rerun in verbose mode they succeed, and they also succeed if I run them individually from the command line, in verbose mode or not (e.g. "./testsuite 85"). Apparently the exit status is being set somewhere that it shouldn't be, but I haven't found how. 
You said that only two tests failed for you (presumably 21 and 23), so didn't you see this behaviour? Luke |
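A minimal sketch of the override Luke proposes above, for illustration only. The fallback `#define M4 "/bin/m4"` stands in for the value configure writes to config.h, and the helper name `default_m4` is hypothetical; neither is taken from the bison sources.

```c
/* Stand-in for the value that configure writes into config.h. */
#ifndef M4
# define M4 "/bin/m4"
#endif

/* The override suggested in the thread: on a MinGW host, drop the
   absolute path so that the spawn call searches PATH for m4. */
#ifdef __MINGW32__
# undef M4
# define M4 "m4"
#endif

/* Hypothetical helper: returns the m4 program name compiled in. */
const char *default_m4(void)
{
    return M4;
}
```

On a MinGW build this yields the bare name "m4"; on any other host the configured absolute path is left untouched.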
From: Greg C. <chi...@mi...> - 2003-02-01 17:28:51
|
Luke Dunstan wrote: > > Note that if the bison source package was in DOS format (e.g. if you checked > it out using a Windows CVS client), it would match the output of the > programs, and you wouldn't need to change any of the fopen() calls to use > binary. It would be better, though, if we used a version of "diff" that > treated CRLF the same as LF instead of treating CR as whitespace. If MSYS > diff can be modified it would be okay, but maybe we could use a Win32 native > diff instead (like Manu's port)? Of course other scripts and MSYS tools may > depend on the output of diff being in Unix format. Would any of these ideas make sense? - run unix2dos on the canned test results before diff - run dos2unix on the generated test results before diff - use diff's '--ignore-space-change' flag > The input text is echoed as you would expect when not passing the -n option > to sed, but the sed script also has a special comment "#n" on its first > line, which is supposed to have the same effect as -n according to the sed > manual. I can confirm that this is a bug in MSYS sed, and strangely the bug > is also in the Gnuwin32 port, but Cygwin's sed works. See sed/compile.c:408 > in the sed source. I once had problems like this. For a while I was actually distributing two sed's to my team: cygwin's for some purposes, and someone else's--Mikey's, I think--for other purposes. Both were gnu sed 3.02 . All the silly problems went away when I got gnu sed 3.02.80, built it with MSYS, and got rid of the other sed binaries. Others seem to have had similar experiences (quotes below), although I don't know whether this will fix the '#n' problem. http://main.rtfiber.com.tw/~changyj/sed/notes.html The scripts in this site have been tested with GNU sed version 3.02.80, these scripts may result in incorrect results when using older versions of sed, including GNU sed 3.02. 
http://www-106.ibm.com/developerworks/linux/library/l-sed1.html The right sed In this series, we will be using GNU sed 3.02.80. Some (but very few) of the most advanced examples you'll find in my upcoming, follow-on articles in this series will not work with GNU sed 3.02 or 3.02a. If you're using a non-GNU sed, your results may vary. Why not take some time to install GNU sed 3.02.80 now? Then, not only will you be ready for the rest of the series, but you'll also be able to use arguably the best sed in existence! |
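Greg's second idea (convert the generated results to Unix format before running diff) amounts to dropping each CR that is immediately followed by LF. A rough sketch in C, assuming the output is already in a memory buffer; `crlf_to_lf` is a hypothetical helper and not the implementation of dos2unix or of any MSYS tool.

```c
#include <stddef.h>

/* Rewrites buf in place, turning every CR LF pair into a single LF.
   A CR that is not followed by LF is kept as-is.
   Returns the new length of the data in buf. */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t in, out = 0;

    for (in = 0; in < len; in++) {
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
            continue;           /* skip the CR; the LF is copied next */
        buf[out++] = buf[in];
    }
    return out;
}
```

Unlike diff -w, this leaves other whitespace differences visible, which is closer to treating CRLF the same as LF.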
From: Earnie B. <ear...@ya...> - 2003-02-01 17:32:24
|
Luke Dunstan wrote: > > However, later when the tests are rerun in verbose mode they succeed, and > they also succeed if I run them individually from the command line, in > verbose mode or not (e.g. "./testsuite 85"). Apparently the exit status is > being set somewhere that it shouldn't be, but I haven't found how. You said > that only two tests failed for you (presumably 21 and 23), so didn't you see > this behaviour? > I didn't have/take time to investigate those two thoroughly. But, you sure have. Thanks, at least we know where to look to fix. Diff, I think I'd prefer the ``native'' version approach. Have you tried it? Sed, do you want to patch it? Earnie. |
From: Earnie B. <ear...@ya...> - 2003-02-01 17:47:30
|
Earnie Boyd wrote: > > Sed, do you want to patch it? > Or back port to 3.02.80? Earnie. |
From: Luke D. <cod...@ho...> - 2003-02-02 05:38:29
|
----- Original Message ----- From: "Earnie Boyd" <ear...@ya...> To: "Luke Dunstan" <cod...@ho...> Cc: "MinGW Users" <min...@li...> Sent: Sunday, February 02, 2003 1:32 AM Subject: Re: [Mingw-users] Snapshot: bison-1.875.0-2003.01.31-1.exe > Luke Dunstan wrote: > > > > However, later when the tests are rerun in verbose mode they succeed, and > > they also succeed if I run them individually from the command line, in > > verbose mode or not (e.g. "./testsuite 85"). Apparently the exit status is > > being set somewhere that it shouldn't be, but I haven't found how. You said > > that only two tests failed for you (presumably 21 and 23), so didn't you see > > this behaviour? > > > > I didn't have/take time to investigate those two thouroughly. But, you > sure have. Thanks, at least we know where to look to fix. > > Diff, I think I'd prefer the ``native'' version approach. Have you > tried it? Not yet. > > Sed, do you want to patch it? > > Earnie. The Gnuwin32 version is 3.02.80 and it doesn't work, so that won't be any help. I just tried the latest sed 4.0.5 and it doesn't work either, so it seems to be a Windows-specific issue. The relevant code is: case '#': if (cur_cmd->a1) bad_prog(_(NO_SHARP_ADDR)); ch = inchar(); if (ch=='n' && first_script && cur_input.line < 2) if ( (prog.base && prog.cur==2+prog.base) || (prog.file && !prog.base && 2==ftell(prog.file))) no_default_output = 1; while (ch != EOF && ch != '\n') ch = inchar(); continue; /* restart the for (;;) loop */ I only did a little debugging but I found that ftell() returns -1 instead of 2. It's probably a text vs. binary thing but does anyone know why this happens? Since there is no test for the #n in the sed testsuite it is not surprising that this bug hasn't been found before. I had to convert the sed testsuite to DOS format to run it, but when it gets to a test called "spencer" it goes nuts accessing A: drive because it contains many command lines with a: somewhere in them :) Luke |
From: Earnie B. <ear...@ya...> - 2003-02-02 11:24:41
|
Luke Dunstan wrote: > >>Sed, do you want to patch it? >> >>Earnie. > > > The Gnuwin32 version is 3.02.80 and it doesn't work, so that won't be any > help. I just tried the latest sed 4.0.5 and it doesn't work either, so it > seems to be a Windows-specific issue. The relevant code is: > > case '#': > if (cur_cmd->a1) > bad_prog(_(NO_SHARP_ADDR)); > ch = inchar(); > if (ch=='n' && first_script && cur_input.line < 2) > if ( (prog.base && prog.cur==2+prog.base) > || (prog.file && !prog.base && 2==ftell(prog.file))) > no_default_output = 1; > while (ch != EOF && ch != '\n') > ch = inchar(); > continue; /* restart the for (;;) loop */ > > I only did a little debugging but I found that ftell() returns -1 instead of > 2. It's probably a text vs. binary thing but does anyone know why this > happens? > Yes and no for text vs binary. See http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt_ftell.asp for details. If prog.file is a stdin stream the reason for failure is that Win32 stdio streams aren't seekable. Earnie. |
From: Earnie B. <ear...@ya...> - 2003-02-02 12:19:27
|
Luke Dunstan wrote:
>
> The Gnuwin32 version is 3.02.80 and it doesn't work, so that won't be any
> help. I just tried the latest sed 4.0.5 and it doesn't work either, so it
> seems to be a Windows-specific issue. The relevant code is:
>
>     case '#':
>       if (cur_cmd->a1)
>         bad_prog(_(NO_SHARP_ADDR));
>       ch = inchar();
>       if (ch=='n' && first_script && cur_input.line < 2)
>         if ( (prog.base && prog.cur==2+prog.base)
>           || (prog.file && !prog.base && 2==ftell(prog.file)))
>           no_default_output = 1;
>       while (ch != EOF && ch != '\n')
>         ch = inchar();
>       continue; /* restart the for (;;) loop */
>

Does the following patch make sense?

Index: compile.c
===================================================================
RCS file: /prjz/.cvsroot/sed/sed/compile.c,v
retrieving revision 1.1.1.1
diff -u -3 -p -r1.1.1.1 compile.c
--- compile.c 2003/02/01 23:21:27 1.1.1.1
+++ compile.c 2003/02/02 12:13:42
@@ -1043,7 +1043,7 @@ compile_program(vector)
       ch = inchar();
       if (ch=='n' && first_script && cur_input.line < 2)
         if ( (prog.base && prog.cur==2+prog.base)
-          || (prog.file && !prog.base && 2==ftell(prog.file)))
+          || (prog.file && !prog.base && prog.file != stdin && 2==ftell(prog.file)))
           no_default_output = 1;
       while (ch != EOF && ch != '\n')
         ch = inchar();
|
From: Chris H. <pop...@so...> - 2003-02-02 16:14:10
|
Just wondering if I have to set the BISON_HAIRY and BISON_SIMPLE environment variables with the MinGW port of Bison? Regards Chris |
From: Earnie B. <ear...@ya...> - 2003-02-02 16:29:54
|
Chris Hansen wrote: > Just wondering if I have to set the BISON_HAIRY and BISON_SIMPLE > environment variables with the MinGW port of Bison? > No, more current versions of bison have done away with them. Earnie. |
From: Luke D. <cod...@ho...> - 2003-02-03 04:37:32
|
----- Original Message ----- From: "Earnie Boyd" <ear...@ya...> To: "Luke Dunstan" <cod...@ho...> Cc: "MinGW Users" <min...@li...> Sent: Sunday, February 02, 2003 8:19 PM Subject: Re: [Mingw-users] Snapshot: bison-1.875.0-2003.01.31-1.exe > Luke Dunstan wrote: > > > > > > The Gnuwin32 version is 3.02.80 and it doesn't work, so that won't be any > > help. I just tried the latest sed 4.0.5 and it doesn't work either, so it > > seems to be a Windows-specific issue. The relevant code is: > > > > case '#': > > if (cur_cmd->a1) > > bad_prog(_(NO_SHARP_ADDR)); > > ch = inchar(); > > if (ch=='n' && first_script && cur_input.line < 2) > > if ( (prog.base && prog.cur==2+prog.base) > > || (prog.file && !prog.base && 2==ftell(prog.file))) > > no_default_output = 1; > > while (ch != EOF && ch != '\n') > > ch = inchar(); > > continue; /* restart the for (;;) loop */ > > > > Yes and no for text vs binary. See > http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/ _crt_ftell.asp > for details. If prog.file is a stdin stream the reason for failure is > that Win32 stdio streams aren't seekable. The sed script is specified as "-f foo.sed", so prog.file would not be stdin... > Does the following patch make since? > > Index: compile.c > =================================================================== > RCS file: /prjz/.cvsroot/sed/sed/compile.c,v > retrieving revision 1.1.1.1 > diff -u -3 -p -r1.1.1.1 compile.c > --- compile.c 2003/02/01 23:21:27 1.1.1.1 > +++ compile.c 2003/02/02 12:13:42 > @@ -1043,7 +1043,7 @@ compile_program(vector) > ch = inchar(); > if (ch=='n' && first_script && cur_input.line < 2) > if ( (prog.base && prog.cur==2+prog.base) > - || (prog.file && !prog.base && 2==ftell(prog.file))) > + || (prog.file && !prog.base && prog.file != stdin && > 2==ftell(prog.file))) > no_default_output = 1; > while (ch != EOF && ch != '\n') > ch = inchar(); > The problem is that "2==ftell(prog.file)" is false, so adding "prog.file != stdin" won't help. 
However, I have found that this issue seems to be entirely a text vs. binary problem when building sed as a Mingw executable. I was under the impression that opening a Unix-format file in text mode with MSVCRT would work the same as opening a DOS-format file in text mode because the CRLF -> LF conversion would simply have no effect. After some confusion this code showed that I was wrong:

#include <stdio.h>
#include <errno.h>
#include <string.h>

int main(int argc, char *argv[])
{
    FILE *fp;
    int t, e;

    if(argc == 2)
    {
        fp = fopen(argv[1], "r");
        if(fp == NULL)
        {
            perror(argv[1]);
            return 1;
        }
    }
    else
        fp = stdin;
    printf("c1: %c\n", fgetc(fp));
    printf("c2: %c\n", fgetc(fp));
    t = ftell(fp);
    e = errno;
    printf("ftell: %d\n", t);
    printf("errno: %u (%s)\n", e, strerror(e));

    return 0;
}

The input file foo.sed in DOS format contains:

#n
/bar/ p

Any command line "./a foo.sed", "./a < foo.sed" or "cat foo.sed | ./a" works as expected:

c1: #
c2: n
ftell: 2
errno: 0 (No error)

However, with the same input in Unix format the output of "./a foou.sed" or "./a < foou.sed" is:

c1: #
c2: n
ftell: -1
errno: 0 (No error)

Strangely, "cat foou.sed | ./a" works normally. In case you are wondering, it still happens when the first line of the input file contains more than 2 characters, _unless_ the first line contains several thousand characters (presumably because this is larger than the stdio input buffer). I take this to mean that ftell() on a text mode stream has undefined behaviour (especially since errno isn't even set), though the MS docs neglect to mention this. Does anyone know if this is ANSI C conforming?

As for sed, it may not be worth fixing for mingw32 host, BUT the MSYS version of sed fails for a completely different reason! See sed.c:173:

    case 'f':
#ifdef __MSYS__
      if (!newz)
        the_program = compile_string(the_program, "s/^M$//");
      newz = 1;
#endif
      the_program = compile_file(the_program, optarg);
      break;

If you look back at the conditional it checks first_script, but since compile_string() has already been called, the input script is no longer considered to be the first script. I'll let you know if I find a simple way to fix this. BTW the string literal should contain \r not a real ^M.

Luke |
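For comparison with the text-mode test program above, here is a small sketch that opens the file in binary mode instead. For a binary stream the C standard defines the value of ftell() as the number of characters from the beginning of the file, so the result does not depend on the line-ending convention. The function name `offset_after_two_bytes` is hypothetical, and whether sed should open script files in binary mode on Windows is a separate question that this sketch does not settle.

```c
#include <stdio.h>

/* Opens `path` in binary mode, reads two bytes, and returns the
   offset reported by ftell(). Returns -1 if the file cannot be
   opened or holds fewer than two bytes. */
long offset_after_two_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long pos;

    if (fp == NULL)
        return -1L;
    if (fgetc(fp) == EOF || fgetc(fp) == EOF) {
        fclose(fp);
        return -1L;
    }
    pos = ftell(fp);    /* byte offset from the start in binary mode */
    fclose(fp);
    return pos;
}
```

The price of binary mode is that a DOS-format script then delivers its CR characters to the caller, which has to strip or tolerate them itself.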
From: Earnie B. <ear...@ya...> - 2003-02-03 14:55:24
|
Luke Dunstan wrote: > > The problem is that "2==ftell(prog.file)" is false, so adding "prog.file != > stdin" won't help. However, I have found that this issue seems to be > entirely a text vs. binary problem when building sed as a Mingw executable. > I was under the impression that opening a Unix-format file in text mode with > MSVCRT would work the same as opening a DOS-format file in text mode because > the CRLF -> LF conversion would simply have no effect. After some confusion > this code showed that I was wrong: > You and just about everyone else I suppose are under the same impression, I know that I was. Yeah, and further digging in the code pointed out that the condition for prog.file == stdin shouldn't be a problem. > #include <stdio.h> > #include <errno.h> > #include <string.h> > > int main(int argc, char *argv[]) > { > FILE *fp; > int t, e; > > if(argc == 2) > { > fp = fopen(argv[1], "r"); > if(fp == NULL) > { > perror(argv[1]); > return 1; > } > } > else > fp = stdin; > printf("c1: %c\n", fgetc(fp)); > printf("c2: %c\n", fgetc(fp)); > t = ftell(fp); > e = errno; > printf("ftell: %d\n", t); > printf("errno: %u (%s)\n", e, strerror(e)); > > return 0; > } > > The input file foo.sed in DOS format contains: > > #n > /bar/ p > > Any command line "./a foo.sed", "./a < foo.sed" or "cat foo.sed | ./a" works > as expected: > > c1: # > c2: n > ftell: 2 > errno: 0 (No error) > > However, with the same input in Unix format the output of "./a foou.sed" or > "./a < foou.sed" is: > > c1: # > c2: n > ftell: -1 > errno: 0 (No error) > We should document your above example on the website as known ``MS Funnies and Follies'' so that we don't forget it. > Strangely, "cat foou.sed | ./a" works normally. Maybe not, MSYS pipes are binary mode and that would be duplicated to stdin for the script. 
> In case you are wondering, > it still happens when the first line of the input file contains more than 2 > characters, _unless_ the first line contains several thousand characters > (presumably because this is larger than the stdio input buffer). I take this > to mean that ftell() on a text mode stream has undefined behaviour > (especially since errno isn't even set), though the MS docs neglect to > mention this. Does anyone know if this is ANSI C conforming? > Interesting, and I doubt that it's ANSI conforming. > As for sed, it may not be worth fixing for mingw32 host, BUT the MSYS > version of sed fails for a completely different reason! See sed.c:173: > > case 'f': > #ifdef __MSYS__ > if (!newz) > the_program = compile_string(the_program, "s/^M$//"); > newz = 1; > #endif > the_program = compile_file(the_program, optarg); > break; > > If you look back at the conditional it checks first_script, but since > compile_string() has already been called, the input script is no longer > considered to be the first script. I'll let you know if I find a simple way > to fix this. BTW the string literal should contain \r not a real ^M. > Perhaps just resetting first_script? Earnie. |
From: Greg C. <chi...@mi...> - 2003-02-04 02:39:52
|
Luke Dunstan wrote: > > I was under the impression that opening a Unix-format file in text mode with > MSVCRT would work the same as opening a DOS-format file in text mode because > the CRLF -> LF conversion would simply have no effect. After some confusion > this code showed that I was wrong: [snip example] A LF-delimited file is not a text file at all under dos or windows. When you open it in text mode, all bets are off. The rtl doesn't have to behave 'rationally' here. > I take this > to mean that ftell() on a text mode stream has undefined behaviour > (especially since errno isn't even set), though the MS docs neglect to > mention this. Does anyone know if this is ANSI C conforming? I don't have C89, but I have C99, and it requires very little of a conforming implementation's ftell() for text streams. The evidence presented doesn't demonstrate that this implementation fails to conform. The program is not strictly conforming because it depends on unspecified behavior. > In case you are wondering, > it still happens when the first line of the input file contains more than 2 > characters, _unless_ the first line contains several thousand characters > (presumably because this is larger than the stdio input buffer). A conforming implementation doesn't even have to support a text file that has a line that long: 7.19.2/7 An implementation shall support text files with lines containing at least 254 characters, including the terminating new-line character. The non-text file was opened in text mode, so ftell() is required only to return "unspecified information" that can be used as an argument for fseek(): 7.19.9.4/2 For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call; the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read. 
J.1/1 Unspecified behavior
[...]
The details of the value returned by the ftell function for a text stream

After reading the second character,

> The problem is that "2==ftell(prog.file)" is false

but 7.19.9.4/2 says this approach doesn't have to work--so the problem is the program, not the rtl. |
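One way to avoid relying on the value of ftell() for a text stream is to look at the characters themselves. The sketch below mirrors only the position check from the sed code quoted earlier in the thread (are the first two characters of the script "#n"?). The function name `script_starts_with_hash_n` is hypothetical, this is not the fix that went into sed, and it assumes the caller passes a stream that is still positioned at the start of the script.

```c
#include <stdio.h>

/* Returns 1 if the first two characters read from fp are '#' and 'n',
   0 otherwise. The stream must be positioned at its beginning.
   No ftell() call is needed, so the result is the same for text and
   binary streams and for DOS and Unix line endings. */
int script_starts_with_hash_n(FILE *fp)
{
    int c1 = fgetc(fp);
    int c2;

    if (c1 != '#')
        return 0;
    c2 = fgetc(fp);
    return c2 == 'n';
}
```

A real patch would also have to put the consumed characters back or continue parsing from that point, which is left out here.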