Re: [Flex-devel] #include <unistd.h> in middle of scanner: bad idea
flex is a tool for generating scanners
Brought to you by:
wlestes
From: Kaz K. <938...@ky...> - 2020-08-07 06:31:33
|
On 2020-08-06 21:26, Kang-Che Sung wrote: > On Thu, Aug 6, 2020 at 1:37 AM Kaz Kylheku <938...@ky...> > wrote: >> >> Flex generates a file which contains >> a #include <unistd.h> long after user material. This seems to be for >> the sake of having a declaration of isatty. >> >> The problem with this is that the user code can #define macros that >> can break the header. C programs have to be careful with what they >> do before system headers are included. > > System headers can always be overridden by user code. A prominent > example is > feature test macros, which are defined before the inclusion of system > headers > and can change the ABI used for standard library functions. I was writing more from a standard C point of view in the above paragraph. ISO C has no "feature test macros". They are POSIX feature (imitated in other platforms). There is a specific vocabulary of feature test macros. You cannot just make up any symbol. So for instance -D_POSIX_C_SOURCE on the compiler command line has a well defined effect in POSIX environments, due to being documented in a standard. Whereas -D_BLUB_SOURCE, which I just made up, is undefined behavior. My comment wasn't meant to imply that an application must never under any circumstances define any sort of macro prior to including a system header. Ah right; in ISO C, there is a famous counterexample: #define NDEBUG #include <assert.h> Again, documented by language, so okay. > >> Here is small repro test case: >> >> %{ >> >> #define pause blah[] >> >> %} >> >> %% >> >> Result: >> >> $ flex lex.l ; cc lex.yy.c >> lex.l:3:15: error: declaration of ‘blah’ as array of functions >> #define pause blah[] > > Why would anyone want to do that? Redefining a standard library symbol > makes > the code unportable (except for local scope redefines). The point is that that is a local scope define. My sample doesn't even include any headers at all. Of course things like <stdio.h> are implied but they are expected to be before #define pause. The programmer doesn't expect <unistd.h> in there at all. What breaks it is that Flex generates the code such that there is a surprise #include <unistd.h> long after the #define pause. The inclusion of <unistd.h> will not break if it precedes the #define pause. The issue isn't standard library symbols; I just picked that for an easy repro. Any macro whatsoever could potentially cause a problem. I also don't care about "portable" as much as "not breaking on any of the half dozen platforms I'm building for". >> How about just an "extern int isatty(int)" somewhere? > > Forward declaration can somewhat defeat the purpose of the header. A > library > implementation might not declare the function exactly as described in > the > standard document. It might contain other attributes or ABI keywords. Indeed; that's not a good solution. Probably, the best thing would be not to have that isatty business in there at all, unless it's requested by some explicit option. isatty is still being wastefully called even if YY_INPUT has redirected the lexer to scan from a character string instead of a stream. The isatty calls even happen if the lexer is generated in -B (batch) mode. (Flex 2.6.4 on Ubuntu 18). Aha! I have a nice workaround: #define YY_NO_UNISTD_H #undef isatty #define isatty(x) 0 Presto; I'm no longer including <unistd.h> myself, there is no complaint about an undeclared isatty, and the numerous ioctls in the strace output: ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0 are gone! (IMHO, the above effect should be obtained by just %option batch or -B.) And that also gives us one example to the "why would anyone redefine a library symbol" question. If <unistd.h> were included at the top, then we could let that include be, and just do: #undef isatty #define isatty(x) 0 This trick is going to work fine on all the platforms I care about, so it is de facto portable enough. There are theoretical ways that could still break, but if it were perpetrated before #include <unistd.h>, the probability of failure would go to 100%. That's really why you don't want surprise header inclusions after user-defined material. |