From: Hans-Bernhard B. <br...@ph...> - 2000-10-09 17:31:19
[I keep on following up to my own postings ...]

On Fri, 6 Oct 2000, Hans-Bernhard Broeker wrote:

> The next-best idea I have is to abandon usage of 'yytext' as a
> back-buffer, completely. This would get us rid of all the calls to
> yymore(), and thus avoid the problem.

Been there, done that. And it really does help. I now have a scanner
(and a slightly changed crossref.c to go with it) that uses a homegrown
dynamically resizeable buffer, my_yytext, to store the contents of the
current line in, until the putcrossref() that ends the line.

Once that was done, I decided to move the comment scanning stuff into
the scanner itself. And that one turned out to help a *lot*, too. After
yet another major renovation of almost all scan rules to make sure
comments don't confuse the lookahead stuff, I now have a scanner that:

1) is safe to use with flex's %pointer mode, and with whatever flex
   option set I tried it on (up to and including '-Cfar' and '-CFar',
   the recommended utmost-performance settings). But don't be shocked
   if in -CFa mode, you get a scanner.c file of about 2.5 Megabytes ;-)

2) no longer needs an input() function override. This means that flex
   can now read its input in much larger chunks (our
   skipcomment_input() used single getc()s!). The original scanner was
   actually so getc()-heavy that by linking the executable statically,
   to remove the extra call overhead involved with shared libraries, I
   could save about 30% of processing time.

3) is about a factor of 4 faster than the 15.0bl2 scanner built with
   flex. It's actually within reach of being I/O-bottlenecked: yylex()
   and the functions called by it burn only ca. 60% of the CPU time of
   a 'cscope -cub' run, now. The putc()s in putcrossref() start to
   become important already.

4) doesn't crash even if I throw the full Linux kernel source tree at
   it.

5) unlike one of my intermediate milestones, contains lots and lots of
   backing-up states. The original scanner has about 35, the new one on
   the order of 280.
   But it's still much faster.

It does show some differences in behaviour compared to the existing
scanner, though:

*) The old scanner copied comments following an #include "file.h" on
   the same line to the output; the new one doesn't. IMHO the old
   behaviour was buggy.

*) For the K&R version of the classical CONCAT(a,b) macro:

    #define CONCAT(a,b) a/**/b

   cscope 15.0bl2 outputs the following cscope.out file (-c mode):

    1 #define
    	#CONCAT
    (
    a
    ,
    b
    )
    a
    	)
    b

   Note how the DEFINEEND line (the single ')' indented by a tab on the
   one-before-last line) comes *before* the actual end of this line,
   and/or of the definition, as far as that's concerned. I think that's
   a bug, too. Depending on how I treat comments in my flex rules, I can
   either reproduce this bug, or output a somewhat more sensible
   version, where the last three lines are replaced by

    a
    b
    	)

   Curiously enough, though, both versions look exactly the same if
   displayed by cscope itself. I don't fully understand what this is
   supposed to look like (or what the DEFINEEND entry is needed for, to
   begin with...)

I'm not attaching the scanner, this time. If you want to have a look,
you can request it, or I may upload it to the 'patches' section on
SourceForge.

Hans-Bernhard Broeker (br...@ph...)
Even if all the snow were burnt, ashes would remain.
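
P.S. For anyone wondering what a "homegrown dynamically resizeable
buffer" in the spirit of my_yytext might look like, here is a minimal
sketch. It is not the code from the actual scanner, which is not
attached. The names struct linebuf, linebuf_append() and
linebuf_reset(), the initial size of 64 bytes and the doubling strategy
are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable line buffer; NOT the real my_yytext code. */
struct linebuf {
    char  *data;  /* NUL-terminated contents, NULL before first use */
    size_t len;   /* bytes in use, not counting the terminating NUL */
    size_t cap;   /* bytes currently allocated */
};

/* Append n bytes of text to the buffer, growing it when needed.
 * Returns 0 on success, -1 if memory could not be allocated
 * (in which case the old contents are left untouched). */
int linebuf_append(struct linebuf *lb, const char *text, size_t n)
{
    assert(lb != NULL && text != NULL);

    if (lb->len + n + 1 > lb->cap) {
        size_t newcap = lb->cap ? lb->cap : 64;  /* assumed start size */
        char *p;

        while (newcap < lb->len + n + 1)
            newcap *= 2;  /* doubling keeps the number of reallocs small */
        p = realloc(lb->data, newcap);
        if (p == NULL)
            return -1;
        lb->data = p;
        lb->cap = newcap;
    }
    memcpy(lb->data + lb->len, text, n);
    lb->len += n;
    lb->data[lb->len] = '\0';
    return 0;
}

/* Forget the contents, but keep the allocation for the next line. */
void linebuf_reset(struct linebuf *lb)
{
    assert(lb != NULL);

    lb->len = 0;
    if (lb->data != NULL)
        lb->data[0] = '\0';
}
```

With something like this, a flex action would presumably call
linebuf_append(&line, yytext, (size_t) yyleng) where it used to call
yymore(), and linebuf_reset(&line) once putcrossref() has written the
line out. Since the scanner then never relies on yytext growing, it no
longer cares whether flex runs in %pointer or %array mode.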