I would like to consider Quex as a lexer (tokenizer) to support the Code Completion
plugin in the Code::Blocks IDE.
As the home page says, a code-directed lexer is faster than a table-driven
lexer, and Quex is a code-directed lexer.
I just did a comparison against flex. I used the C/004 demo code, disabled
the cout output in the while loop, and also disabled the assert macros. Lexing
an 8 MB C++ source file takes about 1.5 s.
I also ran an extra test using a flex grammar, and the result is quite
fast: only about 150 ms.
See a post of mine in the Code::Blocks forum:
http://forums.codeblocks.org/index.php/topic,13017.msg87659.html#msg87659
Any ideas?
Thanks.
Did you disable the asserts? Can you provide a sample application?
Thanks for your reply.
I have disabled the asserts by adding -DQUEX_OPTION_ASSERTS_DISABLED to the
compiler options.
I use the MinGW 4.4.4 compiler on Windows XP, with Python 2.6.5.
The code is exactly the same as the minimal sample, see below:
The C grammar used is in:
C:\Program Files\quex\quex-0.50.1\demo\C\004\c.qx
And I use this command to generate the code:
quex -i simple.qx -o tiny_lexer
By the way, I found that there is a benchmark folder here:
C:\Program Files\quex\quex-0.50.1\demo\benchmark\in\flex
So, can you show some results comparing flex and quex?
Thanks.
Hi, I found a post by you on the GCC mailing list:
http://gcc.gnu.org/ml/gcc/2007-05/msg00726.html
You mentioned a 200% performance gain, so I am very interested in this. I'm not sure
how to run the benchmark on my WinXP with MinGW, and I have also seen that all the
result data in
C:\Program Files\quex\quex-0.50.1\demo\benchmark
is only about quex; there are no flex and re2c results.
Can you show some of them?
Thanks.
(*) demo/benchmark
As you said, in "demo/benchmark" there is already a setup
to benchmark and compare with other systems. The Makefile
contains targets to build a quex-based, a flex-based and a
re2c-based parser: one target builds a flex-based lexical analyzer,
another builds a quex-based analyzer, etc.
(*) Results
Results on "code/linux-2.6.22.17-kernel-dir.c".
(1) lexer-flex:
Compiled with -Os (size optimized)
clock_cycles_per_character = {35.004494}, // overhead eliminated
Compiled with -O3 (speed optimized)
clock_cycles_per_character = {43.224880}
(2) lexer-quex
Compiled with -Os (size optimized)
clock_cycles_per_character = {17.738173},
Compiled with -O3 (speed optimized)
clock_cycles_per_character = {17.393938}
As you can see, it is also a cache issue: smaller programs perform faster
because they cause fewer cache misses.
In the directory "demo/benchmark/run" you find some helper scripts to run your
benchmark.
Note that the benchmark tries to isolate the cost of lexical analysis. An
inadvertent use of the std::string class can slow down performance
tremendously.
(*) Your Case
From what I understood, the thing you are trying to do is covered by the
benchmark. If not, I would like the following from you:
-- a .qx file that describes exactly the grammar and the
pattern-action pairs.
-- a .cpp file for your parser/lexical analyzer, where
the lexical analysis is the only thing that happens.
-- an example text file on which the analyzer runs for
benchmarking.
-- some results from '> time my_lexer' as a reference.
In this case, send me an email to fschaef at users dot sourceforge dot net,
so that I have your private email and we can exchange attachments.
It should definitely be possible to get a lexer for the task you
describe with a clock cycles per character rate of 10 or better.
Best Regards
Frank
Thanks for your reply. As you said:
Note, that the benchmark tries to isolate the cost of lexical analysis. An
inadvertent use of the std::string class can slow down the performance
tremendously.
That may be the reason why my code is slow: in my test code, I guess it
uses "std::string".
I'm so glad to see these results; quex is quite good. I will choose it
as the lexer for refactoring the tokenizer of the Code Completion plugin.
Thanks.
Please let me know about your benchmark results. Note that with
--path-compression, --template-compression, or --template-compression-uniform,
you can sometimes save a great deal of program space.
I have tried it under MSYS with MinGW 4.4.4, but failed. I can't even build the
benchmark.
See the log:
XXXXX /e/code/quex/quex-0.50.1/demo/benchmark
$ make run/lexer-quex
quex -i in/quex/c.qx \
--foreign-token-id-file in/token-ids.h \
--output-directory out \
--engine quex_scan \
--no-string-accumulator \
--token-offset 3 \
--token-prefix TKN_ \
--token-policy single
make: quex: Command not found
make: *** Error 127
It is strange; I have Python and MinGW correctly installed, and I can run the
quex command in the Windows cmd shell.
I don't know how to find a solution...
Run
echo $PATH
in the MSYS shell. In the Makefile, add a target:
me_watch:
	echo $PATH
	echo $SHELL
	echo $QUEX_PATH
Note that the thing before 'echo' must be a tabulator. Then, run 'make me_watch'.
Dear fschaef:
Thanks for your reply, but I'm sorry, I can't quite follow your idea; I don't know
how to follow your steps.
Running "echo $PATH" will print the whole search path.
But what does "me_watch:" mean?
Sorry, I'm not a regular Linux user and I don't know much about bash.
What does the last command, "make me_watch", mean?
Sorry, sorry....
The lines above print out environment variables that tell how
your environment finds applications. The variable PATH contains
a list of directories where executables can be found. The
directory where quex is located must be one of them; otherwise,
it must be added.
Inside the make environment things may change. The make target
'me_watch' allows you to print out the variables SHELL, PATH, and QUEX_PATH.
Depending on the shell, other variables may be active. The variable
QUEX_PATH must be set to the directory where quex is installed.
Regards
Frank
Thanks, but I still failed.
I added this code to the Makefile under:
E:\code\quex\quex-0.50.1\demo\benchmark
The command 'make me_watch' gives this result:
Also, I have the QUEX_PATH variable defined in my Windows environment.
The default value is: E:\code\quex\quex-0.50.1
I have tried changing it to /e/code/quex/quex-0.50.1
But in both cases, when I run the command:
make run/lexer-quex
I get the same result as before: quex command not found.
By the way, I think you should add \r to c.qx, like:
P_WHITESPACE [ \t\r\n]+
...
<skip: [ \n\r\t]>
Also, I think the PDF manual is a bit hard for beginners; I think a mini
example and tutorial would be enough. Also, I'm not sure how to generate a
C-style EasyLexer. When I use the default command:
quex -i xxx.qx -o EasyLexer
it always generates C++ code.
Thanks.
-- There is a minimalist example in the PDF documentation. Do you mean
there should be a mini doc that contains only the minimalist example?
Well, go ahead and write it in rst or plain text, and I will include it in the
download section.
-- Good point with \r.
-- Use the command line option '--language C'. This becomes clear
from the examples in demo/C/*.
Best Regards