From: fredt <fr...@us...> - 2005-10-18 16:48:33
Last year, work was carried out off list and two partial / demo
implementations were produced, one with JFlex + CUP and one with JavaCC. As
the two developers who participated had to concentrate on other work, no
further progress has been made since.
My assessment of the whole project is summarized below:
1. If we just want to improve the tokenizer, JFlex seems to be OK. It seems
the license is not an issue as the generated files can be released under
BSD. Work done by Loic can be completed and used to replace our existing
tokenizer fairly easily.
2. If we want to introduce a new parser, a lot more than what the current
crop of parser generators can do is necessary in order to achieve the
current level of performance and simplify future development. I won't get
into the details here but I have previously outlined what has to be done. It
is a fairly large project to develop a specialized generator / compiler on
top of CUP.
Fred
----- Original Message -----
From: "Blaine Simpson" <bla...@ad...>
To: <hsq...@li...>
Sent: 18 October 2005 15:58
Subject: [Hsqldb-developers] Survey of Scanners and Parsers
I'm finally catching up to our Lexer/Parser thread after a year and a half.
I made a SQL interpreter with lex and yacc about 20 years ago, which was
used in a real product. My immediate goal is to figure out which tools we
are most likely to use and learn them. Yesterday I reviewed the available
open source scanners and parsers after reading what Campbell and others had
to say. Here are my estimations of the suitability of the available products
for our use. In most cases, I don't mention the several products which are
no longer actively used, which are intentionally limiting (i.e., unsuitable
for complex languages), or which have non-Java requirements for build or
run-time.
Knowing the other important goals for HSQLDB in the short and long term, I
don't know if we'll get enough people to commit to seeing through a rewrite
of the primary HSQLDB interpreter/parser. I'm going to learn the tools
anyway though, in part because I intend to make a PL/SQL engine for HSQLDB,
and I don't think that anything but a trivial subset would be possible
without using a real scanner and/or parser.
Scanners
JFlex. GPL. I like it. Input looks very intuitive... at least for somebody
    who used to use lex. Documentation and examples look great. Lots of
    people think very highly of it. Change log shows that they took the
    effort to follow Java conventions (e.g. output class naming, Ant builds).
JLex. GPL-compatible. Not seeing much activity in the past couple of years.
    JFlex is a "rewrite" of JLex and can be run in JLex-compatibility mode.
    According to the JFlex authors, JFlex can do everything that JLex could
    do, but faster and better. Not wanting to take the time to run my own
    comparisons, I'm inclined to believe that until I hear somebody disagree.
SableCC. Looks like a good product. Supports EBNF. Not as widely used as
    JFlex, but still actively used. Input looks similar to lex input.
    Compares well to Antlr, but I can't find a comparison to JLex. Looks like
    I'd have to download the distro to find out what kind of license it has.
Coco/R. "Slightly extended" GPL. I'm not sure what the input specification
    files look like for the Coco/R scanner, but if they are ATG files, they
    look much more difficult to maintain and much less intuitive than
    lex/flex/jlex/jflex input files. Looks like the docs are only available
    in PDF (I love HTML docs!), but maybe the HTML docs just aren't
    advertised as well as the PDFs. The documentation about the Java distro
    gives me the feeling that good Java design is not their top priority.
Antlr. BSD license. Poor performance.
Parsers
JavaCC (formerly "Jack"). BSD. Poor docs. I haven't seen anybody compare CUP
    and JavaCC and favor JavaCC. I saw somebody complain that JavaCC was
    commercial, so it may have been commercial at one time.
Jacc. BSD. Pure Java. I don't see much use of it for the past few years. I'm
    concerned about support, fixes for newly found bugs, etc.
CUP. GPL-compatible. Very popular. Highly regarded. Definitely works with
    JFlex.
Beaver. BSD. Not as popular as CUP, but still actively used. Definitely
    works with JFlex. Allegedly the fastest performance possible for the
    class of parsers that we're looking at. Input spec looks intuitive yet
    powerful. Good Java design.
SableCC. Not as widely used as CUP. I don't know about compatibility with
    JFlex. See SableCC listing in Scanner list.
Coco/R. See Coco/R listing in Scanner list.
SableCC. See SableCC listing in Scanner list.
Please reply with recommendations against any of these. I don't want to
waste my time learning products that I won't ever use. I've already started
learning JFlex. CUP and Beaver are next on my to-learn list, depending on
future discussion and findings.
_______________________________________________
hsqldb-developers mailing list
hsq...@li...
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers