I've been trying to get a semantic-based parser to work for the PVS
language for some time now. As I watch work on CEDET-related project
charge forward, and as often Wisent is suggested for new developers
and grammars, I decided to give it a whirl.
I find the state of the documentation and source quite confusing at
times (most of the time). This has led me to a partial understanding
of how semantic and wisent fit together.
My state at the moment:
o I have a Wisent-based grammar for PVS written.
It has basic lambda actions written for most important rules.
It has what I consider a reasonable set of conflicts (i.e., walking
the state transition table by-hand through the rules with conflicts
seems to do the right thing).
o I have tested semantic-flex to determine that I am getting the
correct token stream from the semantic (f)lexer (*). This stream
looks fine; all keywords are returned as such, punctuation is good,
o I have started to debug rules in a bottom-up fashion.
I set a single %start rule, "testrule". testrule refers to a single
'leaf' rule that I wish to test. Some of these leaf rules work
fine, when testing with semantic-show-unmatched-syntax-mode, some
partially work, and some do not work at all.
I believe my current problem must come down to the scanning (**)
process (conversion of semantic-flex results to a wisent-based
o I have defined multi-character <punctuation> tokens as nearly all
of the example wisent grammars do. E.g.,
%token <punctuation> COLON_EQ ":="
I presume that the wisent-flex is supposed to properly convert
semantic-flex single-character punctuation sequences into
multi-character punctuation tokens for wisent?
How can I test this? Perhaps something equivalent to
(setq lex-result (semantic-flex (point-min) (point-max) 0 500))
o If I edebug trace the behavior of bovinate, wisent-parse-region (!)
is called rather than wisent-parse-stream. Compiling my grammar
results in a *-install-parser function that uses
semantic-install-function-overrides to install a stream-based
parser, like all of the included wisent grammars.
Why is the compiler generating a stream-based parser? When I run
bovinate it tries to use the region-based parser (which is,
according to the docs, mutually exclusive), and thus the parser
falls back to the default wisent parser.
o Running bovinate always results in nil.
Here is an example of a leaf rule that works fine:
formulaname : AXIOM | CHALLENGE | CLAIM | CONJECTURE | COROLLARY
| FACT | FORMULA | LAW | LEMMA | OBLIGATION
| POSTULATE | PROPOSITION | SUBLEMMA | THEOREM ;
Note that consists entirely of keywords (symbols).
Here is a rule that partially works:
unaryop : NOT | TILDE | LBRACK_RBRACK | LT_GT | UMINUS ;
The "NOT" works, but the other options do not. TILDE is the
single-character punctuation '~', LBRACK_RBRACK is defined as the
double character punctuation "", but will (also?) be matched as a
semantic-list as I'm parsing with a scanner (!) depth of 0, LT_GT is
the double character punctuation "<>", and UMINUS is <punctuation>
"-*-" and is meant to be used as an illegal token sequence to provide
a placeholder for unary minus's precedence and nonassociativity.
Obviously, if I cannot get even the most simple rule to work, I have
no hope in debugging the full grammar.
The current grammar is available at
if someone has interest, expertise, and generosity.
Joseph R. Kiniry ID 78860581 ICQ 4344804
SOS Group, University of Nijmegen http://www.cs.kun.nl/~kiniry/
KindSoftware, LLC http://www.kindsoftware.com/
Board Chair: NICE http://www.eiffel-nice.org/
(*) One primary source of confusion is the hugely overloaded set of
terms used for these products. -lex- vs. -flex-, "semantic"
vs. "bovinate" etc.
(**) This term is used to avoid this whole lex vs. flex craziness.