Hi, Sorry for the delay. Everyone seems to be quite busy these
days. ;)
The original semantic parser, the bovinator, does the lexical
analysis step "under the covers". You can use a few of the described
flags to mutate the lexer's behavior, but you don't need a fancy
lexical function.
Wisent is currently being developed alongside semantic until 1.4 is
released (as soon as I get a little more free time). For now, you
need to take the lexical output from the original semantic and
adjust it to be compatible with what wisent expects in terms of its
token consumption rules as described in the Bison manual.
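As a rough illustration of that adjustment, here is a minimal sketch in
plain Emacs Lisp. It assumes wisent wants each token in the shape
(TOKEN-CLASS VALUE START . END), as I read the node (wisent)What the
parser must receive; please check that node before relying on it. The
helper name `resregex-flex-to-wisent' is hypothetical and is not part
of semantic or wisent.

```elisp
;; Hypothetical helper, not part of semantic or wisent.
;; Assumption: wisent tokens look like (CLASS VALUE START . END).
(defun resregex-flex-to-wisent (str tokens)
  "Convert TOKENS, a `semantic-flex' style list for STR, to wisent style.
Each input token looks like (CLASS START . END), using 1-based
buffer positions.  Each output token adds the matched text as VALUE."
  (mapcar (lambda (tok)
            (let ((class (car tok))
                  (start (cadr tok))
                  (end (cddr tok)))
              ;; Buffer positions are 1-based; string indices are 0-based.
              (cons class
                    (cons (substring str (1- start) (1- end))
                          (cdr tok)))))
          tokens))
```

Called on the string "foo" with the token list '((symbol 1 . 4)), this
yields ((symbol "foo" 1 . 4)). It only adds the text of each token and
leaves the token classes and positions untouched.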
Anyway, your first task is certainly to perfect the syntax table for
regular expressions. I'm not sure what more complex sregex can do,
but a good syntax table is important if you run into lexical issues.
If you want to use the LL parser, you probably don't need to
do anything for the lexical stage except make sure you have a good
syntax table. This implies you need a mode for the buffer in which
you will put the regular expression. If fundamental mode has the
right syntax, you should be all set (as below).
The difference between bovinate and wisent parse is which of the
parsers you end up using. The bovinate routine will call
wisent-parser iff you have done the right setup for wisent (I'll let
David speak to that). Otherwise, it tries to use the bovinator, or
LL parser.
As for why you get `nil' when you run your program, my guess is that
when the bovine toplevel parser starts off, the parenthesis depth
passed to the lexer is 0, meaning it doesn't look into the groups. You
will need to locally bind `semantic-flex-depth' to get a nonzero value
into the lexer when it is called from the parser.
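A minimal sketch of that binding, reusing your own `resregex-bovinate'
and `resregex-setup'. It is untested and assumes semantic and the
generated resregex.el are loaded. The depth of 10 is only an
assumption, copied from the value you already pass to `semantic-flex'.

```elisp
;; Sketch only: needs semantic and the generated resregex.el loaded.
(defun resregex-bovinate (str)
  "Bovinate the STR with a nonzero lexer depth."
  (with-temp-buffer
    (insert str)
    (resregex-setup)
    (goto-char (point-min))
    ;; Assumption: 10 is an arbitrary depth, the same value given to
    ;; `semantic-flex' in `resregex-lex-string'.
    (let ((semantic-flex-depth 10))
      (semantic-bovinate-toplevel))))
```

If the binding has no effect, setting the variable inside
`resregex-setup' would be the other thing to try.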
Another issue is that you did not specify %start. You will need
to set that so the parser knows which rule to start with.
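For example, the settings at the top of your BNF file might become the
following. This is a sketch which assumes TopExpr is the rule you want
to start from; the other four lines are copied from your file.

```
%start TopExpr
%outputfile resregex.el
%parsetable resregex-table
%setupfunction resregex-setup
%languagemode fundamental-mode
```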
Lastly, your rules when started properly will generate something more
like this:
(resregex-bovinate "foo(bar)")
=> '(("foo") (group "bar"))
and the rule should look like this:
regexp
  : symbol
  | open-paren symbol close-paren
    (group $2)
  ;
Anyway, you may have some rule setup based on wisent instead of the
LL bovine parser which is confusing you. I am not yet fluent in
wisent parsing so I can't address those questions.
Good Luck
Eric
>>> Alex Schroeder <alex@...> seems to think that:
>Alex Schroeder <alex@...> writes:
>
>> Anybody have a bnf file for regexps? I'm trying to write a regexp to
>> sregex converter...
>
>Here I am again, trying to figure this lexer/parser stuff out. I'm
>sorry, I just do not understand the documentation (semantic, wisent,
>and bison manuals). Can I ask for some help? Perhaps the answers can
>be added to the manual. I volunteer to write that part, once I
>understand it myself. :)
>
>I am using the CVS stuff.
>
>I want to start out with this: A regexp is a regexp, or a group. A
>group has an open-paren, a regexp, and a close-paren. Sounds simple
>enough.
>
>Here is the stuff I start out with in my BNF file.
>
>%outputfile resregex.el
>%parsetable resregex-table
>%setupfunction resregex-setup
>%languagemode fundamental-mode
>
>TopExpr
> : regexp
>;
>
>regexp
> : symbol
> | open-paren symbol close-paren
> (list 'group $1)
> ;
>
>I thought I could do this without tokens, because the lexer already
>knows about the kinds of symbols I need, and I don't have any
>keywords. That is what I figured from the node (semantic)Settings.
>
>I then wrote a function to store the flexing output in a variable:
>
>(defun resregex-lex-string (str)
> "Lex the string STR for RESREGEX.
>This uses `semantic-flex', which see."
> (with-temp-buffer
> (insert str)
> (resregex-setup)
> (setq resregex-token-input
> (semantic-flex (point-min) (point-max) 10))))
>
>This seems to work fine for the lexing stage:
>
> (resregex-lex-string "foo")
> => '((symbol 1 . 4))
>
> (resregex-lex-string "foo(bar)")
> => '((symbol 1 . 4) (open-paren 4 . 5)
> (symbol 5 . 8) (close-paren 8 . 9)))))
>
>Now, in some old xpath code I had lying around where I used wisent,
>before, I transformed this stream of tokens into another stream as
>described in the node (wisent)What the parser must receive.
>
>This is one of the points where I do not follow. Why does the lexer
>not return the kind of structure the parser needs? When I read
>(semantic)Semantic Components, nothing indicates that the output of
>the lexer is not equivalent to the input to the parser.
>
>As to wisent, last time when I was writing the XPATH stuff, I used
>wisent-parse to do the job. I forgot why this was necessary. Why was
>it? What is the difference between calling bovinate and wisent-parse?
>Is this an explanation for the different stream formats?
>
>Ok, so now I want to parse a regexp using the BNF stuff I wrote. It
>seems to me that I need something similar to resregex-lex-string
>above, but now on the bovination layer. So this is what I tried:
>
>(defun resregex-bovinate (str)
> "Bovinate the STR."
> (with-temp-buffer
> (insert str)
> (resregex-setup)
> (goto-char (point-min))
> (semantic-bovinate-toplevel)))
>
>Based on the BNF above I hoped for the following output:
>
> (resregex-bovinate "foo")
> => '("foo")
>
> (resregex-bovinate "foo(bar)")
> => '("foo" (group "bar"))
>
>Or at least something similar. But I always get nil. Why? Reading
>(semantic)Compiling a language file with the bovinator, I was unable
>to figure out what I was supposed to do, and what to expect, actually.
>Is the result saved in some variable?
>
>The (semantic)Programming stuff seems not to be geared towards writers
>of parser, rather towards writers of utilities using parsers. Once I
>understand what I am doing, I could write a little example for the
>parser writers.
>
>Alex.
--
Eric Ludlam: zappo@..., eric@...
Home: http://www.ultranet.com/~zappo Siege: http://www.siege-engine.com
Emacs: http://cedet.sourceforge.net GNU: http://www.gnu.org