Re: [cedet-semantic] using semantic/wisent as a parser

   In fact, you're missing a lexer definition, based on lexical rule
   analyzers auto-generated from the %type statements in your grammar.

For some reason I thought it would be totally auto-generated - as
in, no need to write anything extra. Oops!

   Also there is no need to override `semantic-parse-region' in
   simple-mode, because you use the default!

   I hacked your grammar & simple-mode support code a little bit, and
   the parser works well now with this example.simple file:

Thanks a bunch - it is indeed working just fine for me now.

If it seems to you (as it does to me) that these files might help
other people who want to see a bare-bones setup, you might
include them in the semantic distribution.

I have a few questions at this point.  The wisent-calc.wy doesn't
produce the same kind of tags as you suggested here:

   ;; For use with Semantic, must return valid semantic tags!
   expr
     : ;; empty
     | symbol
       (TAG "expr" 'expr :value $1)
     | symbol PLUS symbol
       (TAG "expr" 'expr :value (concat $1 $2 $3))
     ;

Rather, like many of the examples in the wisent documentation, it
produces things like `(cons $1 $2)' or `(+ $1 $3)'.

Since I want to "grow" this parser into an interpreter for a simple
programming language (Tiger), it seems like it would be nice to get
some usable Lisp code generated at this phase.  What do you suggest
for this?

Suppose that, following the pattern established in wisent-calc.wy, I
define

   expr
     : ;; empty
     | symbol
       (string-to-number $1)
     | symbol PLUS symbol
       (+ $1 $3)
     ;

Then I feed in the example input

   10 + 11

with bovinate and get `nil'.  (Naively, I would expect to get
something like `21' or `(+ 10 11)' instead.)
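
For what it's worth, here is how I would check the arithmetic outside
the parser.  This is only a sketch: it assumes that $1 and $3 arrive
as the matched text (strings), which I have not verified, and
`my-simple-add' is a made-up helper name, not part of wisent or
semantic.

```elisp
;; Hypothetical helper, not part of wisent or semantic.
;; Assumes the token values are strings such as "10" and "11".
(defun my-simple-add (lhs rhs)
  "Return the sum of the numbers named by the strings LHS and RHS."
  (+ (string-to-number lhs) (string-to-number rhs)))

(my-simple-add "10" "11")  ; => 21
```

If the values really are strings, a bare `(+ $1 $3)' would signal a
wrong-type-argument error in plain Emacs Lisp rather than return 21,
so the conversion would be needed in both alternatives.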

Even the kluge

   expr
     : ;; empty
     | symbol
       (concat $1)
     | symbol PLUS symbol
       (concat "(+ " $1 " " $3 " )")
     ;

doesn't work, though this seems to be very much like the code that
appears in wisent-java.wy.

Of course, I can write

   expr
     : ;; empty
     | symbol
       (TAG "expr" 'expr :value $1)
     | symbol PLUS symbol
       (TAG "expr" 'expr :value (concat "(+ " $1 " " $3 ")"))
     ;

and *this* will work with bovinate, but then I'd be curious to know
what the preferred method is for doing operations on the output of
bovinate.

My next question is this - even with the document-comment-start and
document-comment-end as set up here --

(defun semantic-default-simple-setup ()
  "Set up a buffer for semantic parsing of a SIMPLE language."
  ;; Install the parser
  (simple-wy--install-parser)
  ;; Setup the lexer
  (setq semantic-lex-analyzer 'simple-lexer
        ;; Do a full depth lexical analysis.
        semantic-lex-depth nil)
  ;; Other useful things.
  (setq document-comment-start "/*"
        document-comment-end   " */"))

and the apparent instruction to ignore comments as given here

(define-lex simple-lexer
  "Simple lexical analyzer."
  semantic-lex-ignore-whitespace
  semantic-lex-ignore-newline
  semantic-lex-ignore-comments
  ;;;; Auto-generated analyzers.
  simple-wy--<symbol>-regexp-analyzer
  simple-wy--<punctuation>-string-analyzer
  ;;;;
  semantic-lex-default-action)

parsing the following input

  /* this is a comment */

  10 + 11

  1000 + 1

produces this as output:

(("expr" expr
  (:value "this")
  nil #<overlay from 4 to 8 in example.simple>)
 ("expr" expr
  (:value "is")
  nil #<overlay from 9 to 11 in example.simple>)
 ("expr" expr
  (:value "a")
  nil #<overlay from 12 to 13 in example.simple>)
 ("expr" expr
  (:value "comment")
  nil #<overlay from 14 to 21 in example.simple>)
 ("expr" expr
  (:value "(+ 10 11)")
  nil #<overlay from 26 to 33 in example.simple>)
 ("expr" expr
  (:value "(+ 1000 1)")
  nil #<overlay from 35 to 43 in example.simple>))

How can I ensure that everything between /* and */ is ignored?  (Is
that space character in the definition of document-comment-end
relevant?)
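
One thing I might try next, sketched below, is giving the mode a
syntax table that marks /* ... */ as a comment.  This assumes that
`semantic-lex-ignore-comments' decides what a comment is from the
buffer's syntax table rather than from document-comment-start and
document-comment-end, which I have not confirmed.  The name
`my-simple-syntax-table' is hypothetical.

```elisp
;; Sketch only: `my-simple-syntax-table' is a hypothetical name.
(defvar my-simple-syntax-table
  (let ((table (make-syntax-table)))
    ;; "/" is punctuation; it may be the 1st char of a comment
    ;; starter and the 2nd char of a comment ender.
    (modify-syntax-entry ?/ ". 14" table)
    ;; "*" is punctuation; it may be the 2nd char of a comment
    ;; starter and the 1st char of a comment ender.
    (modify-syntax-entry ?* ". 23" table)
    table)
  "Syntax table in which /* ... */ delimits a comment.")

;; Quick check outside semantic: `forward-comment' returns non-nil
;; only if Emacs itself recognizes the text at point as a comment.
(with-temp-buffer
  (set-syntax-table my-simple-syntax-table)
  (insert "/* this is a comment */ 10 + 11")
  (goto-char (point-min))
  (forward-comment 1))
```

If that check succeeds, the table would presumably be installed in
the buffer (for example with `set-syntax-table' in the mode function)
before the lexer runs.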