Re[1]: [cedet-semantic] a little confused...

Hi Phillip,

  David may be best able to answer your questions about wisent, but
I'll try to answer some anyway.

>>> Phillip Lord <p....@ru...> seems to think that:
>
>Dear all
>
>I'm attempting to provide semantic support for a new language
>mode. I'm a little handicapped here as I've not written a grammar like
>this before. So I was wondering a few things. Please excuse my
>ignorance, if any of these questions are daft...

We do need to spend time on the developer documentation to answer
questions like this.

>1) Is there a good reason to use wisent rather than bovine. 
>
>Currently I decided to try wisent for the remarkably bad reason that
>I've got a copy of the lex/yacc book, so I thought I'd go for
>wisent. 

The bovine grammar and compiler-compiler were set up by me as an
experiment and proof of concept.  It has many conveniences for
writing tagging grammars, but those same conveniences make writing
real compilers or taggers for complex languages very difficult.  The
algorithm is also very simple.

Wisent is a port of Bison.  The compilers it makes are much faster.
You can also take a bison/yacc grammar, and run it through the script
in bison-wisent.el as a starting point for a wisent grammar.

>2) Is there a good simple case example that I can use to build from? 
>
>I am aware of both wisent-expr.el, and the port of the calculator
>grammar. This is good, but this only demonstrates how the parser
>works. At heart what I am interested in is knowing how to hook
>semantic support up for all the other tools in emacs, in particular
>speedbar, ECB, imenu and so on. 

What you want to do is write a tagging grammar.  You can use wisent
to make interpreters, syntax checkers, and all sorts of things, but
the specific implementation you seek is a tagging grammar.  I think
wisent-awk.el is probably a good starting point.  If you need
something really simple, the cogre directory in cedet has a
wisent-dot.wy grammar which is also simple.  It parses files in the
dot language (used to make UML diagrams and the like).

>Say for instance I have a language like so...
>
>class ExampleClass
>{
>        method methodName1();
>        method methodName2();
>        method methodName3();
>}
>
>What would I have to do to get this to parse, and have the methods
>and class names appear in ECB?

This uses a trick where the contents of the { } show up as a single
lexical token.  Your grammar must then recurse into that block and
have a second parser start target for the contents of your class.

You are best off looking in the wisent section of the documentation
for answers on this topic.
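
To make the block trick described above concrete, here is a toy model
in Python.  It is not Semantic or wisent code, and every name in it
(lex, parse_top, parse_members, the tag dictionaries) is made up for
illustration.  It only shows the shape of the idea: the lexer folds
{ ... } into one token, and the parser is re-entered on that token with
a second start target.

```python
# Toy model of the "block as a single token" trick. Illustration only:
# nothing here is Semantic's or wisent's real API.

SOURCE = """
class ExampleClass
{
        method methodName1();
        method methodName2();
        method methodName3();
}
"""

def lex(text):
    """Split text into tokens, folding each balanced { ... } into one BLOCK token."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == "{":
            depth, j = 1, i + 1
            while j < len(text) and depth:
                depth += {"{": 1, "}": -1}.get(text[j], 0)
                j += 1
            if depth:
                raise SyntaxError("unbalanced '{'")
            # The outer parser sees one token; the body is kept as raw text.
            tokens.append(("BLOCK", text[i + 1:j - 1]))
            i = j
        elif ch.isalnum() or ch == "_":
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            tokens.append(("WORD", text[i:j]))
            i = j
        else:
            tokens.append(("PUNCT", ch))
            i += 1
    return tokens

def parse_members(body):
    """Second start target: parse the inside of a class block."""
    toks, members, i = lex(body), [], 0
    while i < len(toks):
        chunk = toks[i:i + 5]
        texts = [t for _, t in chunk]
        # Rule: "method" NAME "(" ")" ";"
        if (len(chunk) == 5 and texts[0] == "method"
                and chunk[1][0] == "WORD" and texts[2:] == ["(", ")", ";"]):
            members.append({"name": texts[1], "class": "function"})
            i += 5
        else:
            raise SyntaxError("unexpected token in class body: %r" % (toks[i],))
    return members

def parse_top(text):
    """First start target: parse top-level class declarations."""
    toks, tags, i = lex(text), [], 0
    while i < len(toks):
        chunk = toks[i:i + 3]
        # Rule: "class" NAME BLOCK
        if (len(chunk) == 3 and chunk[0] == ("WORD", "class")
                and chunk[1][0] == "WORD" and chunk[2][0] == "BLOCK"):
            # Recurse into the block with the second start target.
            tags.append({"name": chunk[1][1], "class": "type",
                         "members": parse_members(chunk[2][1])})
            i += 3
        else:
            raise SyntaxError("unexpected token at top level: %r" % (toks[i],))
    return tags

if __name__ == "__main__":
    for tag in parse_top(SOURCE):
        print(tag["name"], [m["name"] for m in tag["members"]])
```

In a real tagging grammar the two parse functions correspond to two
start nonterminals, and the nested tag list is what tools such as ECB,
speedbar and imenu would display.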

>3) I'm a little confused as to how the lexer and the grammar
>   interact. 
>
>I've put into my grammar a line like so...
>
>%token CLASS    "class"
>
>
>And this works fine. On lexing a buffer, I get back the token "CLASS",
>whenever "class" appears. So the lexer appears to understand the
>grammar. 
>
>But I can't get this to work for parentheses. So I tried...
>
>%token <open-paren> LPAREN "("
>
>But I always get back an "open-paren" token on lexing, and never
>LPAREN. For a while I even tried replacing this with

I think you may be using semantic-flex, or you did not build your own
lexical analyzer.  Check the documentation on writing and debugging
lexers.  Once you write your own lexer (which is very easy), you will
get the names you expect.

>%token "LPAREN" "LP"
>
>which actually works fine, although
>
>class LP
>  method LP methodName RP
>RP
>
>looks ugly compared to...
>
>class(
>   method(methodName)
>)

This is because, in the default lexical analyzer, the %token entries
that turn a string into a named token only work for strings that lex
as symbols.  "LP" is a symbol; "(" is not.
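
The difference can be sketched with another toy model in Python.
Again, this is not Semantic's lexer; the tables and function names are
invented for illustration, and the assumption is only what is stated
above: the default analyzer consults the keyword table for symbol-like
text, while a lexer written for the language can also name its
parentheses.

```python
import re

# Toy model only; these tables and functions are hypothetical.
# Entries as they might come from lines like:  %token CLASS "class"
KEYWORDS = {"class": "CLASS", "LP": "LPAREN", "RP": "RPAREN"}
# Entries as they might come from:  %token <open-paren> LPAREN "("
PARENS = {"(": "LPAREN", ")": "RPAREN"}

def split(text):
    """Break text into symbol-like words and single parentheses."""
    return re.findall(r"\w+|[()]", text)

def default_lex(text):
    """Only symbol-like words are renamed; parens keep their generic class."""
    out = []
    for word in split(text):
        if word == "(":
            out.append("open-paren")
        elif word == ")":
            out.append("close-paren")
        else:
            out.append(KEYWORDS.get(word, "symbol"))
    return out

def custom_lex(text):
    """A language-specific lexer that also consults the paren table."""
    out = []
    for word in split(text):
        if word in PARENS:
            out.append(PARENS[word])
        else:
            out.append(KEYWORDS.get(word, "symbol"))
    return out

if __name__ == "__main__":
    print(default_lex("class("))    # the paren stays generic
    print(default_lex("class LP"))  # "LP" is a symbol, so it is renamed
    print(custom_lex("class("))     # the custom lexer names the paren
```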

Wisent takes the raw lexical stream, and can make compound lexical
tokens such as ":=".  I'm not really sure how it all works.  Hopefully
David can give a better description.

>4) Finally what steps do I have to take in between rebuilding the
>   grammar and testing it?
>
>I seemed to be getting inconsistencies between what the grammar was
>supposed to do, and what it was appearing to do. I think that stuff is
>getting cached somewhere. 
>
>Currently when my grammar changes I...
>
>a) regenerate it with C-cC-c. This seems to re-evaluate the generated
>   lisp.
>
>b) Call "normal-mode" in my language buffer, to re-init all local
>   variables. 
>
>c) always call "semantic-clear-top-level-cache" before I lex, or parse
>   a buffer.
>
>Are all these steps necessary. Do I need more?

Do you have a major mode for your new language?  Most of the
documentation assumes there is a major-mode pre-written for a
language.  You need the named major mode to help semantic identify
and initialize buffers for this new parser.  It needs to have a valid
syntax table that is accurate to your language as well.

When this is true, the rebuild process (starting with C-c C-c) should
reinitialize and reset all buffers of your major mode so no
additional work needs to be done.

Sometimes I too feel the need to reset the cache "just in case."  You
can do this simply with "C-u M-x bovinate".

>I have tried reading the documentation. I'm sure that some of these
>have already been answered there, and I have missed them, but it's a
>lot to take in all at once. 
  [ ... ]

There is a lot of good stuff in the existing manuals, but some of it
can be hard to find.  Don't give up. ;)

Eric

-- 
          Eric Ludlam:                 za...@gn..., er...@si...
   Home: http://www.ludlam.net            Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net               GNU: www.gnu.org