Re[1]: [cedet-semantic] a little confused...
From: Eric M. L. <er...@si...> - 2003-11-10 14:10:15
Hi Phillip,

David may be best able to answer your questions about wisent, but
I'll try to answer some anyway.

>>> Phillip Lord <p....@ru...> seems to think that:
>
>Dear all
>
>I'm attempting to provide semantic support for a new language
>mode. I'm a little handicapped here as I've not written a grammar like
>this before. So I was wondering a few things. Please excuse my
>ignorance, if any of these questions are daft...

We do need to spend time on the developers documentation to answer
questions like this.

>1) Is there a good reason to use wisent rather than bovine.
>
>Currently I decided to try wisent for the remarkably bad reason that
>I've got a copy of the lex/yacc book, so I thought I'd go for
>wisent.

The bovine grammar and compiler compiler was set up by me as an
experiment and proof of concept. It has many conveniences for writing
tagging grammars, but those same conveniences make writing real
compilers or taggers for complex languages very difficult. The
algorithm is also very simple.

Wisent is a port of Bison. The compilers it makes are much faster.
You can also take a bison/yacc grammar, and run it through the script
in bison-wisent.el as a starting point for a wisent grammar.

>2) Is there a good simple case example that I can use to build from?
>
>I am aware of both wisent-expr.el, and the port of the calculator
>grammar. This is good, but this only demonstrates how the parser
>works. At heart what I am interested in is knowing how to hook
>semantic support up for all the other tools in emacs, in particular
>speedbar, ECB, imenu and so on.

What you want to do is write a tagging grammar. You can use wisent to
make interpreters, syntax checkers, and all sorts of things, but the
specific implementation you seek is a tagging grammar.

I think wisent-awk.el is probably a good starting point. If you need
something really simple, the cogre directory in cedet has a
wisent-dot.wy grammar which is also simple.
It parses files in the dot language (used to make UML diagrams and the
like).

>Say for instance I have a language like so...
>
>class ExampleClass
>{
>   method methodName1();
>   method methodName2();
>   method methodName3();
>}
>
>What would I have to do to get this to parse, and have the methods,
>and class names appear in ECB?

This uses a trick where the contents of the { } show up as a single
lexical token. Your grammar must then recurse into that block and have
a second parser start target for the contents of your class.

You are best off looking in the wisent section of the documentation
for answers on this topic.

>3) I'm a little confused as to how the lexer and the grammar
>   interact.
>
>I've put into my grammar a line like so...
>
>%token CLASS "class"
>
>
>And this works fine. On lexing a buffer, I get back the token "CLASS",
>whenever "class" appears. So the lexer appears to understand the
>grammar.
>
>But I can't get this to work for parenthesis. So I tried...
>
>%token <open-paren> LPAREN "("
>
>But I always get back an "open-paren" token on lexing, and never
>LPAREN. For a while I even tried replacing this with

I think you may be using semantic-flex, or you did not build your own
lexical analyzer. Check the documentation on writing and debugging
lexers. Once you write your own lexer (which is very easy), you will
get the names you expect.

>%token "LPAREN" "LP"
>
>which actually works fine, although
>
>class LP
>   method LP methodName RP
>RP
>
>looks ugly compared to...
>
>class(
>   method(methodName)
>)

This is because the %token entries for converting a string
representing a symbol into a symbol only work for strings representing
symbols in the default lexical analyzer.

Wisent takes the raw lexical stream, and can make compound lexical
tokens such as ":=". I'm not really sure how it all works. Hopefully
David can give a better description.

>4) Finally what steps do I have to take in between rebuilding the
>   grammar and testing it?
>
>I seemed to be getting inconsistencies between what the grammar was
>supposed to do, and what it was appearing to do. I think that stuff is
>getting cached somewhere.
>
>Currently when my grammar changes I...
>
>a) regenerate it with C-cC-c. This seems to re-evaluate the generated
>   lisp.
>
>b) Call "normal-mode" in my language buffer, to re-init all local
>   variables.
>
>c) always call "semantic-clear-top-level-cache" before I lex, or parse
>   a buffer.
>
>Are all these steps necessary? Do I need more?

Do you have a major mode for your new language? Most of the
documentation assumes there is a major mode pre-written for a
language. You need the named major mode to help semantic identify and
initialize buffers for this new parser. It also needs to have a valid
syntax table that is accurate to your language.

When this is true, the rebuild process (starting with C-c C-c) should
reinitialize and reset all buffers of your major mode, so no
additional work needs to be done.

Sometimes I too feel the need to reset the cache "just in case." You
can do this simply with "C-u M-x bovinate".

>I have tried reading the documentation. I'm sure that some of these
>have already been answered there, and I have missed them, but it's a
>lot to take in all at once.
[ ... ]

There is a lot of good stuff in the existing manuals, but some of it
can be hard to find. Don't give up. ;)

Eric

--
Eric Ludlam: za...@gn..., er...@si...
Home: http://www.ludlam.net
Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net
GNU: www.gnu.org