>>>>> "Eric" == Eric M Ludlam <eric@...> writes:
>> 1) Is there a good reason to use wisent rather than bovine.
>>
>> Currently I decided to try wisent for the remarkably bad reason
>> that I've got a copy of the lex/yacc book, so I thought I'd go
>> for wisent.
Eric> The bovine grammar and compiler compiler was set up by me as
Eric> an experiment and proof of concept. It has many conveniences
Eric> for writing tagging grammars, but these make writing real
Eric> compilers or taggers for complex languages very difficult. The
Eric> algorithm is also very simple.
Eric> Wisent is a port of Bison. The compilers it makes are much
Eric> faster. You can also take a bison/yacc grammar, and run it
Eric> through the script in bison-wisent.el as a starting point for
Eric> a wisent grammar.
Okay, so for my purposes there is probably not a lot in it. I don't
have an existing bison grammar, and the language is not that
complex. But it may get more complex later, so perhaps wisent is the
best way.
>> 2) Is there a good simple case example that I can use to build
>> from?
>>
>> I am aware of both wisent-expr.el, and the port of the calculator
>> grammar. This is good, but this only demonstrates how the parser
>> works. At heart what I am interested in is knowing how to hook
>> semantic support up for all the other tools in emacs, in
>> particular speedbar, ECB, imenu and so on.
Eric> What you want to do is write a tagging grammar. You can use
Eric> wisent to make interpreters, syntax checkers, and all sorts of
Eric> things, but the specific implementation you seek is a tagging
Eric> grammar. I think wisent-awk.el is probably a good starting
Eric> point. If you need something really simple, the cogre
Eric> directory in cedet has a wisent-dot.wy grammar which is simple
Eric> also. It parses files in the dot language (used to make UML
Eric> diagrams and the like.)
Ah. Graphviz and me are old friends....
I'll have a good look at this. It looks very promising. I should have
looked through the other directories! I only checked the semantic
directory for simple cases.
>> Say for instance I have a language like so...
>>
Eric> This uses a trick where the contents of the { } shows up as a
Eric> single lexical token. Your grammar must then recurse into that
Eric> block and have a second parser start target for the contents
Eric> of your class.
Eric> You are best off looking in the wisent section of the
Eric> documentation for answers on this topic.
I played around briefly with the block analysers, but I just got
confused! I shall try the documentation again.
>> And this works fine. On lexing a buffer, I get back the token
>> "CLASS", whenever "class" appears. So the lexer appears to
>> understand the grammar.
>>
>> But I can't get this to work for parentheses. So I tried...
>>
>> %token <open-paren> LPAREN "("
>>
>> But I always get back an "open-paren" token on lexing, and never
>> LPAREN. For a while I even tried replacing this with
Eric> I think you may be using semantic-flex, or you did not build
Eric> your own lexical analyzer. Check the documentation on writing
Eric> and debugging lexers. Once you write your own lexer (which is
Eric> very easy.) you will get the names you expect.
I tried to build my own lexical analyser, and then print the
results out using a function hacked from wisent-java-lex-buffer. I
can't remember whether I made this use semantic-flex or not. I'll
check when I get home.
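
For reference, a minimal sketch of what a hand-built lexer along these
lines might look like. The names mylang-lexer and mylang-lex-blocks,
and the token names PAREN_BLOCK, LPAREN and so on, are hypothetical
and would need to match whatever the grammar declares. It assumes the
define-lex and define-lex-block-analyzer macros from Semantic's lexer
library; this is an illustration, not the lexer from either of our
setups.

```elisp
;; In the Emacs-bundled CEDET the library is `semantic/lex';
;; the standalone CEDET releases call it `semantic-lex'.
(require 'semantic/lex)

;; Hypothetical block analyzer: maps the raw delimiters to the token
;; names the grammar is assumed to declare (LPAREN, RPAREN, ...).
(define-lex-block-analyzer mylang-lex-blocks
  "Detect parenthesis and brace blocks, or their delimiters."
  (PAREN_BLOCK ("(" LPAREN) (")" RPAREN))
  (BRACE_BLOCK ("{" LBRACE) ("}" RBRACE)))

;; Hypothetical lexer for a `mylang-mode' buffer.  Analyzers are
;; tried in order; the default action must come last.
(define-lex mylang-lexer
  "Lexical analyzer for the hypothetical `mylang-mode'."
  semantic-lex-ignore-whitespace
  semantic-lex-ignore-newline
  semantic-lex-ignore-comments
  semantic-lex-number
  semantic-lex-symbol-or-keyword
  mylang-lex-blocks
  semantic-lex-punctuation
  semantic-lex-default-action)
```

To try it, one would set semantic-lex-analyzer to mylang-lexer in the
buffer and call (semantic-lex (point-min) (point-max) 1). The depth
argument controls whether a block comes back as a single block token
or as its individual delimiters.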
Eric> Wisent takes the raw lexical stream, and can make compound
Eric> lexical tokens such as ":=". I'm not really sure how it all
Eric> works. Hopefully David can give a better description.
Okay.
Eric> Do you have a major mode for your new language? Most of the
Eric> documentation assumes there is a major-mode pre-written for a
Eric> language. You need the named major mode to help semantic
Eric> identify and initialize buffers for this new parser. It needs
Eric> to have a valid syntax table that is accurate to your language
Eric> as well.
I do have a major mode. I've written a couple of (small) major modes
before, but I have never had to modify the syntax table. I've tried
this with my mode, and it mostly seems to work (forward-word and so
forth work as expected).
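
As an illustration of the kind of syntax table being discussed, here
is a minimal sketch. The mode name mylang-mode, the "#" comment
character, and the choice of delimiters are all assumptions made for
the example, not details of the actual mode.

```elisp
;; Hypothetical mode; the name and the comment/bracket conventions
;; are assumptions for illustration only.
(defvar mylang-mode-syntax-table
  (let ((table (make-syntax-table)))
    ;; "#" starts a comment that runs to the end of the line.
    (modify-syntax-entry ?#  "<" table)
    (modify-syntax-entry ?\n ">" table)
    ;; Paired delimiters, so that blocks can be matched up.
    (modify-syntax-entry ?\( "()" table)
    (modify-syntax-entry ?\) ")(" table)
    (modify-syntax-entry ?{  "(}" table)
    (modify-syntax-entry ?}  "){" table)
    ;; Treat "_" as a symbol constituent.
    (modify-syntax-entry ?_  "_" table)
    table)
  "Syntax table for the hypothetical `mylang-mode'.")

;; `define-derived-mode' picks up `mylang-mode-syntax-table'
;; automatically because of its name.
(define-derived-mode mylang-mode fundamental-mode "MyLang"
  "Major mode for a hypothetical small language."
  (set (make-local-variable 'comment-start) "# "))
```

With a table like this, commands such as forward-sexp and
forward-comment respect the language's delimiters and comments, which
is a quick way to check the entries by hand.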
Eric> When this is true, the rebuild process (starting with C-c C-c)
Eric> should reinitialize and reset all buffers of your major mode
Eric> so no additional work needs to be done.
Is it the "%languagemode" declaration which enables this?
Eric> Sometimes I too feel the need to reset the cache "just in
Eric> case." You can do this simply with "C-u M-x bovinate".
Okay, this is easier.
>> I have tried reading the documentation. I'm sure that some of
>> these have already been answered there, and I have missed them,
>> but it's a lot to take in all at once.
Eric> [ ... ]
Eric> There is a lot of good stuff in the existing manuals, but some
Eric> of it can be hard to find. Don't give up. ;)
Considering that semantic 2 is in its early stages, the documentation
is pretty good! The problem here is that I am trying to learn too many
new things at once, so I have too many variables to juggle. In
some senses it would be easier for me to do the individual bits
independently. So I have provided font-lock support in the standard
way (I don't think font-lock and semantic have been hooked up together
yet, right?), and I'm sure I could do the same for imenu. But semantic
seems the way to go for the future. So I shall stick at it till it
works!
Cheers
Phil