Re[2]: [CEDET-devel] wisent-c news
From: Eric M. L. <er...@si...> - 2003-07-07 04:05:06
Hi,

>>> David Ponce <dav...@wa...> seems to think that:
>Hi Eric,
>
> > This looks like some great stuff!
>
>Thanks!
>
> > I will have to upgrade my parser knowledge in order to participate
> > properly.  Do you think your use of abstract syntax trees will be
> > necessary in many grammars?  Better yet, would they simplify some
> > existing grammars like the java and python grammars?
>
>I used a sort of AST in the C grammar to deal with the high level of
>generality of C declarations that can appear both at "top level" and
>inside other declarations (like in function parameter lists).
>
>Using an AST makes it possible to keep useful notions in a convenient
>data structure, and simplifies access to this information at a higher
>level to produce semantic tags.
>
>Maybe using an AST could simplify other grammars too.  I don't know.
>I have to look more thoroughly at the grammar code to see if it is
>worth using an AST.
>
> > If so, is it possible to tie it up in a formalized way so less
> > wisent-ast programming is needed in the grammars?
>
>I am afraid I am not sure I understand what you would like to
>achieve here.  IMO wisent-ast programming is already very simple and
>mainly consists of these actions:
>
>- add/replace a node value.
>- merge two ASTs to obtain a new AST containing the union of nodes.
>- retrieve the value of nodes.
>
>Perhaps wisent-ast could become a more general semantic-ast library,
>as there is no wisent specific code there?  Even better, we could use
>grammar built-in macros to completely hide the implementation, like
>it is already done with tags.  Something like this:
>
>- AST-ADD
>- AST-PUT
>- AST-GET
>- AST-MERGE
>- etc.

This is indeed what I was thinking, and I share your need for
additional consideration on the topic.

> > I do not fully understand the specifics of how it is being utilized
> > yet, but it seems similar to what the .by C parser does to compound
> > lots of data into a tag that is formalized later in the post parser
> > expand function.
>
>One of the difficulties with the C grammar is that a declaration is a
>very high level thing that can correspond to a type declaration, a
>variable declaration, a function declaration (prototype) or
>definition.  Another difficulty is that some information needed to
>produce tags is obtained by deeply nested rules.
>
>Using an AST here makes it possible to store the useful information
>that will be used in a later step to decide what kind of declaration
>tag to produce.  That can be decided in a higher level rule's semantic
>action for simple cases, or at tag expansion for more complex
>declarations like this one:
>
>int i = 0, *const j = 1, fprintf(HANDLE hnd, char * msg, ...);
>
>The above declaration is in fact equivalent to three declarations:
>
>int i = 0;
>int *const j = 1;
>int fprintf(HANDLE hnd, char * msg, ...);
>
>In such cases `wisent-c-expand-tag' receives a variable tag whose
>name is a list of three ASTs, one for each compound item, each holding
>the sub-declaration part of that compound item.  For the above
>example the list of ASTs will be something like:
>
>(
> (:id ("i"))
> (:id ("j") :specifiers ("const") :pointer ("*"))
> (:id ("fprintf") :parms (the list of parameter tags))
> )
>
>In the end, that will be expanded to three tags:
>
>("i" variable :type "int" ...)
>("j" variable :type "int" ((typemodifiers const)) ...)
>("fprintf" function :type "int"
> :arguments (the list of parameter tags)
> ...)

Yes, this is what the .by grammar must do as well, so it is being used
for the same purpose as what I was using the "name" slot for.

[ ... ]

> > I would like to encourage you to check in your "experimental"
> > changes.  I'm sure your new parser will grow to completely
> > replace the old one in time.
> >
> > It might also be nice if the long comment describing the
> > contributing grammar were in a separate text file.  It might help
> > keep it pristine as changes shake the working grammar.
>
>Maybe it would be worth using a texinfo file?

Whatever you think can be best leveraged to help new users.  Perhaps
our doc directory should have a main index, like the Emacs "dir" file,
and it could contain a link to the C texinfo file.

> > Do you think the wisent directory is a bit cluttered yet?
> > Perhaps it is time to update the directory tree a bit:
> >
> > cedet/semantic/bovine
> >               /doc
> >               /parsers/c
> >                       /java
> >                       /python
> >                       /...
> >               /tests
> >               /wisent
> >
> > That might make the build process and load path a bit tricky, but
> > having test files next to the parser would make things neater.  We
> > could put both the .by and .wy grammars in the same directory.
>
>That's a nice idea!  I would prefer to call the "parsers" directory
>"languages", or something like that, to avoid confusion between true
>parsers and support stuff needed to parse a particular language.
>
>What do you think of this directory organization?
>
>cedet/semantic
>              /core (?)
>              /doc
>              /languages
>                        /grammars (?)
>                        /C
>                        /java
>                        /python
>                        /...
>              /parsers
>                      /bovine
>                      /wisent
>              /util (?)
>
>The "main" directory would contain all the core stuff.
>
>Perhaps it would make sense to have a "core" subdirectory for core and
>db libraries, and a "util" subdirectory for applications (completion,
>intellisense, etc.).  The main directory would only contain project
>files like NEWS, INSTALL, README, ChangeLog, Makefile, project.ede,
>semantic-load.el, semantic-al.el, etc.

In the existing project files, I have a "semantic" and "tools" target
as examples of some groups.  I often see this for a package called
NAME:

NAME/src
NAME/othersubdir

so perhaps src would be more logical than core?

>The "parsers" subdirectory would contain the parser engines and
>associated support libraries.  Common parser stuff would be directly
>in that directory.  Each specific parser would be in its own
>subdirectory.

This sounds ok to me.  As there are only 2 parsers, I don't mind them
at the top level either.

>The "languages" subdirectory would contain language support stuff.
>Common things would be directly in that directory, and all language
>specific stuff (including tests) would be in its own subdirectory.
>The grammar framework could be in a "grammars" subdirectory or
>directly under the "languages" directory.

Languages is indeed a better word to use.  The Emacs parallel is
"modes", which doesn't really fit our model very well. ;)

Thanks!
Eric

-- 
Eric Ludlam: za...@gn..., er...@si...
Home:  http://www.ludlam.net
Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net
GNU:   www.gnu.org
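
The AST actions David lists (add, replace, retrieve, merge) and the expansion of one compound C declaration into several tags can be modelled in a few lines. The sketch below is written in Python purely for illustration; it is not the wisent-ast code, and every name in it (`ast_add`, `ast_put`, `ast_get`, `ast_merge`, `expand_compound`) is hypothetical. It assumes an AST is a mapping from node name to a list of values, and that merging concatenates the values of nodes present in both ASTs, which is one possible reading of "the union of nodes".

```python
# Illustrative model only; NOT the wisent-ast implementation.
# An AST is a dict: node name -> list of values, mirroring the
# (:id ("j") :specifiers ("const") ...) lists shown in the message.

def _copy(ast):
    """Shallow-copy an AST so the operations below never mutate input."""
    return {node: list(values) for node, values in ast.items()}

def ast_add(ast, node, value):
    """Return a new AST with `value` appended to `node`."""
    new = _copy(ast)
    new.setdefault(node, []).append(value)
    return new

def ast_put(ast, node, values):
    """Return a new AST with the values of `node` replaced."""
    new = _copy(ast)
    new[node] = list(values)
    return new

def ast_get(ast, node):
    """Return the values of `node`, or an empty list if absent."""
    return list(ast.get(node, []))

def ast_merge(a, b):
    """Return a new AST holding the nodes of both `a` and `b`.

    Assumption: values of a node present in both are concatenated.
    """
    merged = _copy(a)
    for node, values in b.items():
        merged.setdefault(node, []).extend(values)
    return merged

def expand_compound(base_type, asts):
    """Produce one tag (a dict here) per AST of a compound declaration.

    An AST with a "parms" node becomes a function tag; anything else
    becomes a variable tag. Other nodes, such as "pointer", are ignored
    because the message elides how they end up in the final tag.
    """
    tags = []
    for ast in asts:
        name = ast_get(ast, "id")[0]
        if "parms" in ast:
            tags.append({"name": name, "class": "function",
                         "type": base_type,
                         "arguments": ast_get(ast, "parms")})
        else:
            tag = {"name": name, "class": "variable", "type": base_type}
            if ast_get(ast, "specifiers"):
                tag["typemodifiers"] = ast_get(ast, "specifiers")
            tags.append(tag)
    return tags

# int i = 0, *const j = 1, fprintf(HANDLE hnd, char * msg, ...);
asts = [
    {"id": ["i"]},
    ast_merge({"id": ["j"]}, {"specifiers": ["const"], "pointer": ["*"]}),
    {"id": ["fprintf"], "parms": ["hnd", "msg", "..."]},
]
tags = expand_compound("int", asts)
for tag in tags:
    print(tag)
```

Keeping every operation non-destructive means a nested rule can hand its AST to an enclosing rule without the two sharing state, at the cost of some copying.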