Thread: Re: [CEDET-devel] semantic-bovinate-toplevel & lexical analysis
From: David P. <Dav...@wa...> - 2002-08-02 07:20:04
|
Hi Eric,

> Based on this discussion, I checked in some changes to
> semantic.  Specifically, I used `define-override' to create
> `semantic-bovinate-region' which is loosely based on
> `semantic-edits-bovinate-region'.  It does not take a table, lex
> stream nor anything else.  Just start, end, and a parser symbol.  I
> found that it didn't need a table.
[...]

I had a look at your last changes that introduce the new overload
function `semantic-bovinate-region'.

It appears that we now have the following parse functions defined:

 - semantic-bovinate-nonterminals
   (stream nonterm &optional depth returnonerror)

 - semantic-bovinate-region
   (start end &optional reparse-symbol)

 - semantic-bovinate-from-nonterminal-full
   (start end nonterm &optional depth)

 - semantic-bovinate-region-until-error
   (start end nonterm &optional depth)

IMO, it is no longer necessary to have all of the above functions
defined.  What I propose:

 1. Remove `semantic-bovinate-from-nonterminal-full' and
    `semantic-bovinate-region-until-error'.

 2. Merge `semantic-bovinate-nonterminals' and
    `semantic-bovinate-region'.

So the `semantic-bovinate-region' signature would become:

 - semantic-bovinate-region
   (start end &optional reparse-symbol depth returnonerror)

In `semantic-bovinate-region-default' we would add the code from
`semantic-bovinate-nonterminals'.

The BNF and WY generators should be updated to generate calls to
`semantic-bovinate-region' with the required parameters.

An advantage of that is simplification of the code and, more
importantly, that the bovination mechanism implemented in
`semantic-bovinate-nonterminals' would be overridable :-)

So we would have a full parser plug-in architecture:

 - the parser
 - the incremental parser
 - the "bovinator"

What do you think?

> David, you will need to create a `wisent-bovinate-region' or
> similarly named function which your wisent setup code can install
> via `semantic-install-function-overrides'.  Then again, perhaps the
> default will work for you.
[...]

I think it should :-)

David
|
From: David P. <Dav...@wa...> - 2002-08-05 13:02:13
|
Eric,

[...]
>Perhaps a different name would help, such as:
>
>     semantic-bovinate-nonterminals-iterativly-from-stream
[...]

What do you think of using 'parse' instead of 'bovinate', like we
already use 'lex' for lexical analysis functions?

We could have the following parse API:

- semantic-bovinate-nonterminal        -> semantic-parse
- semantic-bovinate-nonterminals       -> semantic-parse-all
- semantic-bovinate-incremental-parser -> semantic-parse-changes
- semantic-bovinate-region             -> semantic-parse-region

The corresponding override symbols would be respectively:

- parse
- parse-all
- parse-changes
- parse-region

David
|
From: Eric M. L. <er...@si...> - 2002-08-05 21:09:53
|
>>> "David PONCE" <Dav...@wa...> seems to think that:
>Eric,
>
>[...]
>>Perhaps a different name would help, such as:
>>
>>     semantic-bovinate-nonterminals-iterativly-from-stream
>[...]
>
>What do you think of using 'parse' instead of 'bovinate' like we
>already use 'lex' for lexical analysis functions?
>
>We could have the following parse API:
>
>- semantic-bovinate-nonterminal        -> semantic-parse
>- semantic-bovinate-nonterminals       -> semantic-parse-all
>- semantic-bovinate-incremental-parser -> semantic-parse-changes
>- semantic-bovinate-region             -> semantic-parse-region
>
>The corresponding override symbols would be respectively:
>
>- parse
>- parse-all
>- parse-changes
>- parse-region
  [ ... ]

That's a good idea.  We had discussed (briefly) before about
"bovinate" being synonymous with "parse" and "bovinator" being the LL
parser, but this cleans that up nicely.

Existing language authors should only be affected by not regenerating
their language files from the bnf, and that should be ok.

I'll do that next, though it won't be for a couple days yet as I have
some catapulting to do (My team is a bit behind at the moment.)

Eric

--
Eric Ludlam: za...@gn..., er...@si...
Home: www.ultranet.com/~zappo    Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net    GNU: www.gnu.org
|
From: David P. <Dav...@wa...> - 2002-08-06 10:03:58
Attachments:
semantic.patch
|
Hi Eric,

[...]
>>What do you think of using 'parse' instead of 'bovinate' like we
>>already use 'lex' for lexical analysis functions?
[...]
>
>That's a good idea.  We had discussed (briefly) before about
>"bovinate" being synonymous with "parse" and "bovinator" being the LL
>parser, but this cleans that up nicely.
>
>Existing language authors should only be affected by not regenerating
>their language files from the bnf, and that should be ok.
>
>I'll do that next, though it won't be for a couple days yet as I have
>some catapulting to do (My team is a bit behind at the moment.)

I did the work :-)

Attached you will find a global patch that implements the new parse
API.  The change log is at the end.

The parse API is now completely defined in semantic.el, and default
implementations are in semantic.el (`semantic-parse-region-default',
which is independent of the core parser), semantic-bovine.el (the LL
parser) and semantic-edit.el (the incremental parser).

To keep things clean and function names consistent in
semantic-bovine.el and semantic-edit.el, I didn't remove the
functions `semantic-bovinate-nonterminal' and
`semantic-edits-incremental-parser', but made
`semantic-parse-default' and `semantic-parse-changes-default'
aliases of them, respectively.

I also added new autoloads for the default functions in
semantic-load.el.

In the end, the changes were limited :-)

What do you think?

David

--------- change log:

* semantic-bovine.el:
(semantic-bovinate-region-default): Removed.
(semantic-bovinate-nonterminal-default): Renamed to...
(semantic-bovinate-nonterminal): New.
(semantic-parse-default): Alias of it.

* semantic-edit.el:
(semantic-bovinate-incremental-parser): Removed.
(semantic-bovinate-incremental-parser-default): Renamed to...
(semantic-edits-incremental-parser): New.
(semantic-parse-changes-default): Alias of it.
(semantic-rebovinate-token): Use `semantic-parse'.

* semantic-load.el:
(semantic-parse-default, semantic-parse-changes-default)
(semantic-parse-region-default): Added autoloads.

* semantic.el:
(semantic-bovinate-parser-name): Renamed to...
(semantic-parser-name): New.
(semantic-bovination-working-message): Use it.
(semantic-parse-region): Replace `semantic-bovinate-region'.
(semantic-parse): Replace `semantic-bovinate-nonterminal'.
(semantic-parse-changes): Replace
`semantic-bovinate-incremental-parser'.
(semantic-parse-region-default): Replace
`semantic-bovinate-region-default'.
(semantic-parse-all): Replace `semantic-bovinate-nonterminals'.
(semantic-bovinate-toplevel, semantic-bovinate-region-until-error)
(semantic-bovinate-from-nonterminal)
(semantic-bovinate-from-nonterminal-full): Use new parse API.
|
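The aliasing approach described in the message above (keep the existing function name, expose it under the new API name) can be sketched with `defalias'. This is only an illustration: the `my-' names and the (REST . TOKEN) return convention are hypothetical and are not taken from the attached patch.

```elisp
;; Hypothetical stand-in for the core parser; NOT the real semantic function.
(defun my-bovinate-nonterminal (stream)
  "Consume one element of STREAM and return (REST . TOKEN)."
  (cons (cdr stream) (car stream)))

;; Expose the same function under the name the new parse API would look up.
;; Both names now call the same code, so existing callers keep working.
(defalias 'my-parse-default 'my-bovinate-nonterminal)
```

With this in place, calling `my-parse-default' and `my-bovinate-nonterminal' on the same stream gives the same result, which is the property the patch relies on.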
From: Eric M. L. <er...@si...> - 2002-08-06 13:20:48
|
>>> "David PONCE" <Dav...@wa...> seems to think that:
>Hi Eric,
>
>[...]
>>>What do you think of using 'parse' instead of 'bovinate' like we
>>>already use 'lex' for lexical analysis functions?
>[...]
>>
>>That's a good idea.  We had discussed (briefly) before about
>>"bovinate" being synonymous with "parse" and "bovinator" being the LL
>>parser, but this cleans that up nicely.
>>
>>Existing language authors should only be affected by not regenerating
>>their language files from the bnf, and that should be ok.
>>
>>I'll do that next, though it won't be for a couple days yet as I have
>>some catapulting to do (My team is a bit behind at the moment.)
>
>I did the work :-)
>
>Attached you will find a global patch that implements the new parse
>API.  The change log is at the end.
>
  [ ... ]

That's great!  Thanks.

>
>What do you think?
>
>David
>
>--------- change log:
  [ ... ]
>* semantic.el:
>
  [ ... ]
>(semantic-parse-all): Replace `semantic-bovinate-nonterminals'.
  [ ... ]

When I was reading the patch, this is the only name that stuck out to
me.  It looks ok in the change log, but isn't really clear to me in
the code.  Not that `semantic-bovinate-nonterminals' or even the
EXPANDFULL macro are any better, really.  Since we are making an
effort to fix names, I would be up for a longer and better name.

I keep thinking `parse-all-iterativly', or `iterative-parse', but
they both sound lame to me.  It would be nice if the name could
convey that it loops over the core parser till there is no more input.
Maybe `parse-loop'.  Ugh.

Otherwise, everything looks great!

Thoughts?
Eric

--
Eric Ludlam: za...@gn..., er...@si...
Home: www.ultranet.com/~zappo    Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net    GNU: www.gnu.org
|
From: David P. <Dav...@wa...> - 2002-08-06 14:58:00
|
>>(semantic-parse-all): Replace `semantic-bovinate-nonterminals'.
>
>   [ ... ]
>
> When I was reading the patch, this is the only name that stuck out to
> me.  It looks ok in the change log, but isn't really clear to me in
> the code.  Not that `semantic-bovinate-nonterminals' or even the
> EXPANDFULL macro are any better, really.  Since we are making an
> effort to fix names, I would be up for a longer and better name.
>
> I keep thinking `parse-all-iterativly', or `iterative-parse', but
> they both sound lame to me.  It would be nice if the name could
> convey that it loops over the core parser till there is no more input.
> Maybe `parse-loop'.  Ugh.

What do you think of `semantic-repeat-parse-overall'?

If you agree with that name I will make the change.
And check changes in?
|
From: Eric M. L. <er...@si...> - 2002-08-07 01:47:34
|
>>> "David PONCE" <Dav...@wa...> seems to think that:
>>>(semantic-parse-all): Replace `semantic-bovinate-nonterminals'.
>>
>>   [ ... ]
>>
>> When I was reading the patch, this is the only name that stuck out to
>> me.  It looks ok in the change log, but isn't really clear to me in
>> the code.  Not that `semantic-bovinate-nonterminals' or even the
>> EXPANDFULL macro are any better, really.  Since we are making an
>> effort to fix names, I would be up for a longer and better name.
>>
>> I keep thinking `parse-all-iterativly', or `iterative-parse', but
>> they both sound lame to me.  It would be nice if the name could
>> convey that it loops over the core parser till there is no more input.
>> Maybe `parse-loop'.  Ugh.
>
>What do you think of `semantic-repeat-parse-overall'?
>
>If you agree with that name I will make the change.
>And check changes in?
  [ ... ]

Hmmm.  I like that direction.  It makes me think
`semantic-parse-stream-until-empty', `semantic-parse-stream-repeat',
or `semantic-repeat-parse-entire-stream'.  Of course, we would then
need to change `semantic-parse' to `semantic-parse-stream', which
makes sense since `semantic-parse-region' has a target for the verb
`parse', so having `semantic-parse-stream' means that we are parsing
some other thing.  Thus `stream' means something that is not a
buffer that has been processed.  Even a regex parser can use a
`stream' to represent a position moving through a buffer.

What do you think?

Sorry about being picky, the old names confused me on occasion, and
it would be nice to have good clear names this time around.

Eric

--
Eric Ludlam: za...@gn..., er...@si...
Home: www.ultranet.com/~zappo    Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net    GNU: www.gnu.org
|
From: David P. <Dav...@wa...> - 2002-08-07 07:44:34
|
Hi Eric,

>>What do you think of `semantic-repeat-parse-overall'?
>>
>>If you agree with that name I will make the change.
>>And check changes in?
>
>   [ ... ]
>
> Hmmm.  I like that direction.  It makes me think
> `semantic-parse-stream-until-empty', `semantic-parse-stream-repeat',
> or `semantic-repeat-parse-entire-stream'.  Of course, we would then
> need to change `semantic-parse' to `semantic-parse-stream', which
> makes sense since `semantic-parse-region' has a target for the verb
> `parse', so having `semantic-parse-stream' means that we are parsing
> some other thing.  Thus `stream' means something that is not a
> buffer that has been processed.  Even a regex parser can use a
> `stream' to represent a position moving through a buffer.
>
> What do you think?

I like your idea of `semantic-parse-stream'!  What do you think of
`semantic-repeat-parse-whole-stream', which looks like a more
Emacs-friendly name (compare `mark-whole-buffer')?

Anyway, `semantic-repeat-parse-entire-stream' is good too.  I'll let
you decide ;-)

> Sorry about being picky, the old names confused me on occasion, and
> it would be nice to have good clear names this time around.

There is no hurry, and it is a good moment to clarify APIs ;-)

To continue on this subject, I think it would improve the design if we
dissociated the management of parse tree (token cache) state from the
parsers.  Parse trees are clearly independent of which parser produces
them.  Also, I think there are now too many buffer-local flags that can
indicate a parse tree state:

`semantic-edits-need-reparse'
`semantic-dirty-tokens'
`semantic-toplevel-bovine-cache-check'
`semantic-toplevel-bovine-force-reparse'

What I propose is a general-purpose, simple, and efficient API that
the semantic core and parsers can share to get/set the parse tree
state.  IMO, there are actually three useful states that need to be
managed:

- The parse tree is invalid and must be completely rebuilt.
- The parse tree is out of date and can be updated.
- The parse tree is up to date.

Following is a first implementation.  What do you think?

David

;;; Parse tree state management API
;;
(defvar semantic-parse-tree-state nil
  "State of the current parse tree.")
(make-variable-buffer-local 'semantic-parse-tree-state)

(defmacro semantic-set-parse-tree-out-of-date ()
  "Set state of current parse tree to out-of-date.
The parse tree can be updated by `semantic-parse-changes'."
  `(setq semantic-parse-tree-state 'out-of-date))

(defmacro semantic-set-parse-tree-invalid ()
  "Set state of current parse tree to invalid.
The parse tree must be rebuilt by `semantic-parse-region'."
  `(setq semantic-parse-tree-state 'invalid))

(defmacro semantic-set-parse-tree-up-to-date ()
  "Set state of current parse tree to up-to-date.
The parse tree doesn't have to be updated."
  `(setq semantic-parse-tree-state nil))

(defmacro semantic-can-parse-changes-p ()
  "Return non-nil if the current parse tree can be updated."
  `(eq semantic-parse-tree-state 'out-of-date))

(defmacro semantic-must-parse-region-p ()
  "Return non-nil if the current parse tree must be rebuilt."
  `(eq semantic-parse-tree-state 'invalid))

(defmacro semantic-must-parse-p ()
  "Return non-nil if the current parse tree is not up-to-date."
  'semantic-parse-tree-state)
|
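As an illustration of how the three states proposed above might drive the choice between a full and an incremental reparse, here is a minimal sketch. It is not part of the posted code: the `my-' names are hypothetical, and the two parser arguments are plain functions standing in for whatever the full and incremental parsers turn out to be.

```elisp
;; Hypothetical names (`my-' prefix); the state symbols follow the proposal:
;; `invalid', `out-of-date', or nil when the tree is up to date.
(defvar my-parse-tree-state nil
  "State of the current parse tree.")
(make-variable-buffer-local 'my-parse-tree-state)

(defun my-refresh-parse-tree (full-parse parse-changes)
  "Bring the parse tree up to date and return the action taken.
FULL-PARSE and PARSE-CHANGES are functions of no arguments that stand
in for a complete rebuild and an incremental update, respectively.
The return value is `full', `incremental', or `none'."
  (let ((action
         (cond ((eq my-parse-tree-state 'invalid)
                (funcall full-parse)
                'full)
               ((eq my-parse-tree-state 'out-of-date)
                (funcall parse-changes)
                'incremental)
               (t 'none))))
    ;; Whatever was done, the tree is now considered up to date.
    (setq my-parse-tree-state nil)
    action))
```

The point of the sketch is that callers only consult one variable, so adding a further state later means adding one `cond' clause rather than another buffer-local flag.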
From: Eric M. L. <er...@si...> - 2002-08-07 11:35:29
|
>>> "David PONCE" <Dav...@wa...> seems to think that:
>Hi Eric,
>
>>>What do you think of `semantic-repeat-parse-overall'?
>>>
>>>If you agree with that name I will make the change.
>>>And check changes in?
>>
>>   [ ... ]
>>
>> Hmmm.  I like that direction.  It makes me think
>> `semantic-parse-stream-until-empty', `semantic-parse-stream-repeat',
>> or `semantic-repeat-parse-entire-stream'.  Of course, we would then
>> need to change `semantic-parse' to `semantic-parse-stream', which
>> makes sense since `semantic-parse-region' has a target for the verb
>> `parse', so having `semantic-parse-stream' means that we are parsing
>> some other thing.  Thus `stream' means something that is not a
>> buffer that has been processed.  Even a regex parser can use a
>> `stream' to represent a position moving through a buffer.
>>
>> What do you think?
>
>I like your idea of `semantic-parse-stream'!  What do you think of
>`semantic-repeat-parse-whole-stream' which looks like a more Emacs
>friendly name (`mark-whole-buffer').
>
>Anyway `semantic-repeat-parse-entire-stream' is good too.  I let you
>decide ;-)

I do like `semantic-repeat-parse-whole-stream'.  Is there enough
distinction between `semantic-parse-stream' and
`semantic-parse-whole-stream' such that `repeat' could be removed?
Hmmm.  I think I'd go for these:

  semantic-parse-region
  semantic-repeat-parse-whole-stream
  semantic-parse-stream
  semantic-parse-changes

Thanks

>> Sorry about being picky, the old names confused me on occasion, and
>> it would be nice to have good clear names this time around.
>
>There is no hurry, and it is a good moment to clarify APIs ;-)
>
>To continue on this subject, I think it would improve design if we
>dissociate from parsers the management of parse tree (token cache)
>state.  Parse trees are clearly independent of what parser produces
>them.  Also, I think there are now too many buffer-local flags that can
>indicate a parse tree state:
>
>`semantic-edits-need-reparse'
>`semantic-dirty-tokens'
>`semantic-toplevel-bovine-cache-check'
>`semantic-toplevel-bovine-force-reparse'
>
>What I propose is a general purpose, simple, and efficient API that
>semantic core and parsers can share to get/set parse tree state.  IMO,
>there are actually three useful states that need to be managed:
>
>- The parse tree is invalid and must be completely rebuilt.
>- The parse tree is out of date and can be updated.
>- The parse tree is up to date.
>
>Following is a first implementation.  What do you think?

I like the idea of consolidating state variables.  That is good.  I
also like it because it would be easy to extend to more states.

  [ ... ]

>David
>
>;;; Parse tree state management API
>;;
>(defvar semantic-parse-tree-state nil
>  "State of the current parse tree.")
>(make-variable-buffer-local 'semantic-parse-tree-state)

A default value of `invalid may be appropriate.

>(defmacro semantic-set-parse-tree-out-of-date ()
>  "Set state of current parse tree to out-of-date.
>The parse tree can be updated by `semantic-parse-changes'."
>  `(setq semantic-parse-tree-state 'out-of-date))

I'd be tempted to change `out-of-date' to `needs-update', as
out-of-date in Makefiles means the whole target needs to be rebuilt.

>(defmacro semantic-set-parse-tree-invalid ()
>  "Set state of current parse tree to invalid.
>The parse tree must be rebuilt by `semantic-parse-region'."
>  `(setq semantic-parse-tree-state 'invalid))
>
>(defmacro semantic-set-parse-tree-up-to-date ()
>  "Set state of current parse tree to up-to-date.
>The parse tree doesn't have to be updated."
>  `(setq semantic-parse-tree-state nil))
>
>(defmacro semantic-can-parse-changes-p ()
>  "Return non-nil if the current parse tree can be updated."
>  `(eq semantic-parse-tree-state 'out-of-date))

I think `can' could be dropped as that is what `-p' means.  Perhaps
`semantic-cache-[state-symbol]-p' would be a good way to name these
queries; for example, `semantic-cache-invalid-p'.

>(defmacro semantic-must-parse-region-p ()
>  "Return non-nil if the current parse tree must be rebuilt."
>  `(eq semantic-parse-tree-state 'invalid))

This name made me think you could pass in START and END.  Then I
thought that was a great idea!  If you wanted details on token P, you
could say:

  (semantic-cache-needs-update-in-region-p
     (semantic-token-start P) (semantic-token-end P))

and if that token is ok, where other stuff is not, you don't need a
reparse.  It would really complicate some code to pull out that
efficiency and I don't know how often it would be useful though.

>(defmacro semantic-must-parse-p ()
>  "Return non-nil if the current parse tree is not up-to-date."
>  'semantic-parse-tree-state)

Great ideas!

Thanks
Eric

--
Eric Ludlam: za...@gn..., er...@si...
Home: www.ultranet.com/~zappo    Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net    GNU: www.gnu.org
|
From: Eric M. L. <er...@si...> - 2002-08-02 12:59:12
|
>>> "David PONCE" <Dav...@wa...> seems to think that:
>Hi Eric,
>
>> Based on this discussion, I checked in some changes to
>> semantic.  Specifically, I used `define-override' to create
>> `semantic-bovinate-region' which is loosely based on
>> `semantic-edits-bovinate-region'.  It does not take a table, lex
>> stream nor anything else.  Just start, end, and a parser symbol.  I
>> found that it didn't need a table.
>[...]
>
>I had a look at your last changes that introduce the new overload
>function `semantic-bovinate-region'.
>
>It appears that we now have the following parse functions defined:
>
> - semantic-bovinate-nonterminals
>   (stream nonterm &optional depth returnonerror)
>
> - semantic-bovinate-region
>   (start end &optional reparse-symbol)
>
> - semantic-bovinate-from-nonterminal-full
>   (start end nonterm &optional depth)
>
> - semantic-bovinate-region-until-error
>   (start end nonterm &optional depth)
>
>IMO, it is no longer necessary to have all of the above functions
>defined.  What I propose:
>
> 1. Remove `semantic-bovinate-from-nonterminal-full' and

This one is a convenience for use inside grammar rule actions.
Perhaps a macro would be better?

>    `semantic-bovinate-region-until-error'.

Ok.

> 2. Merge `semantic-bovinate-nonterminals' and
>    `semantic-bovinate-region'.

This is a rather complex function which handles an iterative step
through the file, cooks tokens, and displays a working message.

Perhaps instead it should be abstracted a little more so that `stream'
could be anything, not just a lexical stream.  Regex parsers could
then use it as a buffer position, returning nil when completing a
file.

Leaving it as is would allow specialized functions to be simple things
that set up `stream' and other downward flags.  I could also use it
for the texinfo regexp parser.

Perhaps a different name would help, such as:

     semantic-bovinate-nonterminals-iterativly-from-stream

>So `semantic-bovinate-region' signature would become:
>
> - semantic-bovinate-region
>   (start end &optional reparse-symbol depth returnonerror)
>
>In `semantic-bovinate-region-default' we would add the code from
>`semantic-bovinate-nonterminals'.
>
>The BNF and WY generators should be updated to generate calls to
>`semantic-bovinate-region' with required parameters.
>
>An advantage of that is simplification of code and, more important,
>that the bovination mechanism implemented in
>`semantic-bovinate-nonterminals' would be overridable :-)
>
>So we would have a full parser plug-in architecture:
>
> - the parser
> - the incremental parser
> - the "bovinator"
>
>What do you think?

I am not yet convinced, because the iterative token collector
`semantic-bovinate-nonterminals', which is abstracted from the buffer,
has some difficult logic in it that is not fun to rewrite for each
parser specialization.

I do agree with your API arguments, however.  Override functions can
be:

`bovinate-region' -> Does setup from the buffer.  Calls
      `bovinate-nonterminals', though a better name might be needed.
`bovinate-nonterminal' -> Core parser.
`bovinate-incrementally' -> Analyze changes, and rebuild using
      `bovinate-region'.
`lex' -> Convert buffer into a `stream'.

A regex parser (like imenu) could override `lex' to return the imenu
cache, and then override `bovinate-nonterminal' to convert one imenu
hit into a semantic token, all without knowing about overlays or
cooking.

Eric

--
Eric Ludlam: za...@gn..., er...@si...
Home: www.ultranet.com/~zappo    Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net    GNU: www.gnu.org
|
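The split described in the message above (a generic loop that repeatedly calls a core parser on an abstract `stream', with the core parser being the part a language overrides) can be sketched as follows. Everything here is an assumption made for illustration: the `my-' names, the (REST . TOKEN) return convention, and the shape of the tokens are hypothetical and do not reflect the actual semantic data structures, and token cooking and the working message are left out.

```elisp
;; Hypothetical generic collector: loops over a core parser until the
;; stream is empty.  PARSE-ONE takes a stream and returns (REST . TOKEN),
;; where TOKEN may be nil if nothing was recognized.
(defun my-repeat-parse-whole-stream (stream parse-one)
  "Call PARSE-ONE on STREAM until it is empty; return the tokens in order."
  (let ((result nil))
    (while stream
      (let* ((r (funcall parse-one stream))
             (rest (car r))
             (token (cdr r)))
        (when token
          (push token result))
        ;; Guard against a core parser that consumed nothing, which
        ;; would otherwise loop forever: skip one element instead.
        (setq stream (if (eq rest stream) (cdr stream) rest))))
    (nreverse result)))

;; Hypothetical core parser for an imenu-like stream of (NAME . POSITION)
;; hits: converts one hit into a simple token list.
(defun my-hit-to-token (stream)
  "Turn the first hit of STREAM into a token; return (REST . TOKEN)."
  (let ((hit (car stream)))
    (cons (cdr stream)
          (list (car hit) 'function (cdr hit)))))

(my-repeat-parse-whole-stream '(("foo" . 1) ("bar" . 20))
                              #'my-hit-to-token)
;; → (("foo" function 1) ("bar" function 20))
```

Because the loop only sees the stream through PARSE-ONE, a regex-based parser and an LL parser could share it unchanged, which is the argument made above for keeping the collector separate from `bovinate-region'.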