Re: [CEDET-devel] Incremental parser behavior
From: David P. <da...@dp...> - 2002-08-17 08:04:14
Hi Eric,

> The reparse-symbol, as used in semantic 1.4 is for tags found inside
> other tokens.  I expanded on the original use by adding smarts for
> splicing new tags in and out of the master cache, within the
> child-list of some parent token.  To remove the incremental parser for
> child tokens would make the incremental parser nearly useless for
> Java, where 90% of the file is taken up by one class.

I agree with you on that point.

> I think an important difference between your analysis and mine is that
> I think it is ok to reparse tokens that have a parent, but only if
> those tokens were generated using `semantic-repeat-parse-whole-stream',
> as opposed to a recursive rule in a wisent grammar.

Good point!

> I think you even cover in your wisent manual the benefits of using
> wisent style repetitive rules for .wy rules as opposed to the
> semantic version.  A side effect seems to be that it breaks the
> incremental parser.  If you can identify this specific scenario in
> your patch, I think it would be ok.

I think a simple way to identify such cases is to set a special
property in the parent token to tell the incremental parser that it
must re-parse that token when children are added or removed (how does
the incremental parser indicate the latter case?).  Maybe a
`reparse-safepoint' property that could be t to indicate that this
token must be re-parsed to preserve its consistency?

Nevertheless, there is currently an issue with token properties: they
can be set only after the token is cooked, that is, in a custom
`semantic-expand-nonterminal' function.  IMO, it would be simpler and
more consistent if properties could be set explicitly in grammar
semantic actions.  This would eliminate unnecessary
`semantic-expand-nonterminal' overhead too.

I think I found a clean implementation to do that using a :properties
keyword.  Here it is:

(defun semantic-raw-token-properties (token)
  "Extract properties from raw TOKEN.
Properties in raw token have the form :properties PLIST, where PLIST
is a property list.  Return PLIST in alist form, and remove
:properties PLIST from TOKEN by side effect.  Return nil and don't
change TOKEN, if the :properties keyword was not found."
  (let* ((props (memq :properties token))
         (plist nil)
         (alist nil))
    (when props
      (setq plist (cadr props))
      (while plist
        (setq alist (cons (cons (car plist) (cadr plist)) alist)
              plist (cddr plist)))
      (setcar props (car (cddr props)))
      (setcdr props (cdr (cddr props))))
    alist))

(defun semantic-raw-to-cooked-token (token)
  "Convert TOKEN from a raw state to a cooked state.
The parser returns raw tokens with positional data START/END.  We
convert it from that to a cooked state with a property list and a
vector [START END].  The raw token is changed with side effects and
maybe expanded in several cooked tokens when the variable
`semantic-expand-nonterminal' is set.  So this function always returns
a list of cooked tokens."
  ;; Because some parsers can return tokens already cooked (wisent is
  ;; an example), check if TOKEN was already cooked to just return it.
  (if (semantic-cooked-token-p token)
      token
    (let* ((props   (semantic-raw-token-properties token))
           (ncdr    (- (length token) 2))
           (propcdr (if (natnump ncdr) (nthcdr ncdr token)))
           (rngecdr (cdr propcdr))
           ;; propcdr is the CDR containing the START from the token.
           ;; rngecdr is the CDR containing the END from the token.
           ;; PROPCDR will contain the property list after cooking.
           ;; RNGECDR will contain the [START END] vector after cooking.
           (range   (condition-case nil
                        (vector (car propcdr) (car rngecdr))
                      (error (debug token) nil)))
           result expandedtokens)
      ;; Convert START/END into PROPERTIES/[START END].
      (setcar rngecdr range)
      (setcar propcdr props)
      ;; Expand based on local configuration
      (if (not semantic-expand-nonterminal)
          ;; No expanders
          (setq result (cons token result))
        ;; Glom generated tokens.  THESE TOKENS MUST BE VALID ONES!
        (setq expandedtokens (funcall semantic-expand-nonterminal token)
              result (if expandedtokens
                         (append expandedtokens result)
                       (cons token result))))
      result)))

For example, in the WY grammar I could change the following definition:

nonterminal: any_symbol COLON rules SEMI
             (wisent-token $1 'nonterminal nil $3 nil)
           ;

by this one:

nonterminal: any_symbol COLON rules SEMI
             (wisent-token $1 'nonterminal nil $3 nil
                           :properties '(reparse-safepoint t)
                           )
           ;

As :properties is a keyword, it can be used anywhere in the argument
list :-)

If you agree with my proposal, I could add the above to semantic.el
and hack `semantic-edits-incremental-parser' to take the
`reparse-safepoint' property into account.

What do you think?

Thanks!
David
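
To make the :properties idea above a bit more concrete, here is a minimal stand-alone sketch of the extraction step applied to a sample raw token. Everything prefixed with `my-` is hypothetical and exists only for this illustration: the sample token layout is made up (it only mimics the NAME CLASS ... START END shape discussed above), and `my-reparse-safepoint-p' is just one guess at how the incremental parser might consult the property, not an existing semantic function.

```elisp
;; Hypothetical stand-alone copy of the extraction logic, renamed so it
;; cannot clash with anything defined in semantic.el.
(defun my-raw-token-properties (token)
  "Return the :properties PLIST of raw TOKEN as an alist.
Remove the keyword and its PLIST from TOKEN by side effect."
  (let ((props (memq :properties token))
        plist alist)
    (when props
      (setq plist (cadr props))
      ;; Turn (KEY VALUE ...) into ((KEY . VALUE) ...).
      (while plist
        (setq alist (cons (cons (car plist) (cadr plist)) alist)
              plist (cddr plist)))
      ;; Overwrite the `:properties' cell with the cell two positions
      ;; further on, which drops the keyword and PLIST from TOKEN.
      (setcar props (car (cddr props)))
      (setcdr props (cdr (cddr props))))
    alist))

;; A made-up raw token (assumption): NAME CLASS, three detail slots,
;; the :properties pair, then START and END.
(defvar my-raw
  (list "rule" 'nonterminal nil nil nil
        :properties '(reparse-safepoint t)
        10 42))

(defvar my-props (my-raw-token-properties my-raw))
;; my-props => ((reparse-safepoint . t))
;; my-raw   => ("rule" nonterminal nil nil nil 10 42)

;; Hypothetical predicate the incremental parser could call on the
;; property alist of a parent token.
(defun my-reparse-safepoint-p (props)
  "Return non-nil if the alist PROPS marks a token as a reparse safepoint."
  (cdr (assq 'reparse-safepoint props)))
```

Because the keyword pair is spliced out in place, START and END end up as the last two elements again, which is what the cooking step relies on when it takes the last two cells of the token.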