Thread: [cedet-semantic] Avoiding redundant rules (EXPANDFULL-related).
From: Joseph K. <ki...@ac...> - 2004-02-04 12:32:47
|
Hi Cedet/Semantic/Wisent folks,

I have the following rule for parsing PVS typed identifiers and
operators.

typedids : idops colon_typeexpr_opt bar_expr_opt
           (mapcar (function (lambda (idop)
                     (if $3
                         (VARIABLE-TAG idop (car $2) nil 'predicate $3)
                       (VARIABLE-TAG idop (car $2) nil))))
                   $1)
         ;

I have another rule called typedid corresponding to the single
identifier/operator case.

Both of these rules perform as I expect.

A (full) example of such an expression is

  a, b, c, -, ~ : int | n >= 0

Unfortunately, typed identifiers can also occur in parentheses.

My lex depth is zero, thus I have a rule of the form:

binding : typedid
        | PAREN_BLOCK
          ***
        ;

I don't know what to put where the *** is.  While I could put
something like

  (EXPANDFULL $1 some_typedids-expandfull)

where the rule some_typedids-expandfull is going to be semantically
equivalent to typedids, but will have to be written in the
"expandfull" style.  I wish to avoid this redundancy.

I have tried something like

  (EXPAND $1 parenthesized-typedids)

parenthesized-typedids : LPAREN typedids RPAREN
                         (identity $2)
                       ;

but, for some reason, with this expression,
semantic-show-unmatched-syntax-mode shows no unmatched characters in
my test buffer, but bovinate returns only *unparenthesized* typedids
when binding is the top-most rule.

Is this a standard pattern for wisent parsers?  What is the suggested
structuring solution?

Thanks,
Joe

P.S. Any comments on my previous post?
     http://sourceforge.net/mailarchive/message.php?msg_id=7112833

--
Joseph R. Kiniry                     ID 78860581 ICQ 4344804
SOS Group, University of Nijmegen    http://www.cs.kun.nl/~kiniry/
KindSoftware, LLC                    http://www.kindsoftware.com/
Board Chair: NICE                    http://www.eiffel-nice.org/
|
From: Eric M. L. <er...@si...> - 2004-02-04 13:52:13
|
Howdy!

>>> Joseph Kiniry <ki...@ac...> seems to think that:
>Hi Cedet/Semantic/Wisent folks,
>
>I have the following rule for parsing PVS typed identifiers and
>operators.
>
>typedids : idops colon_typeexpr_opt bar_expr_opt
>           (mapcar (function (lambda (idop)
>                     (if $3
>                         (VARIABLE-TAG idop (car $2) nil 'predicate $3)
>                       (VARIABLE-TAG idop (car $2) nil))))
>                   $1)
>         ;
>
>I have another rule called typedid corresponding to the single
>identifier/operator case.
>
>Both of these rules perform as I expect.
>
>A (full) example of such an expression is
>
>  a, b, c, -, ~ : int | n >= 0

Hmmm, let me try and translate:

  VARNAME1, VARNAME2 : DATATYPE | EXPRESSION

Is that what I'm looking at?

>Unfortunately, typed identifiers can also occur in parentheses.

as such?

  ( VARNAME1, VARNAME2 : DATATYPE | EXPRESSION )

A handy trick for these situations would be to make a tag like this:

  (VARIABLE-TAG list-o-names-here (car $2) ...)

so the representation might be:

  (("a" "b" "c") 'variable (:type "int"))

and then write a function for the variable
semantic-tag-expand-function.  In this function you would turn your
one compound tag into three little tags overlapping the same area.

This used to be the only way to do what you are doing with mapcar
above.  I have not closely examined which style is better, so you can
use your judgment.

>My lex depth is zero, thus I have a rule of the form:
>
>binding : typedid
>        | PAREN_BLOCK
>          ***
>        ;
>
>I don't know what to put where the *** is.  While I could put
>something like
>
>  (EXPANDFULL $1 some_typedids-expandfull)

EXPAND and EXPANDFULL have two different purposes.

Use EXPAND if you want to require a perfect match on rules within the
block.  A missed match will cause the upper expression to become a
failed match as well.  I'm pretty sure it still does that.  If you
have a repeating entity in the block and you use EXPAND, you need to
use yacc/bison style iteration to capture them, so EXPAND is also
useful if you have only a single entity to match.

Use EXPANDFULL when you have a repeating entity, or if you want parser
failures inside the block to be ignored and treated as unmatched
syntax.

>where the rule some_typedids-expandfull is going to be semantically
>equivalent to typedids, but will have to be written in the
>"expandfull" style.  I wish to avoid this redundancy.

In this case, you can use a helper rule that calls back to the
original rule you want to use.  In c.by, see arg-list and
arg-sub-list.

>I have tried something like
>
>  (EXPAND $1 parenthesized-typedids)
>
>parenthesized-typedids : LPAREN typedids RPAREN
>                         (identity $2)
>                       ;
>
>but, for some reason, with this expression,
>semantic-show-unmatched-syntax-mode shows no unmatched characters in
>my test buffer, but bovinate returns only *unparenthesized* typedids
>when binding is the top-most rule.

Short answer: You need to save the return value of the EXPAND macros.

Long answer:

The EXPAND and EXPANDFULL macros have a return value.  If you do not
use the return value, it is lost.  For example, you might do this:

mythingy : pre-thing PAREN_BLOCK post_thing
           (TAG $1 'thing :innerthing (car (EXPAND $2 inner-thing)))
         ;

inner-thing: ...
           ;

If innerthing above is not stored in some way, it will be lost.

In c.by, see the rule "extern-c" for an example.  Then look in
semantic-c.el at the function semantic-expand-c-tag, and how it treats
tags of type 'extern.

Your case may be simpler.  Perhaps a new macro to promote a tag
created in an EXTERN directly into the local list is needed.  That is
unclear to me.

>Is this a standard pattern for wisent parsers?  What is the suggested
>structuring solution?

The semantic-tag-expand-function is how we have done stuff in the
past.  Your recent approach with mapcar is an intriguing idea (to me
at least, David may have other thoughts.)

>P.S. Any comments on my previous post?
>     http://sourceforge.net/mailarchive/message.php?msg_id=7112833

I grepped through my local mail cache and did not see it, so it has
not been delivered to my mailbox yet.

>>>> In the archives but not in my mailbox, Joseph Kiniry thinks that:
>
> What are the suggestions of the Cedet developers?  Should I hold off
> with what I have until you release a 1.0, or should I attempt to use
> a more recent version of the repository?  Is there a stable tag I can
> upgrade to, rather than using HEAD?  Any other comments?

Short Answer: Keep going.  No changes are needed for you.

Long Answer:

David's changes do not break existing lexical analyzer creation
schemes.  The old schemes will (probably) always be required for
complex languages.  Simple languages (or more specifically, languages
with simple lexical rules) can use David's new support.

It may be that working in the beta1c style lexical rules will benefit
you in the future because you will better understand what is going on
with the auto-generated lexical analyzer.

>I have one question about the results of bovination.  Some of my
> resulting sexps are of the form:
>
> ("{ |=, y }" type
>  (:members
>   ("y" "|=")
>   :type "enumeration")
>  (reparse-symbol braced_typeexpr-expandfull)
>  #<overlay from 665 to 668 in typeexpr.pvs>)
>
> What does the "reparse-symbol" bit mean?  Am I forgetting to use an
> EXPANDFULL somewhere, or is this just a representation issue?
> Something else?

Short answer: It is added by wisent automatically.

Long Answer:

A tag's structure is:

  ( "NAME" CLASS ATTRIBUTES PROPERTIES OVERLAY )

See the manual entry "Tag Basics" for more.

The TAG, VARIABLE-TAG and related macros only allow you to specify a
NAME, CLASS, and ATTRIBUTES.  PROPERTIES and OVERLAY are internal, and
not related to parsing.

The bovine and wisent parsing frameworks specify some properties
automatically, one of which is 'reparse-symbol'.  When a tag is
created, the parser knows which specific %start rule created it and
saves it here.

When you edit a buffer, and changes are made to the body of a tag
created with a reparse-symbol, the partial reparse mechanism will know
where to start because of the reparse symbol.  This makes reparsing an
edited buffer much faster than if the entire parent tag had to be
reparsed from scratch.

--
Eric Ludlam: za...@gn..., er...@si...
Home: http://www.ludlam.net            Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net    GNU: www.gnu.org
|
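For readers following along, the expand-function trick described in the reply above (one tag whose NAME slot holds a list of names, split into one tag per name) might look roughly like this. It is a minimal sketch, assuming the Semantic that is bundled with current GNU Emacs (feature `semantic/tag`) rather than the 2004 CEDET tree; the function name `pvs-expand-compound-variable` and the mode-setup line are hypothetical and not taken from the thread.

```elisp
;; Sketch only; assumes the Semantic bundled with GNU Emacs.
;; `pvs-expand-compound-variable' is a hypothetical name, not part of CEDET.
(require 'semantic/tag)

(defun pvs-expand-compound-variable (tag)
  "Split TAG into one variable tag per name if its name is a list.
Return the list of new tags, or nil to keep TAG as it is."
  (when (and (eq (semantic-tag-class tag) 'variable)
             (consp (semantic-tag-name tag)))
    (mapcar (lambda (name)
              ;; The clone keeps TAG's class and overlay and gets
              ;; copies of its attributes; only the name changes.
              (semantic-tag-clone tag name))
            (semantic-tag-name tag))))

;; Hypothetical mode setup: install the function buffer-locally.
;; (setq-local semantic-tag-expand-function #'pvs-expand-compound-variable)
```

Returning nil for every other tag leaves it untouched, which matches the documented contract of `semantic-tag-expand-function`. Because each clone shares the original overlay, the resulting tags all cover the same buffer region, as the reply suggests.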
From: Joseph K. <ki...@ac...> - 2004-02-05 12:32:20
|
Why can a rule used in an EXPANDFULL not return a list?

E.g.,

binding_list-expandfull : LPAREN
                          ()
                        | RPAREN
                          ()
                        | COMMA
                          ()
                        | typedid
                        | typedids
                        ;

typedid returns a single variable tag like
  ("n" 'variable (:type "BOOL"))
and typedids returns a list of variable tags like
  ( ("n" 'variable (:type "BOOL")) ("m" 'variable (:type "BOOL")))

In such a case, semantic signals an error and provides the backtrace:

Debugger entered: ((("n" variable (:type "BOOL") nil nil 637 647) ("m" variable (:type "BOOL") nil nil 637 647)))
  semantic--tag-expand((("n" variable (:type "BOOL") nil nil 637 647) ("m" variable (:type "BOOL") nil nil 637 647)))
  semantic-repeat-parse-whole-stream(((LPAREN 636 . 637) (IDENTIFIER 637 . 638) (COMMA 638 . 639) (IDENTIFIER 640 . 641) (COLON 641 . 642) (IDENTIFIER 643 . 647) (RPAREN 647 . 648)) binding_list-expandfull nil)
  semantic-parse-region-default(636 648 binding_list-expandfull 1 nil)
  semantic-parse-region(636 648 binding_list-expandfull 1)

Perhaps semantic--tag-expand should be modified to handle lists of
raw tags?

I guess I must lift the grouped variable tag handling yet again?

Joe
|
From: Joseph K. <ki...@ac...> - 2004-02-12 16:41:09
|
"Eric M. Ludlam" <er...@si...> writes:

>> Joe wrote:
>> P.S. Any comments on my previous post?
>>      http://sourceforge.net/mailarchive/message.php?msg_id=7112833
>
> I grepped through my local mail cache and did not see it, so it has
> not been delivered to my mailbox yet.
>
>>>>> In the archives but not in my mailbox, Joseph Kiniry thinks that:
>>
>> What are the suggestions of the Cedet developers?  Should I hold off
>> with what I have until you release a 1.0, or should I attempt to use
>> a more recent version of the repository?  Is there a stable tag I can
>> upgrade to, rather than using HEAD?  Any other comments?
>
> Short Answer: Keep going.  No changes are needed for you.
>
> Long Answer:
>
> David's changes do not break existing lexical analyzer creation
> schemes.  The old schemes will (probably) always be required for
> complex languages.  Simple languages (or more specifically,
> languages with simple lexical rules) can use David's new support.
>
> It may be that working in the beta1c style lexical rules will benefit
> you in the future because you will better understand what is going on
> with the auto-generated lexical analyzer.

I've begun testing my grammar with CVS HEAD today and things look
good so far.

I'm now generating good looking sets of tags, but my current problem
is one of structure.  It is just the standard loosely-typed Lispisms;
things like something should be a list of lists, but I only have a
list, etc.

Here is a problem that has come up, though, that I must ask about.

I have top-level constructs that have multiple meanings.  For example,
a PVS datatype is a datatype, a type, and a function.  Unfortunately,
a top-level %start rule can only return a *single* tag, whereas I
would like to return a list of tags.

E.g.,

datatype : id theoryformals_opt COLON datatype_or_codatatype with_subtypes_ids_opt
           BEGIN
             importing_semicolon_opt
             assumingpart_opt
             datatypepart
           END id
           (list
            (TYPE-TAG $1 $4 (append $7 $8 $9) nil)
            (when $2
              (FUNCTION-TAG $1 $1 $2))
            (TAG $1 'datatype (append $7 $8 $9)))
         ;

Any ideas on how to handle this problem?

Thanks,
Joe

--
Joseph R. Kiniry                     ID 78860581 ICQ 4344804
SOS Group, University of Nijmegen    http://www.cs.kun.nl/~kiniry/
KindSoftware, LLC                    http://www.kindsoftware.com/
Board Chair: NICE                    http://www.eiffel-nice.org/
|
From: Eric M. L. <er...@si...> - 2004-02-05 14:12:16
|
Hi,

The reasons are historical.  I started long ago with the premise that
one call into the parser returned one or fewer tags.  This worked
well, and the bovine parser was pretty easy to use, but quite
limiting.  You could write no optional lambda expressions, and still
get an interesting parse tree.

We have since made things more flexible, and added the TAG style
macros to help simplify things which, apparently, leads to some
confusion.

It is unclear to me what the right answer is.  At a minimum, a useful
error message out of that routine would be good.

Probably if you put your tag list in as the NAME slot in a tag, then
your extract mechanism would be pretty easy.

Eric

>>> Joseph Kiniry <ki...@ac...> seems to think that:
>Why can a rule used in an EXPANDFULL not return a list?
>
>E.g.,
>
>binding_list-expandfull : LPAREN
>                          ()
>                        | RPAREN
>                          ()
>                        | COMMA
>                          ()
>                        | typedid
>                        | typedids
>                        ;
>
>typedid returns a single variable tag like
>  ("n" 'variable (:type "BOOL"))
>and typedids returns a list of variable tags like
>  ( ("n" 'variable (:type "BOOL")) ("m" 'variable (:type "BOOL")))
>
>In such a case, semantic signals an error and provides the backtrace:
>
>Debugger entered: ((("n" variable (:type "BOOL") nil nil 637 647) ("m" variable (:type "BOOL") nil nil 637 647)))
>  semantic--tag-expand((("n" variable (:type "BOOL") nil nil 637 647) ("m" variable (:type "BOOL") nil nil 637 647)))
>  semantic-repeat-parse-whole-stream(((LPAREN 636 . 637) (IDENTIFIER 637 . 638) (COMMA 638 . 639) (IDENTIFIER 640 . 641) (COLON 641 . 642) (IDENTIFIER 643 . 647) (RPAREN 647 . 648)) binding_list-expandfull nil)
>  semantic-parse-region-default(636 648 binding_list-expandfull 1 nil)
>  semantic-parse-region(636 648 binding_list-expandfull 1)
>
>Perhaps semantic--tag-expand should be modified to handle lists of
>raw tags?
>
>I guess I must lift the handling of the grouped variable tag handling
>yet again?
>
>Joe
>

--
Eric Ludlam: za...@gn..., er...@si...
Home: http://www.ludlam.net            Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net    GNU: www.gnu.org
|
From: Eric M. L. <er...@si...> - 2004-02-12 18:43:14
|
>>> Joseph Kiniry <ki...@ac...> seems to think that:
[ ... ]
>I've begun testing my grammar with CVS HEAD today and things look
>good so far.

Yay!

>I'm now generating good looking sets of tags, but my current problem
>is one of structure.  It is just the standard loosely-typed Lispisms;
>things like something should be a list of lists, but I only have a
>list, etc.
>
>Here is a problem that has come up though that I must ask about.
>
>I have top-level constructs that have multiple meanings.  For example,
>a PVS datatype is a datatype, a type, and a function.  Unfortunately,
>a top-level %start rule can only return a *single* tag, whereas I
>would like to return a list of tags.

I had identified that if I iterate in code over the same rule to
generate the tags, it was more robust than using typical looping via
recursive grammar rules.  This is how I can identify unmatched syntax.

The assumption was that a given call into the parser would then return
only one tag.

>E.g.,
>datatype : id theoryformals_opt COLON datatype_or_codatatype with_subtypes_ids_opt
>           BEGIN
>             importing_semicolon_opt
>             assumingpart_opt
>             datatypepart
>           END id
>           (list
>            (TYPE-TAG $1 $4 (append $7 $8 $9) nil)
>            (when $2
>              (FUNCTION-TAG $1 $1 $2))
>            (TAG $1 'datatype (append $7 $8 $9)))
>         ;
>
>Any ideas on how to handle this problem?
[ ... ]

I think I would return a tag of a new class.  Give it a name specific
to your language.

Then implement a function for `semantic-tag-expand-function'.  When
it sees a tag of that new class, it will replace it with two or more
new tags.

If the function part of your declaration is a constructor, don't
forget to set the attribute :constructor to non-nil.

Hmmm.  I think it is the symbol `constructor' without the :.  I should
fix that.

Anyway, that is the official way to do that, and is used in C and
Java for statements like this:

  int a, b;

Good Luck
Eric

--
Eric Ludlam: za...@gn..., er...@si...
Home: http://www.ludlam.net            Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net    GNU: www.gnu.org
|
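The "tag of a new class" suggestion in the reply above could be sketched along these lines. This is only an illustration, assuming the Semantic bundled with current GNU Emacs (feature `semantic/tag`) rather than the 2004 CEDET tree. The class name `pvs-datatype`, the attribute names `:kind` and `:formals`, and the function name are hypothetical; the grammar rule is assumed to return a single tag such as `(TAG $1 'pvs-datatype :kind $4 :formals $2 :members (append $7 $8 $9))` instead of a list.

```elisp
;; Sketch only; assumes the Semantic bundled with GNU Emacs.
;; The class `pvs-datatype', the attributes :kind and :formals, and the
;; function name are hypothetical choices for this illustration.
(require 'semantic/tag)

(defun pvs-expand-datatype (tag)
  "Replace a `pvs-datatype' TAG with type, function and datatype tags.
Return the list of new tags, or nil if TAG is of another class."
  (when (eq (semantic-tag-class tag) 'pvs-datatype)
    (let* ((name    (semantic-tag-name tag))
           (kind    (semantic-tag-get-attribute tag :kind))
           (formals (semantic-tag-get-attribute tag :formals))
           (members (semantic-tag-get-attribute tag :members))
           (tags (delq nil
                       (list
                        (semantic-tag-new-type name kind members nil)
                        ;; Only emit the function when there are formals.
                        (when formals
                          (semantic-tag-new-function name name formals))
                        (semantic-tag name 'datatype :members members)))))
      ;; Give every new tag the bounds of the tag it replaces.
      (dolist (new tags)
        (semantic-tag-set-bounds new
                                 (semantic-tag-start tag)
                                 (semantic-tag-end tag)))
      tags)))
```

The new tags are built from scratch here, so the sketch copies the bounds of the original tag onto each of them; without bounds they could not be placed in the buffer. Since `semantic-tag-expand-function` holds a single function, a real mode would dispatch on the tag class inside one expand function, much as semantic-expand-c-tag is described as doing for 'extern tags.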
From: Joseph K. <ki...@ac...> - 2004-02-16 15:30:00
|
Hello again Eric,

"Eric M. Ludlam" <er...@si...> writes:

>>>> Joseph Kiniry <ki...@ac...> seems to think that:
> [ ... ]
>>I've begun testing my grammar with CVS HEAD today and things look
>>good so far.
>
> Yay!

Right-on.

>>I'm now generating good looking sets of tags, but my current problem
>>is one of structure.  It is just the standard loosely-typed Lispisms;
>>things like something should be a list of lists, but I only have a
>>list, etc.
>>
>>Here is a problem that has come up though that I must ask about.
>>
>>I have top-level constructs that have multiple meanings.  For example,
>>a PVS datatype is a datatype, a type, and a function.  Unfortunately,
>>a top-level %start rule can only return a *single* tag, whereas I
>>would like to return a list of tags.
>
> I had identified that if I iterate in code over the same rule to
> generate the tags, it was more robust than using typical looping via
> recursive grammar rules.  This is how I can identify unmatched syntax.

I understand this choice.

> The assumption was that a given call into the parser would then return
> only one tag.
>
>>E.g.,
>>datatype : id theoryformals_opt COLON datatype_or_codatatype with_subtypes_ids_opt
>>           BEGIN
>>             importing_semicolon_opt
>>             assumingpart_opt
>>             datatypepart
>>           END id
>>           (list
>>            (TYPE-TAG $1 $4 (append $7 $8 $9) nil)
>>            (when $2
>>              (FUNCTION-TAG $1 $1 $2))
>>            (TAG $1 'datatype (append $7 $8 $9)))
>>         ;
>>
>>Any ideas on how to handle this problem?
> [ ... ]
>
> I think I would return a tag of a new class.  Give it a name specific
> to your language.
>
> Then implement a function for `semantic-tag-expand-function'.  When
> it sees a tag of that new class, it will replace it with two or more
> new tags.
>
> If the function part of your declaration is a constructor, don't
> forget to set the attribute :constructor to non-nil.
>
> Hmmm.  I think it is the symbol `constructor' without the :.  I should
> fix that.
>
> Anyway, that is the official way to do that, and is used in C and
> Java for statements like this:
>
>   int a, b;

Must (sub)functions of -expand-tag be written for all non-core
semantic tags, or only for ones that are returned by %start denoted
rules?  In other words, were I to use custom tags on other,
non-expand(full) related production rules, would they be expanded
with my semantic-tag-expand-function as well?

(I'm deep in refactor mode so I cannot even compile my grammar at the
moment, thus the silly question.)

I don't see any discussion of semantic-tag-expand-function in the
current CVS head documentation (beyond an extremely brief mention in
the Semantic Tags chapter's Misc Tag Internals subsection), FWIW.

I've been updating the docs a bit in my local sandbox, fixing typos,
spelling errors, grammar, and clarifying issues.  Shall I send a diff
eventually to someone?

Joe

--
Joseph R. Kiniry                     ID 78860581 ICQ 4344804
SOS Group, University of Nijmegen    http://www.cs.kun.nl/~kiniry/
KindSoftware, LLC                    http://www.kindsoftware.com/
Board Chair: NICE                    http://www.eiffel-nice.org/
|