From: Andre B. <and...@gm...> - 2002-12-27 13:49:29
I started to play around with parsing of expressions and extended the tokenizer. However, I expect the expression parser to get complex, so I want to make sure that everyone is with me and the implementation is not useless afterwards.

I implemented an expression parser that checks for <expression-list>, which is the highest-level expression, and creates nodes for the list members. Additionally, I partly implemented the next level, <assignment-expression>, which means only real assignments are recognized at the moment (conditionals and throws are ignored).

Well, implementing all levels will lead to a big implementation, won't it? But I have no idea how to handle expressions in a different way, since we need to detect the hierarchy of operators to understand the expression in the right way.

Just some examples to explain the reason for detailed expression parsing:

    a = 10;                      this is a simple assignment to 'a'.
    * ( a + 1 ) = 10;            whereas this is not an assignment to 'a'.
                                 // we cannot simply search for the assignment operator
    fctcall( a = 10 , b = 3 )    is an assignment to 'a' and to 'b'
                                 // do we need to parse down to the low levels of function calls?

What is your feeling about this, Baptiste?

Is it better to implement some heuristic, like:
    search for '=' and look backwards for identifier and braces ...
*no-idea-if-this-works*

-- Andre
From: Baptiste L. <gai...@fr...> - 2002-12-27 19:49:19
----- Original Message -----
From: "Andre Baresel" <and...@gm...>
To: "CppTool Mailing List" <Cpp...@li...>
Sent: Friday, December 27, 2002 2:47 PM
Subject: [Cpptool-develop] Expression parsing

> I started to play around with parsing of expressions and extended the
> tokenizer. However, I expect the expression parser to get complex, so
> I want to make sure that everyone is with me and the implementation is
> not useless afterwards.
>
> I implemented an expression parser that checks for <expression-list>,
> which is the highest-level expression, and creates nodes for the list
> members. Additionally, I partly implemented the next level,
> <assignment-expression>, which means only real assignments are
> recognized at the moment (conditionals and throws are ignored).

What do you mean by 'conditionals and throws' assignment?

> Well, implementing all levels will lead to a big implementation,
> won't it? But I have no idea how to handle expressions in a different
> way, since we need to detect the hierarchy of operators to understand
> the expression in the right way.
>
> Just some examples to explain the reason for detailed expression parsing:
>
>     a = 10;                      this is a simple assignment to 'a'.
>     * ( a + 1 ) = 10;            whereas this is not an assignment to 'a'.
>                                  // we cannot simply search for the assignment operator
>     fctcall( a = 10 , b = 3 )    is an assignment to 'a' and to 'b'
>                                  // do we need to parse down to the low levels of function calls?
>
> What is your feeling about this, Baptiste?

I would limit the expression parser to assignments to temporary variables for now. This should cover 99% of the cases. Writing a good expression parser is very complex (you need to deal with types, function call resolution, operator overloading...). We don't have all the data to do that now, and it does not bring much functionality. My guess is that it will not be needed for a while.

I would limit the current extension of the expression parser to detecting assignments to an unqualified identifier:

Supported:     a = b = c = d in an expression.
Not supported: *a = *(b+1) = c = 3;

This should be fairly simple to implement.

I propose introducing a new node type:

[locale-variable-assignement-expression]
    assigned-variable => [#unqualified-identifier]
    value => [expression]

The expression mutator would be extended to detect whether a given expression is a 'simple expression' or an 'assignment expression'. The recursion created by the 'value' property would deal with multiple assignments.

Hopefully, this should provide us with all the data we need for now.

> Is it better to implement some heuristic, like:
>     search for '=' and look backwards for identifier and braces ...
> *no-idea-if-this-works*

I would extend the ExpressionMutator to detect the token '=' and check that the expression starts with the tokens 'identifier, assign-op', and in that case mutate to a locale-variable-assignement-expression and do the required processing for that kind of node.

Baptiste.

> -- Andre
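[Editorial sketch] The 'identifier, assign-op' check proposed above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not CppTool's actual code: `TokenKind`, `Token`, `ExprNode` and `mutateExpression` are hypothetical names invented for the example, and the tokenizer is assumed to classify `=` as an assign-op distinct from `==`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical token model; the real CppTool tokenizer types are not shown in this thread.
enum class TokenKind { Identifier, AssignOp, Other };

struct Token {
    TokenKind kind;
    std::string text;
};

// Hypothetical node mirroring the proposed [locale-variable-assignement-expression].
// A plain [expression] leaves 'value' null and keeps its tokens instead.
struct ExprNode {
    std::string assignedVariable;     // assigned-variable => [#unqualified-identifier]
    std::unique_ptr<ExprNode> value;  // value => [expression]
    std::vector<Token> tokens;        // tokens of a simple expression

    bool isAssignment() const { return value != nullptr; }
};

// Builds a node from tokens[first..end). If the range starts with
// 'identifier, assign-op' the node becomes an assignment and the right-hand
// side is processed recursively, which is what handles a = b = c = d.
std::unique_ptr<ExprNode> mutateExpression(const std::vector<Token>& tokens,
                                           std::size_t first = 0) {
    assert(first <= tokens.size());
    auto node = std::make_unique<ExprNode>();
    if (tokens.size() - first >= 2 &&
        tokens[first].kind == TokenKind::Identifier &&
        tokens[first + 1].kind == TokenKind::AssignOp) {
        node->assignedVariable = tokens[first].text;
        node->value = mutateExpression(tokens, first + 2);
        return node;
    }
    // Anything else stays a simple expression.
    node->tokens.assign(tokens.begin() + static_cast<std::ptrdiff_t>(first),
                        tokens.end());
    return node;
}
```

Given the tokens of `a = b = 10`, this yields an assignment to `a` whose value is an assignment to `b`. The tokens of `* ( a + 1 ) = 10` stay a simple expression, because the sequence does not start with an identifier followed by an assign-op. The sketch does not check that the identifier really names a local variable.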
From: Andre B. <and...@gm...> - 2002-12-27 20:15:14
Baptiste Lepilleur wrote:

>> I implemented an expression parser that checks for <expression-list>,
>> which is the highest-level expression, and creates nodes for the list
>> members. Additionally, I partly implemented the next level,
>> <assignment-expression>, which means only real assignments are
>> recognized at the moment (conditionals and throws are ignored).
>
> What do you mean by 'conditionals and throws' assignment?

Well, I was stepping down the EBNF of the C++ syntax, which can be found e.g. in the MSDN.

Expressions start with expression-list.
expression-list elements are assignment-expressions (easily detected by commas and balanced braces).
assignment-expression can be:
    conditional-expression               "a ? 1 : 0"
    throw-expression                     "throw <whatever>"
    [term] assignment-operator [term]    can be detected by searching for assignment-operators (a list!) and balanced braces

>> Well, implementing all levels will lead to a big implementation,
>> won't it? But I have no idea how to handle expressions in a different
>> way, since we need to detect the hierarchy of operators to understand
>> the expression in the right way.
>>
>> Just some examples to explain the reason for detailed expression parsing:
>>
>>     a = 10;                      this is a simple assignment to 'a'.
>>     * ( a + 1 ) = 10;            whereas this is not an assignment to 'a'.
>>                                  // we cannot simply search for the assignment operator
>>     fctcall( a = 10 , b = 3 )    is an assignment to 'a' and to 'b'
>>                                  // do we need to parse down to the low levels of function calls?
>>
>> What is your feeling about this, Baptiste?
>
> I would limit the expression parser to assignments to temporary variables
> for now. This should cover 99% of the cases. Writing a good expression
> parser is very complex (you need to deal with types, function call
> resolution, operator overloading...). We don't have all the data to do
> that now, and it does not bring much functionality. My guess is that it
> will not be needed for a while.

Yep, I agree with this. I'm quite comfortable with the current solution. However, I'm not sure about the AST representation.

> I would limit the current extension of the expression parser to detecting
> assignments to an unqualified identifier:
>
> Supported:     a = b = c = d in an expression.
> Not supported: *a = *(b+1) = c = 3;
>
> This should be fairly simple to implement.
>
> I propose introducing a new node type:
>
> [locale-variable-assignement-expression]
>     assigned-variable => [#unqualified-identifier]
>     value => [expression]

Well, introducing a new node type will probably cause some problems wherever we test for the node type "expression". Is there any possibility of changing the direct checks of the expression node type so that they check for a group of nodes instead? This is just something for the future, since every new expression sub-type we introduce will lead to global changes wherever we check for expression nodes. Am I wrong about this?

I don't want to make it too complex, but I was thinking about variants of the expression node type that just use an attribute (my prototype uses newly defined properties). What about this?

> The expression mutator would be extended to detect whether a given
> expression is a 'simple expression' or an 'assignment expression'. The
> recursion created by the 'value' property would deal with multiple
> assignments.
>
> Hopefully, this should provide us with all the data we need for now.

Yes, my intention was not to implement all of that :-) It was just a thought for the future.

>> Is it better to implement some heuristic, like:
>>     search for '=' and look backwards for identifier and braces ...
>> *no-idea-if-this-works*
>
> I would extend the ExpressionMutator to detect the token '=' and check that
> the expression starts with the tokens 'identifier, assign-op', and in that
> case mutate to a locale-variable-assignement-expression and do the required
> processing for that kind of node.

That is what I did, but I also added detection of the surrounding expression list, since with this we can also handle the following example and all expressions that use the expression list operator ',':

    a = 0, b = 10;

-- Andre
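[Editorial sketch] The expression-list detection "by comma and balanced braces" mentioned above could look roughly like this. It is only an illustration under stated assumptions: `splitExpressionList` and `trim` are hypothetical helper names, the input is assumed to have balanced brackets, and the sketch scans raw text for brevity where a real implementation would walk tokens (so commas inside string or character literals are not handled here).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Removes leading and trailing spaces and tabs.
std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Splits an expression list at commas that are not nested inside (), [] or {}.
// Each resulting piece is a candidate assignment-expression.
std::vector<std::string> splitExpressionList(const std::string& text) {
    std::vector<std::string> parts;
    std::string current;
    int depth = 0;  // current nesting level of brackets
    for (char c : text) {
        if (c == '(' || c == '[' || c == '{') {
            ++depth;
        } else if (c == ')' || c == ']' || c == '}') {
            --depth;
            assert(depth >= 0);  // the sketch assumes balanced input
        }
        if (c == ',' && depth == 0) {
            parts.push_back(trim(current));
            current.clear();
        } else {
            current += c;
        }
    }
    parts.push_back(trim(current));
    return parts;
}
```

For `a = 0, b = 10` this gives the two pieces `a = 0` and `b = 10`, each of which can then be checked for an assignment. For `fctcall( a = 10 , b = 3 )` the comma is nested inside the parentheses, so the call stays in one piece and its inner assignments are not seen at this level.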
From: Baptiste L. <gai...@fr...> - 2003-01-07 08:45:36
----- Original Message -----
From: "Andre Baresel" <and...@gm...>
To: "Baptiste Lepilleur" <gai...@fr...>; "CppTool Mailing List" <Cpp...@li...>
Sent: Friday, December 27, 2002 9:13 PM
Subject: Re: [Cpptool-develop] Expression parsing

> Baptiste Lepilleur wrote:
> [...]
> Well, I was stepping down the EBNF of the C++ syntax, which can be found
> e.g. in the MSDN.
> Expressions start with expression-list.
> expression-list elements are assignment-expressions (easily detected by
> commas and balanced braces).
> assignment-expression can be:
>     conditional-expression               "a ? 1 : 0"
>     throw-expression                     "throw <whatever>"
>     [term] assignment-operator [term]    can be detected by searching for
>                                          assignment-operators (a list!) and
>                                          balanced braces

Beware of the EBNF naming in the standard. Rules are often given names that have little to do with their actual content. The above rule should probably be named something like 'expression-with-same-priority-as-assignment'.

> > I would extend the ExpressionMutator to detect the token '=' and check that
> > the expression starts with the tokens 'identifier, assign-op', and in that
> > case mutate to a locale-variable-assignement-expression and do the required
> > processing for that kind of node.
>
> That is what I did, but I also added detection of the surrounding
> expression list, since with this we can also handle the following example
> and all expressions that use the expression list operator ',':
>
>     a = 0, b = 10;

I doubt this can be done correctly because of templates:

    namer<string , comon_naming_traits>( firstName, lastName )

=> there is no way to know whether namer is a template or not.

On the other hand, since we are only using this to detect assignments, I guess that we can live with this (I don't think assignments are possible in template parameters).

Baptiste.

> -- Andre
From: Andre B. <and...@gm...> - 2003-01-07 19:00:42
Baptiste Lepilleur wrote:

> I doubt this can be done correctly because of templates:
>
>     namer<string , comon_naming_traits>( firstName, lastName )
>
> => there is no way to know whether namer is a template or not.

OK, just to understand the example: could this expression be a call to a templated function?

-- Andre
From: Baptiste L. <gai...@fr...> - 2003-01-08 18:47:20
----- Original Message -----
From: "Andre Baresel" <and...@gm...>
To: "CppTool Mailing List" <Cpp...@li...>
Sent: Tuesday, January 07, 2003 8:00 PM
Subject: Re: [Cpptool-develop] Expression parsing

> Baptiste Lepilleur wrote:
>
> > I doubt this can be done correctly because of templates:
> >
> >     namer<string , comon_naming_traits>( firstName, lastName )
> >
> > => there is no way to know whether namer is a template or not.
>
> OK, just to understand the example: could this expression be a call to
> a templated function?

Yes. Any template with multiple parameters will be wrongly split, but as I said, this is not an issue at the current time, just something we should be aware of.

Baptiste.

> -- Andre
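[Editorial sketch] The wrong split discussed above can be reproduced with a comma scan that tracks only (), [] and {}. This is a small illustration, not CppTool code: `topLevelCommas` is a hypothetical helper name, and the input is assumed to have balanced brackets. Angle brackets are deliberately not tracked, because without symbol information a '<' may be either a template bracket or a less-than operator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the offsets of all commas that are outside (), [] and {}.
// '<' and '>' do not change the nesting depth.
std::vector<std::size_t> topLevelCommas(const std::string& text) {
    std::vector<std::size_t> offsets;
    int depth = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '(' || c == '[' || c == '{') {
            ++depth;
        } else if (c == ')' || c == ']' || c == '}') {
            --depth;
            assert(depth >= 0);  // the sketch assumes balanced input
        } else if (c == ',' && depth == 0) {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

For `namer<string , comon_naming_traits>( firstName, lastName )` the scan reports one top-level comma, the one between the template arguments, so the expression is cut into `namer<string ` and ` comon_naming_traits>( firstName, lastName )`. Counting '<' as an opening bracket would not be a safe fix either, since `a < b, c > ( d )` is also a valid list of two comparisons when `a` is not a template.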