From: Stefan S. <se...@sy...> - 2004-07-16 01:57:48
|
hi there,

The following valid code doesn't parse with opencxx:

---

template <typename T> void Foo();

void (*bar)() = &Foo<int>;

---

I'll look into it, but in case someone else (Grzegorz, Chiba ?) has an idea, I'll send it to the list.

Here is the call graph:

rInitializeExpr
rExpression
rConditionalExpr
rLogicalOrExpr
rLogicalAndExpr
rInclusiveOrExpr
rExclusiveOrExpr
rAndExpr
rEqualityExpr
rRelationalExpr
rShiftExpr

and there it happens: the parser just read the 'shift expression' ('[& Foo]'), and now runs into the '<', which it wrongly interprets as a relational operator.

What can I do to resolve this conflict ?

Kind regards,
Stefan
|
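To make the two readings of '<' concrete, here is a minimal sketch. It is not taken from the thread: the empty body on Foo and the helper lessThan are assumptions added so that the snippet links and can be checked.

```cpp
#include <cassert>

// The declaration from the report, given an empty body (an assumption)
// so that its address can be taken without a link error.
template <typename T> void Foo() {}

// Here '<' opens a template argument list: '&Foo<int>' names one specialization.
void (*bar)() = &Foo<int>;

// Here the same token is the relational operator, because 'foo' is a variable.
// 'lessThan' is a hypothetical helper used only for illustration.
bool lessThan(int foo, int limit) { return foo < limit; }
```

Both uses are well-formed C++; only knowledge of what the name before '<' refers to tells them apart.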
From: Stefan S. <se...@sy...> - 2004-07-16 18:04:06
|
Stefan Seefeld wrote:
> hi there,
>
> The following valid code doesn't parse with opencxx:
>
> ---
>
> template <typename T> void Foo();
>
> void (*bar)() = &Foo<int>;
>
> ---
>
> I'll look into it, but in case someone else (Grzegorz, Chiba ?)
> has an idea, I'll send it to the list.

A little follow-up: after '&Foo' has been parsed, 'Parser::isTemplateArgs' is being called. However, as the comment above that method indicates:

/*
  template.args : '<' any* '>'

  template.args must be followed by '(' or '::'
*/

which obviously doesn't account for this case where a function pointer is passed, i.e. no terminating '('. Now I'm a bit lost. I guess the '(' constraint is used to disambiguate from a relational expression, i.e. if we drop it, we'll misinterpret other expressions. Could I simply add ';' and ',' to the list of symbols potentially following the template args ?

Maybe I should have a look into the new C++ parser used in gcc 3.4...

Any ideas ? Chiba ?

Kind regards,
Stefan
|
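The follower-token rule quoted above, extended with ';' and ',' as proposed, could be sketched roughly as follows. This is not the OpenC++ implementation of Parser::isTemplateArgs: the token vector and the function name looksLikeTemplateArgs are hypothetical stand-ins, and tokens such as '>>' are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the parser's token stream (not the OpenC++ Lex API).
using Tokens = std::vector<std::string>;

// Returns true if tokens[pos] is '<', a matching '>' exists, and the token
// right after that '>' is one of the accepted followers.
bool looksLikeTemplateArgs(const Tokens& tokens, std::size_t pos)
{
    if (pos >= tokens.size() || tokens[pos] != "<") return false;
    int depth = 0;
    for (std::size_t i = pos; i < tokens.size(); ++i) {
        if (tokens[i] == "<") {
            ++depth;
        } else if (tokens[i] == ">" && --depth == 0) {
            if (i + 1 >= tokens.size()) return false;
            const std::string& next = tokens[i + 1];
            // '(' and '::' come from the quoted comment; ';' and ',' are the
            // proposed additions.
            return next == "(" || next == "::" || next == ";" || next == ",";
        }
    }
    return false;  // no matching '>' found
}
```

With the extended list, the tokens of '&Foo<int>;' pass the check, while 'a < b;' still fails because no closing '>' is found. The check remains purely syntactic, so it cannot tell whether the name before '<' is a template at all.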
From: Grzegorz J. <ja...@ac...> - 2004-07-17 10:17:02
|
Hi,

Stefan Seefeld wrote:

[...]
>
> A little follow-up: after '&Foo' has been parsed, 'Parser::isTemplateArgs'
> is being called. However, as the comment above that method indicates:
>
> /*
>   template.args : '<' any* '>'
>
>   template.args must be followed by '(' or '::'
> */
>
> which obviously doesn't account for this case where a function pointer
> is passed, i.e. no terminating '('. Now I'm a bit lost. I guess the '('
> constraint is used to disambiguate from a relational expression, i.e. if
> we drop it, we'll misinterpret other expressions.

It is worse. 'f<1>(0)' will always parse as a template, even if 'f' is an int variable. :-(

> Could I simply add ';'
> and ',' to the list of symbols potentially following the template args ?

AFAICS it will not break things further.

> Maybe I should have a look into the new C++ parser used in gcc 3.4...

Honest parsing of templates requires the parser to be able to tell if an identifier is a template in the current scope, and this is how gcc does it (gcc/gcc/cp/parser.c, see cp_parser_lookup_name()).

BR
Grzegorz

>
> Any ideas ? Chiba ?
>
> Kind regards,
> Stefan
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by BEA Weblogic Workshop
> FREE Java Enterprise J2EE developer tools!
> Get your free copy of BEA WebLogic Workshop 8.1 today.
> http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click
> _______________________________________________
> Opencxx-users mailing list
> Ope...@li...
> https://lists.sourceforge.net/lists/listinfo/opencxx-users
|
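The 'f<1>(0)' case can be reproduced in a few lines. The sketch below is an illustration written for this point, not code from OpenC++; the function names callTemplate and compareVariable are made up. The same token sequence is a template call in one function and a chain of comparisons in the other, depending only on what 'f' names in the current scope. A compiler may warn about the chained comparison.

```cpp
#include <cassert>

// Case 1: 'f' is a function template, so 'f<1>(0)' is a call.
template <int N> int f(int x) { return x + N; }

int callTemplate() { return f<1>(0); }

// Case 2: a local int named 'f' hides the template, so the same tokens
// now mean ((f < 1) > (0)).
bool compareVariable()
{
    int f = 0;
    return f<1>(0);
}
```

A parser that decides from the following token alone will treat both functions the same way, which is why a name lookup step is needed.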
From: Stefan S. <se...@sy...> - 2004-07-17 14:02:42
|
Grzegorz Jakacki wrote:

> It is worse. 'f<1>(0)' will always parse as a template, even if 'f' is
> an int variable. :-(
>
>> Could I simply add ';'
>> and ',' to the list of symbols potentially following the template args ?
>
> AFAICS it will not break things further.
>
>> Maybe I should have a look into the new C++ parser used in gcc 3.4...
>
> Honest parsing of templates requires parser to be able to tell if an
> identifier is a template in current scope and this is how gcc does it
> (gcc/gcc/cp/parser.c, see cp_parser_lookup_name()).

Ah, now I understand again ! So, the solution is to add a type dictionary ('Environment' ?) to the Parser and then let the 'isTypeSpecifier' use that instead of just detecting built-in types. Right ?

Stefan
|
From: Grzegorz J. <ja...@ac...> - 2004-07-18 10:01:19
|
Stefan Seefeld wrote:
> Grzegorz Jakacki wrote:
>
>> It is worse. 'f<1>(0)' will always parse as a template, even if 'f' is
>> an int variable. :-(
>>
>>> Could I simply add ';'
>>> and ',' to the list of symbols potentially following the template args ?
>>
>> AFAICS it will not break things further.
>>
>>> Maybe I should have a look into the new C++ parser used in gcc 3.4...
>>
>> Honest parsing of templates requires parser to be able to tell if an
>> identifier is a template in current scope and this is how gcc does it
>> (gcc/gcc/cp/parser.c, see cp_parser_lookup_name()).
>
> Ah, now I understand again ! So, the solution is to add a type dictionary
> ('Environment' ?) to the Parser and then let the 'isTypeSpecifier' use that
> instead of just detecting built-in types. Right ?

For many cases this should work. However in general it will not.

The top-level environment is maintained by ClassWalker. Driver in the main loop runs Parser::rProgram() and feeds the obtained AST into ClassWalker. Parser::rProgram() returns an AST representing one top-level construct (definition or declaration), so one iteration analyzes one top-level construct. Only when ClassWalker traverses the AST is the information about types defined in it stuffed into Environment.

This is fair enough when there are no namespaces or nested templates. However, if the top-level construct is a namespace, then Environment does not have a clue about anything defined in it until it is fully parsed.

I don't think there is an easy and complete solution. I have one solution in mind, but it is neither cheap, nor easy.

BR
Grzegorz

>
> Stefan
|
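The two-phase flow described in this message can be pictured with a toy model. Everything below is a simplified assumption made for illustration: ToyAst, ToyEnvironment, parseTopLevelConstruct and walk stand in for the AST, Environment, Parser::rProgram() and ClassWalker, and do not reflect their real interfaces.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One top-level construct, reduced to the type names it declares, in order.
struct ToyAst { std::vector<std::string> declaredTypes; };

// Toy symbol table; only records which type names are known.
class ToyEnvironment
{
public:
    void registerType(const std::string& name) { types_.insert(name); }
    bool knows(const std::string& name) const { return types_.count(name) != 0; }
private:
    std::set<std::string> types_;
};

// Phase 1 (stand-in for Parser::rProgram()): builds the AST for a namespace
// that declares two types. Note that it never touches the environment.
ToyAst parseTopLevelConstruct()
{
    ToyAst ast;
    ast.declaredTypes.push_back("A");
    ast.declaredTypes.push_back("B");
    return ast;
}

// Phase 2 (stand-in for ClassWalker): registers every declaration in the AST.
void walk(const ToyAst& ast, ToyEnvironment& env)
{
    for (const std::string& name : ast.declaredTypes) env.registerType(name);
}
```

In this model, the environment learns about 'A' only after phase 1 has finished, so nothing parsed later inside the same construct could have asked about it.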
From: Stefan S. <se...@sy...> - 2004-07-18 14:36:43
|
Grzegorz Jakacki wrote:

>> Ah, now I understand again ! So, the solution is to add a type dictionary
>> ('Environment' ?) to the Parser and then let the 'isTypeSpecifier' use
>> that instead of just detecting built-in types. Right ?
>
> For many cases this should work. However in general it will not.

understood. Putting a type dictionary right into the parser means that the type recovery can't be done in a second pass, but instead has to be done as soon as a new type declaration has been detected.

The separation between parser and mop libraries doesn't make sense in that context any more...

> The top-level environment is maintained by ClassWalker. Driver in the
> main loop runs Parser::rProgram() and feeds the obtained AST into
> ClassWalker. Parser::rProgram() returns AST representing one top-level
> construct (definition or declaration), so one iteration analyzes one
> top-level construct. Only when ClassWalker traverses the AST, the
> information about types defined in it are stuffed into Environment.

what does 'ClassWalker' do ? The name suggests it is only inspecting class definitions, but now you are suggesting it deals with arbitrary type recovery.

> This is fair enough when there is no namespaces or nested templates.
> However, if top-level construct is a namespace, then Environment does
> not have a clue about anything defined in it until it is fully parsed.

hmm, I don't understand the design of Environment et al. yet. Why doesn't it 'have a clue' until it is 'fully parsed' ? In the example

namespace Foo
{
  typedef int Int;
  typedef std::vector<Int> IntVector;
}

Shouldn't it know 'Int' when it sees the typedef for IntVector ?

> I don't think there is an easy and complete solution. I have one
> solution in mind, but it is neither cheap, nor easy.

Please document the current design and the solutions you envision. It will help a lot to work out the remaining issues to make opencxx a very powerful tool that supports all standard C++.

More and more people use synopsis, and the bug reports about parse errors I get become more and more sophisticated, indicating that it becomes more and more important to make the parser work correctly not only for the most used C++ features, but really covering the full standard.

Regards,
Stefan
|
From: Grzegorz J. <ja...@ac...> - 2004-07-18 15:04:03
|
Stefan Seefeld wrote:
> Grzegorz Jakacki wrote:
>
>>> Ah, now I understand again ! So, the solution is to add a type
>>> dictionary ('Environment' ?) to the Parser and then let the
>>> 'isTypeSpecifier' use that instead of just detecting built-in types.
>>> Right ?
>>
>> For many cases this should work. However in general it will not.
>
> understood. Putting a type dictionary right into the parser means
> that the type recovery can't be done in a second pass, but instead
> has to be done as soon as a new type declaration has been detected.
>
> The separation between parser and mop libraries doesn't make sense
> in that context any more...

I think it still does. First of all, Parser should not depend on Environment, just on an interface that would allow it to get what it needs from the environment:

                      (iface)
+---------+     +--------------------+
| Parser  |---->| TemplatesResolver  |
+---------+     +--------------------+
                          |
                         /_\
                          |
                  +---------------+
                  |  Environment  |
                  +---------------+

It would allow at least to test the parser in separation from the MOP. Moreover, observe that this interface needs to cover only a part of Environment functionality, in particular it does not have to deal with variable declarations at all.

Even if we move all processing to a one-pass model, I would strongly recommend that the parser not depend on concrete MOP classes, but only on interfaces. It gives a chance to reuse the parser in another context, and also to test it without touching the rest of the system. A good example of such parser design is XercesC++ (XML parser from Apache Foundation).

>> The top-level environment is maintained by ClassWalker. Driver in the
>> main loop runs Parser::rProgram() and feeds the obtained AST into
>> ClassWalker. Parser::rProgram() returns AST representing one top-level
>> construct (definition or declaration), so one iteration analyzes one
>> top-level construct. Only when ClassWalker traverses the AST, the
>> information about types defined in it are stuffed into Environment.
>
> what does 'ClassWalker' do ? The name suggests it is only inspecting
> class definitions, but now you are suggesting it deals with arbitrary
> type recovery.

AFAIK it traverses the AST and registers all definitions and declarations in the Environment it maintains.

>> This is fair enough when there is no namespaces or nested templates.
>> However, if top-level construct is a namespace, then Environment does
>> not have a clue about anything defined in it until it is fully parsed.
>
> hmm, I don't understand the design of Environment et al. yet. Why doesn't
> it 'have a clue' until it is 'fully parsed' ? In the example
>
> namespace Foo
> {
>   typedef int Int;
>   typedef std::vector<Int> IntVector;
> }
>
> Shouldn't it know 'Int' when it sees the typedef for IntVector ?

The top-level loop will call the parser. The parser will return an AST representing all the above text. At this moment the text is already parsed, but the Environment object still does not know anything about Int or IntVector. Only now does the main loop call ClassWalker, which begins to traverse the AST. The traversal reaches the first typedef and registers Int in the top-level Environment, then the second typedef and registers IntVector.

So, Environment does not know about Int or IntVector until the parsing of the whole namespace ends.

>> I don't think there is an easy and complete solution. I have one
>> solution in mind, but it is neither cheap, nor easy.
>
> Please document the current design and the solutions you envision. It will
> help a lot to work out the remaining issues to make opencxx a very powerful
> tool that supports all standard C++.

I am putting it really high on my TODO list, I will try to post it this week. As for the existing design, however, my knowledge is limited to what I absorbed or figured out, so I am afraid there are going to be some dark corners.

> More and more people use synopsis, and the bug reports about parse errors
> I get become more and more sophisticated, indicating that it becomes more
> and more important to make the parser work correct not only for the most
> used C++ features, but really covering the full standard.

That's great news. There is still a lot of work, but let's keep moving forward.

BR
Grzegorz
|
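The dependency structure in the diagram might look like this in code. Only the name TemplatesResolver comes from the message; the member function isTemplateName and the classes ToyEnvironment and ToyParser are assumptions made for this sketch and do not describe the actual OpenC++ classes.

```cpp
#include <cassert>
#include <set>
#include <string>

// The interface from the diagram. The single query it offers is an assumption.
class TemplatesResolver
{
public:
    virtual ~TemplatesResolver() = default;
    virtual bool isTemplateName(const std::string& name) const = 0;
};

// Toy stand-in for Environment: one possible implementation of the interface.
class ToyEnvironment : public TemplatesResolver
{
public:
    void declareTemplate(const std::string& name) { templates_.insert(name); }
    bool isTemplateName(const std::string& name) const override
    {
        return templates_.count(name) != 0;
    }
private:
    std::set<std::string> templates_;
};

// Toy parser fragment: it sees only the interface, never ToyEnvironment.
class ToyParser
{
public:
    explicit ToyParser(const TemplatesResolver& resolver) : resolver_(resolver) {}

    // Decides whether a '<' following 'name' should open template arguments.
    bool opensTemplateArgs(const std::string& name) const
    {
        return resolver_.isTemplateName(name);
    }
private:
    const TemplatesResolver& resolver_;
};
```

Because ToyParser holds only a reference to the interface, a test can pass in a stub resolver instead of a full environment, which is the testing benefit mentioned above.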
From: Stefan S. <se...@sy...> - 2004-07-18 19:03:40
|
Grzegorz Jakacki wrote:

>> what does 'ClassWalker' do ? The name suggests it is only inspecting
>> class definitions, but now you are suggesting it deals with arbitrary
>> type recovery.
>
> AFAIK it traverses the AST and registers all definitions and
> declarations in the Environment it maintains.
>
>>> This is fair enough when there is no namespaces or nested templates.
>>> However, if top-level construct is a namespace, then Environment does
>>> not have a clue about anything defined in it until it is fully parsed.
>>
>> hmm, I don't understand the design of Environment et al. yet. Why doesn't
>> it 'have a clue' until it is 'fully parsed' ? In the example
>>
>> namespace Foo
>> {
>>   typedef int Int;
>>   typedef std::vector<Int> IntVector;
>> }
>>
>> Shouldn't it know 'Int' when it sees the typedef for IntVector ?
>
> The top-level loop will call parser. Parser will return AST representing
> all the above text. At this moment the text is already parsed, but
> Environment object still does not know anything about Int or IntVector.
> Only now the main loop calls ClassWalker, which begins to traverse the
> AST. The traversal reaches first typedef and registers Int in top-level
> Environment, then second typedef and registers IntVector.
>
> So, Environment does not know about Int or IntVector until the parsing
> of the whole namespace ends.

well, but as we just discovered, this doesn't work for C++ in general where certain ambiguities have to be resolved using the knowledge about previously declared types, variables, etc.

>> Please document the current design and the solutions you envision. It
>> will help a lot to work out the remaining issues to make opencxx a very
>> powerful tool that supports all standard C++.
>
> I am putting it really high on my TODO list, I will try to post it this
> week. As for existing design, however, my knowledge is limited to what I
> absorbed or figured out, so I am afraid there are going to be some dark
> corners.

Yeah. But if we have very specific questions I'm sure we can always get back to Chiba to help us out.

>> More and more people use synopsis, and the bug reports about parse errors
>> I get become more and more sophisticated, indicating that it becomes more
>> and more important to make the parser work correct not only for the most
>> used C++ features, but really covering the full standard.
>
> That's great news. There is still a lot of work, but let's keep moving
> forward.

Sure. I'm aiming for a new synopsis release within the coming week, and after that I'll start working on a refactoring of the opencxx code synopsis uses. I'm sure I'll have lots of questions in the process...

Regards,
Stefan
|
From: Shigeru C. <ch...@is...> - 2004-07-20 14:09:36
|
Hi,

Just a short comment.

From: Stefan Seefeld <se...@sy...>
Date: Sun, 18 Jul 2004 11:00:48 -0400

> > So, Environment does not know about Int or IntVector until the parsing
> > of the whole namespace ends.
>
> well, but as we just discovered, this doesn't work for C++ in general where
> certain ambiguities have to be resolved using the knowledge about previously
> declared types, variables, etc.

Yes, but maintaining the table of type and template names is really complicated. For example, the parser must be able to recognize Link as a type name in this code snippet:

typedef struct {
    Link* next;
    int value;
} Link;

(sorry if the syntax above is wrong. I haven't written C++ code for years. :<)

So my solution was to avoid maintaining such a table. Instead, the parser tries to parse given code in different ways till it gets a syntax tree without errors. So the parser does a lot of backtracking.

However, I'm not sure that this approach still works with the current specifications of C++. Remember that C++ had not supported templates yet when I wrote the parser.

Chiba
|
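The backtracking idea described here can be illustrated with a toy recognizer. It is a sketch under strong simplifications and not the OpenC++ parser: the grammar has only two alternatives, and the names BacktrackingParser, Kind and classify are invented for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Declaration, Expression, Error };

class BacktrackingParser
{
public:
    explicit BacktrackingParser(std::vector<std::string> tokens)
        : tokens_(std::move(tokens)) {}

    // Try each alternative in turn; rewind the position when one fails.
    Kind statement()
    {
        const std::size_t start = pos_;
        if (declaration()) return Kind::Declaration;
        pos_ = start;  // backtrack
        if (expressionStatement()) return Kind::Expression;
        pos_ = start;  // backtrack
        return Kind::Error;
    }

private:
    bool accept(const std::string& text)
    {
        if (pos_ < tokens_.size() && tokens_[pos_] == text) { ++pos_; return true; }
        return false;
    }
    bool startsWith(int (*classify)(int)) const
    {
        return pos_ < tokens_.size() && !tokens_[pos_].empty()
            && classify(static_cast<unsigned char>(tokens_[pos_][0])) != 0;
    }
    bool identifier() { if (startsWith(std::isalpha)) { ++pos_; return true; } return false; }
    bool number()     { if (startsWith(std::isdigit)) { ++pos_; return true; } return false; }
    bool operand()    { return identifier() || number(); }

    // declaration : identifier '*' identifier ';'
    bool declaration() { return identifier() && accept("*") && identifier() && accept(";"); }
    // expression statement : operand '*' operand ';'
    bool expressionStatement() { return operand() && accept("*") && operand() && accept(";"); }

    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
};

// Convenience wrapper (hypothetical) that classifies one token sequence.
Kind classify(std::vector<std::string> tokens)
{
    return BacktrackingParser(std::move(tokens)).statement();
}
```

In this toy grammar 'a * 2 ;' fails as a declaration and is then accepted as an expression. 'a * b ;' fits both alternatives, so the first one tried wins regardless of what 'a' really is, which is the kind of case where a table of declared names becomes necessary.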
From: Stefan S. <se...@sy...> - 2004-07-20 22:13:25
|
Hi Chiba,

thanks for joining in !

Shigeru Chiba wrote:

>>well, but as we just discovered, this doesn't work for C++ in general where
>>certain ambiguities have to be resolved using the knowledge about previously
>>declared types, variables, etc.
>
> Yes, but maintaining the table of type and template names is really
> complicated. For example, the parser must be able to recognize Link
> as a type name in this code snippet:
>
> typedef struct {
>     Link* next;
>     int value;
> } Link;
>
> (sorry if the syntax above is wrong. I haven't written C++ code for
> years. :<)

the syntax is wrong precisely because at the point where a variable of 'Link *' is generated nothing is known about 'Link'. The solution is to first issue a forward declaration of 'Link'. And that would indeed solve our problem, too (as all we want to know is that 'Link' is indeed a type, not what kind of type).

I believe that this is true in general for C++: either a type has previously been declared (forward declaration counts), or we have to provide hints (isn't that why the language contains the 'typename' keyword ?)

Kind regards,
Stefan
|
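One way to read the forward-declaration suggestion is sketched below. It deliberately uses a named struct instead of the anonymous typedef from the quoted snippet (an assumption made here, since combining a forward-declared 'struct Link' with a typedef of an unnamed struct to the same name would not compile in C++), and the function length is a hypothetical helper.

```cpp
#include <cassert>

struct Link;                    // forward declaration: 'Link' is now known to be a type

int length(const Link* head);   // pointers to the incomplete type can already be declared

struct Link                     // the full definition follows later
{
    Link* next;
    int value;
};

// Counts the nodes of a singly linked list.
int length(const Link* head)
{
    int count = 0;
    for (; head != nullptr; head = head->next) ++count;
    return count;
}
```

After the first line, a parser only needs to know that 'Link' names a type, which matches the point made above that the kind of type is irrelevant for disambiguation.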
From: Christophe A. <chr...@la...> - 2004-07-21 09:05:40
|
Forward declaration is not necessary with pointer (always a 4-byte unsigned integer for ia32), but you must use a tag name for the structure to resolve this lack of information :

typedef struct Link_s {
    struct Link_s *next;
    int value;
} Link;

Anyway, having a typedefed structure without a tag name is something permissible but bad because we are not sure how each compiler (gcc, vc++, etc.) would be able to mangle the name of the structure for a parameter in a global function for example. If someone knows, I would be glad to know it.

Regards.

----- Original Message -----
From: "Stefan Seefeld" <se...@sy...>
To: "Shigeru Chiba" <ch...@is...>
Cc: <ope...@li...>
Sent: Tuesday, July 20, 2004 8:10 PM
Subject: Re: [Opencxx-users] parse error on valid code

> Hi Chiba,
>
> thanks for joining in !
>
> Shigeru Chiba wrote:
>
> >>well, but as we just discovered, this doesn't work for C++ in general where
> >>certain ambiguities have to be resolved using the knowledge about previously
> >>declared types, variables, etc.
> >
> > Yes, but maintaining the table of type and template names is really
> > complicated. For example, the parser must be able to recognize Link
> > as a type name in this code snippet:
> >
> > typedef struct {
> >     Link* next;
> >     int value;
> > } Link;
> >
> > (sorry if the syntax above is wrong. I haven't written C++ code for
> > years. :<)
>
> the syntax is wrong precisely because at the point where a variable of 'Link *'
> is generated nothing is known about 'Link'. The solution is to first issue a forward
> declaration of 'Link'. And that would indeed solve our problem, too (as all we
> want to know is that 'Link' is indeed a type, not what kind of type).
>
> I believe that this is true in general for C++: either a type has previously
> been declared (forward declaration counts), or we have to provide hints (isn't
> that why the language contains the 'typename' keyword ?)
>
> Kind regards,
> Stefan
|
From: Stefan S. <se...@sy...> - 2004-07-21 11:20:59
|
Christophe Avoinne wrote:
> Forward declaration is not necessary with pointer (always a 4-byte unsigned
> integer for ia32), but you must use a tag name for the structure to resolve
> this lack of information :
>
> typedef struct Link_s {
>     struct Link_s *next;
>     int value;
> } Link;

interesting. I wasn't sure how standard compliant such a construct is (knowing that it comes from C...).

Anyways, the important point here seems to be that the 'struct' tag serves the same purpose as the 'typename' keyword, i.e. it provides a hint to the parser to disambiguate the expression.

Regards,
Stefan
|
From: Christophe A. <chr...@la...> - 2004-07-21 22:11:17
|
Well, more precisely : in C you always need the prefix "struct" for a struct name.

for example :

struct object { ... };

struct object *new_object();
void init_object(struct object *object);

if you want to avoid prefixing "struct" you must use a typedef declaration :

typedef struct object { ... } object;

Now, in C++, you are not forced to prefix "struct", just give the name and it knows whether it is a struct name, but indeed I noticed that g++ is okay when using the prefix "struct" explicitly (prefix "class" also works most of the time). My opinion is that it should be standard.

Anyway if you want OpenC++ to be able to parse C source, you absolutely need to have this prefix, because using "object" without its typedef declaration would lead a normal C compiler to think it is an unknown type, whereas a C++ compiler would recognize it as a "struct object".

Regards

----- Original Message -----
From: "Stefan Seefeld" <se...@sy...>
To: <ope...@li...>
Sent: Wednesday, July 21, 2004 1:17 PM
Subject: Re: [Opencxx-users] parse error on valid code

> Christophe Avoinne wrote:
> > Forward declaration is not necessary with pointer (always a 4-byte unsigned
> > integer for ia32), but you must use a tag name for the structure to resolve
> > this lack of information :
> >
> > typedef struct Link_s {
> >     struct Link_s *next;
> >     int value;
> > } Link;
>
> interesting. I wasn't sure how standard compliant such a construct is
> (knowing that it comes from C...).
> Anyways, the important point here seems to be that the 'struct' tag
> serves the same purpose as the 'typename' keyword, i.e. it provides
> a hint to the parser to disambiguate the expression.
>
> Regards,
> Stefan
|
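The role of the "struct" prefix as a hint can also be seen in C++ itself. In the sketch below, the Link_s typedef is the one from the thread, while the member 'id' and the function idOf are assumptions added for illustration. Inside the function the parameter named 'object' hides the type name, and the "struct" prefix is what makes the type reachable again.

```cpp
#include <cassert>

// The tagged typedef from the thread; valid in both C and C++.
typedef struct Link_s {
    struct Link_s* next;
    int value;
} Link;

// A struct whose name is reused as a parameter name, as in init_object above.
// The member 'id' is hypothetical.
struct object { int id; };

int idOf(struct object* object)
{
    // Here 'object' alone refers to the parameter, so the prefix 'struct'
    // is required to name the type.
    struct object copy = *object;
    return copy.id;
}
```

The prefix tells the compiler, before any lookup of ordinary names, that a type is meant, which is the same kind of hint as discussed for 'typename'.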
From: Shigeru C. <ch...@is...> - 2004-07-21 15:18:39
|
Hi Stefan,

From: Stefan Seefeld <se...@sy...>
Subject: Re: [Opencxx-users] parse error on valid code
Date: Tue, 20 Jul 2004 14:10:40 -0400

> > (sorry if the syntax above is wrong. I haven't written C++ code for
> > years. :<)
>
> the syntax is wrong precisely because at the point where a variable of
> 'Link *' is generated nothing is known about 'Link'. The solution is
> to first issue a forward declaration of 'Link'.

Ahh.... my good days as a C++ programmer are over....

> I believe that this is true in general for C++: either a type has previously
> been declared (forward declaration counts), or we have to provide hints
> (isn't that why the language contains the 'typename' keyword ?)

Yes, parsing C++ code should not need backward references with respect to types. So now I must say that I cannot remember why I chose the approach currently used in OpenC++. However, there should be some examples in which the lexical analyzer must help the parser to record all the declared type names etc. Thus, as far as I remember, the code of the lexical analyzer of gcc 2.x(?) was quite tangled with the code of the parser. I didn't like to make the OpenC++ code tangled.

Hope this info helps you guys. I think the parser needs to maintain the type name table for completely supporting templates.

Chiba
|
From: Grzegorz J. <ja...@ac...> - 2004-07-21 23:45:51
|
Hi all,

Shigeru Chiba wrote:

[...]
>
> Yes, parsing C++ code should not need backward references with respect to
> types.

AFAIK with this exception:

struct A
{
    void f(B);
    typedef B int;
};

> So now I must say that I cannot remember why I chose the approach
> currently used in OpenC++. However, there should be some examples in
> which the lexical analyzer must help the parser to record all the
> declared type names etc. Thus, as far as I remember, the code of the
> lexical analyzer of gcc 2.x(?) was quite tangled with the code of the
> parser. I didn't like to make the OpenC++ code tangled. Hope this info
> helps you guys. I think the parser needs to maintain the type name
> table for completely supporting templates.

I think we can keep the OpenC++ parser clean with this design:

                  <<iface>>
+--------+     +--------------+
| Parser |---->| TypeResolver |
+--------+     +--------------+

Where TypeResolver instance provides all the functionality requiring maintaining the type name table. Perhaps ClassWalker or Environment can be made an implementation of TypeResolver. That way code would stay not tangled (i.e. Parser still does not implement type name table, nor does it depend on concrete classes from the next layers, like Environment or ClassWalker).

Best regards
Grzegorz

>
> Chiba
|
From: Stefan S. <se...@sy...> - 2004-07-22 00:19:11
|
Grzegorz Jakacki wrote:
> Hi all,
>
> Shigeru Chiba wrote:
>
> [...]
>
>>Yes, parsing C++ code should not need backward references with respect to
>>types.
>
> AFAIK with this exception:
>
> struct A
> {
>     void f(B);
>     typedef B int;
> };

huh ? Make that

struct A
{
    typedef int B;
    void f(B);
};

and everything is fine (and covered by what was already said).

> I think we can keep the OpenC++ parser clean with this design:
>
>                   <<iface>>
> +--------+     +--------------+
> | Parser |---->| TypeResolver |
> +--------+     +--------------+
>
> Where TypeResolver instance provides all the functionality requiring
> maintaining the type name table. Perhaps ClassWalker or Environment can
> be made an implementation of TypeResolver. That way code would stay not
> tangled (i.e. Parser still does not implement type name table, nor does
> it depend on concrete classes from the next layers, like Environment or
> ClassWalker).

yes, agreed. The Parser is complex enough as it is right now. What is needed (and what the Parser will have to interact with) is a symbol resolver that can report whether a symbol is known, and what its (meta) type is, i.e. one of 'type', 'variable', 'compile-time const'. This symbol resolver needs to know the rules for symbol lookup with multiple scopes. That's quite hairy in itself.

However, I don't see how ClassWalker relates to that. Well, if ClassWalker currently serves to report back known symbols, that's one thing. However, symbol lookup mustn't be done in a second AST traversal parse, but (as we now seem to agree) as soon as information about new symbols gets available. Parser and Symbol resolver really need to work hand in hand. Just think about declarations such as

template <size_t FOOBAR> struct Baz {};

where the parser needs to know that FOOBAR is a compile-time constant. It's not even a type !

Regards,
Stefan
|
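A symbol resolver along the lines described in this message could start from something like the following. It is a deliberately small sketch: the names SymbolKind and ScopedSymbols are hypothetical, scopes are a plain stack, and real C++ lookup rules (namespaces, class scopes, using-declarations, argument-dependent lookup) are not modelled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// The three "meta types" named above, plus a value for unknown symbols.
enum class SymbolKind { Unknown, Type, Variable, CompileTimeConst };

class ScopedSymbols
{
public:
    ScopedSymbols() { scopes_.emplace_back(); }  // start with the global scope

    void enterScope() { scopes_.emplace_back(); }
    void leaveScope() { if (scopes_.size() > 1) scopes_.pop_back(); }

    // Records a symbol in the innermost scope as soon as it is declared.
    void declare(const std::string& name, SymbolKind kind) { scopes_.back()[name] = kind; }

    // Searches from the innermost scope outwards.
    SymbolKind lookup(const std::string& name) const
    {
        for (auto scope = scopes_.rbegin(); scope != scopes_.rend(); ++scope) {
            auto found = scope->find(name);
            if (found != scope->end()) return found->second;
        }
        return SymbolKind::Unknown;
    }

private:
    std::vector<std::map<std::string, SymbolKind>> scopes_;
};
```

For the Baz declaration above, a parser using such a table would open a scope for the template parameter list, declare FOOBAR as a compile-time constant there, and drop it again once the declaration ends.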
From: Grzegorz J. <ja...@ac...> - 2004-07-22 11:38:43
|
Stefan Seefeld wrote:

> Grzegorz Jakacki wrote:
>
>> Hi all,
>>
>> Shigeru Chiba wrote:
>>
>> [...]
>>
>>> Yes, parsing C++ code should not need backward references with respect to
>>> types.
>>
>> AFAIK with this exception:
>>
>>   struct A
>>   {
>>     void f(B);
>>     typedef B int;
>>   };
>
> huh ?

Well, please ignore. If I wasn't answering in the morning I could
pretend I was too tired.

[...]

>> I think we can keep the OpenC++ parser clean with this design:
>>
>>                    <<iface>>
>>   +--------+     +--------------+
>>   | Parser |---->| TypeResolver |
>>   +--------+     +--------------+
>>
>> Where TypeResolver instance provides all the functionality requiring
>> maintaining the type name table. Perhaps ClassWalker or Environment can
>> be made an implementation of TypeResolver. That way code would stay not
>> tangled (i.e. Parser still does not implement type name table, nor does
>> it depend on concrete classes from the next layers, like Environment or
>> ClassWalker).
>
> yes, agreed. The Parser is complex enough as it is right now. What is
> needed (and what the Parser will have to interact with) is a symbol
> resolver that can report whether a symbol is known, and what its (meta) type
> is, i.e. one of 'type', 'variable', 'compile-time const'.
> This symbol resolver needs to know the rules for symbol lookup with multiple
> scopes. That's quite hairy in itself.
>
> However, I don't see how ClassWalker relates to that.

Currently it is an entry point to the symbol table.

> Well, if ClassWalker
> currently serves to report back known symbols, that's one thing. However,
> symbol lookup mustn't be done in a second AST traversal parse, but (as
> we now seem to agree) as soon as information about new symbols gets
> available. Parser and Symbol resolver really need to work hand in hand.
> Just think about declarations such as
>
>   template <size_t FOOBAR> struct Baz {};
>
> where the parser needs to know that FOOBAR is a compile-time constant.
> It's not even a type !

I don't get this example, but I think that in general I agree with you.

BR
Grzegorz
|
From: Stefan S. <se...@sy...> - 2004-07-26 23:03:16
|
Shigeru Chiba wrote:

> No, my point is not an issue of OO design. It is of parsing algorithm.
>
> Since the parser of OpenC++ is LL(k) --- a backtracking parser ---,
> the token analyzer allows the parser to look up several tokens in
> advance. The parser can read several tokens and rewind back to
> several tokens before. If the rewinding happens, maybe TypeResolver
> must also discard some recorded type names to distinguish duplicated
> type declarations from revisited type declarations. Therefore,
> TypeResolver must work with not only Parser but also TokenAnalyzer.

I thought that new types would only be registered to the TypeResolver
when the parser can confirm that what it is looking at is indeed a type,
and not an expression. That would mean that while arbitrary 'look ahead' is
required, tokens wouldn't be consumed until all ambiguities are resolved
for the current token sequence, and thus, all type instantiations are final.

By the way: what is a TokenAnalyzer ? How is it related to Lexer and Parser ?

> I'm now remembering my design decision. GCC used a LR(1) parser,
> i.e. bison parser (I don't know whether or not gcc3 still uses bison).

no, gcc 3.4 has a completely (manually) rewritten parser for C++.

Regards,
Stefan
|
From: Shigeru C. <ch...@is...> - 2004-07-27 11:16:12
|
From: Stefan Seefeld <se...@sy...>
Subject: Re: [Opencxx-users] parse error on valid code
Date: Mon, 26 Jul 2004 19:00:12 -0400

> > Since the parser of OpenC++ is LL(k) --- a backtracking parser ---,
> > the token analyzer allows the parser to look up several tokens in
> > advance. The parser can read several tokens and rewind back to
> > several tokens before. If the rewinding happens, maybe TypeResolver
> > must also discard some recorded type names to distinguish duplicated
> > type declarations from revisited type declarations. Therefore,
> > TypeResolver must work with not only Parser but also TokenAnalyzer.
>
> I thought that new types would only be registered to the TypeResolver
> when the parser can confirm that what it is looking at is indeed a type,
> and not an expression. That would mean that while arbitrary 'look ahead' is
> required, tokens wouldn't be consumed until all ambiguities are resolved
> for the current token sequence, and thus, all type instantiations are
> final.

Yes, in theory! But due to the complexity of the C++ grammar, the
OpenC++ parser sometimes backtracks in a complex way. So please be
careful, although I don't think it's difficult.

> By the way: what is a TokenAnalyzer ? How is it related to Lexer and Parser ?

Ah, I mean Lexer.

> > I'm now remembering my design decision. GCC used a LR(1) parser,
> > i.e. bison parser (I don't know whether or not gcc3 still uses bison).
>
> no, gcc 3.4 has a completely (manually) rewritten parser for C++.

I see. I don't know today's fashion, but using bison/yacc for
writing a compiler such as gcc 2 (not 3!) was the right decision until
the mid-1990s, when the C++ grammar was extended beyond LR(1).

Chiba
|
From: Gabriel D. R. <gd...@in...> - 2004-07-27 11:19:14
|
Shigeru Chiba <ch...@is...> writes:

| No, my point is not an issue of OO design. It is of parsing algorithm.
|
| Since the parser of OpenC++ is LL(k) --- a backtracking parser ---,
| the token analyzer allows the parser to look up several tokens in
| advance. The parser can read several tokens and rewind back to
| several tokens before. If the rewinding happens, maybe TypeResolver
| must also discard some recorded type names to distinguish duplicated
| type declarations from revisited type declarations. Therefore,
| TypeResolver must work with not only Parser but also TokenAnalyzer.
|
| I'm now remembering my design decision. GCC used a LR(1) parser,
| i.e. bison parser (I don't know whether or not gcc3 still uses bison).

All GCC-3.x.y, x < 4, do. GCC-3.4.x uses a handwritten recursive
descent parser.

| However, I didn't think that LR(1) is appropriate for C++ or that C++
| grammar can be parsed by a pure LR(1) parser. As we are talking now,

Agreed. That is extensively discussed in "Design and Evolution of C++".

| a C++ parser based on LR(1) needs a table of type names. However,

I think even a LL(k) parser needs such a thing.

| writing a LR(1) grammar that can maintain such a type-name table is a
| really messy job. So, as far as I remember, the parser of GCC does
| not do this job but the token analyzer of GCC does although the code
| of the token analyzer of GCC is still complicated.
|
| So my solution when I wrote OpenC++ was to write a LL(k) parser, which
| is a LL parser that may perform backtracking. Then, since LL(k)
| parser is really powerful, a LL(k) parser does not need to maintain a
| type-name table for parsing the C++ grammar at that time. That's why
| the OpenC++ parser does not maintain a type-name table.

Pardon my ignorance, but: Did you really handle full C++ (at the
time), or a specific subset? The reason I'm asking is that if you do
not know which names are type-names, e.g.

    f(a);

?

| Now, almost 10 years have passed. Sure, a LL(k) parser needs a
| type-name table for parsing the current grammar of C++. So I agree to
| implement a type-name table but please note that the OpenC++ parser
| may backtrack. So the type-name table must be aware of that
| backtracking.

-- Gaby
|
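One way to make a type-name table "aware of backtracking", as requested above, is to let the parser take a mark before a speculative parse and roll the table back to that mark when it rewinds the token stream. The sketch below assumes exactly that scheme. `BacktrackingTypeTable`, `mark`, and `rollback` are hypothetical names, and nothing here is taken from the OpenC++ Lexer or Parser.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Type-name table that records declarations in order, so that everything
// registered after a given point can be discarded again.
class BacktrackingTypeTable {
public:
    using Mark = std::size_t;

    // Taken by the parser before it starts a speculative parse.
    Mark mark() const { return names_.size(); }

    // Called when the parser rewinds: forget every name declared since 'm'.
    void rollback(Mark m) {
        assert(m <= names_.size());
        names_.erase(names_.begin() + static_cast<std::ptrdiff_t>(m),
                     names_.end());
    }

    void declareType(const std::string& name) { names_.push_back(name); }

    bool isTypeName(const std::string& name) const {
        return std::find(names_.begin(), names_.end(), name) != names_.end();
    }

private:
    std::vector<std::string> names_;
};
```

With this scheme a declaration that is revisited after a rewind is registered once more from a clean state, so it is not confused with a genuine duplicate declaration. The alternative raised earlier in the thread, registering a name only once the parser has committed to a declaration, avoids the rollback but requires every speculative path to be free of side effects.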
From: Shigeru C. <ch...@is...> - 2004-07-27 12:26:42
|
From: Gabriel Dos Reis <gd...@in...>
Subject: Re: [Opencxx-users] parse error on valid code
Date: 27 Jul 2004 13:20:13 +0200

> | So my solution when I wrote OpenC++ was to write a LL(k) parser, which
> | is a LL parser that may perform backtracking. Then, since LL(k)
> | parser is really powerful, a LL(k) parser does not need to maintain a
> | type-name table for parsing the C++ grammar at that time. That's why
> | the OpenC++ parser does not maintain a type-name table.
>
> Pardon my ignorance, but: Did you really handle full C++ (at the
> time), or a specific subset. The reason I'm asking is that if you do
> not know which names are type-names, e.g.
>
>     f(a);
>
> ?

The answer depends on what a parser should do. :)

The OpenC++ parser accepts the statement above as a correct program
but never gives the semantics of that statement since it does not know
whether or not f is a type name. ClassWalker traverses the syntax
tree to maintain a type-name table and give the semantics to
statements.

So maybe I should say the OpenC++ parser is not a complete parser. It
accepts not only correct C++ programs but also some wrong ones and
produces a syntax tree. Error messages for the wrong programs are
reported by ClassWalker etc. (at the next stage). The grammar dealt
with by the OpenC++ parser is a bit looser than the correct(?) one.
But I think the border between syntax and semantics is ambiguous, at
least from the implementation viewpoint.

Chiba
|
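The ambiguity behind `f(a);` can be seen in plain standard C++, without reference to any parser: the same token sequence is a function call when `f` names a function, and a declaration of `a` when `f` names a type. The namespaces and function names below are made up for the illustration.

```cpp
#include <cassert>

namespace f_is_function {
    int calls = 0;
    void f(int) { ++calls; }

    // Returns how many times f was called by the statement "f(a);".
    int demo() {
        int a = 1;
        int before = calls;
        f(a);               // expression statement: calls f with argument a
        return calls - before;
    }
}

namespace f_is_type {
    struct f { int value = 7; };

    // Returns the member of the object declared by the statement "f(a);".
    int demo() {
        f(a);               // declaration: same as "f a;", a is a default f
        return a.value;
    }
}

// Both readings are valid; only the meaning of 'f' in scope differs.
void check_both_readings() {
    assert(f_is_function::demo() == 1);
    assert(f_is_type::demo() == 7);
}
```

A parser without a type-name table can still build one tree shape for both cases and leave the decision to a later pass, which is the division of labour described in the message above. Some compilers warn about the redundant parentheses in the declaration form.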
From: Shigeru C. <ch...@is...> - 2004-07-26 14:13:30
|
No, my point is not an issue of OO design. It is of parsing algorithm.

Since the parser of OpenC++ is LL(k) --- a backtracking parser ---,
the token analyzer allows the parser to look up several tokens in
advance. The parser can read several tokens and rewind back to
several tokens before. If the rewinding happens, maybe TypeResolver
must also discard some recorded type names to distinguish duplicated
type declarations from revisited type declarations. Therefore,
TypeResolver must work with not only Parser but also TokenAnalyzer.

I'm now remembering my design decision. GCC used a LR(1) parser,
i.e. bison parser (I don't know whether or not gcc3 still uses bison).
However, I didn't think that LR(1) is appropriate for C++ or that C++
grammar can be parsed by a pure LR(1) parser. As we are talking now,
a C++ parser based on LR(1) needs a table of type names. However,
writing a LR(1) grammar that can maintain such a type-name table is a
really messy job. So, as far as I remember, the parser of GCC does
not do this job but the token analyzer of GCC does although the code
of the token analyzer of GCC is still complicated.

So my solution when I wrote OpenC++ was to write a LL(k) parser, which
is a LL parser that may perform backtracking. Then, since LL(k)
parser is really powerful, a LL(k) parser does not need to maintain a
type-name table for parsing the C++ grammar at that time. That's why
the OpenC++ parser does not maintain a type-name table.

Now, almost 10 years have passed. Sure, a LL(k) parser needs a
type-name table for parsing the current grammar of C++. So I agree to
implement a type-name table but please note that the OpenC++ parser
may backtrack. So the type-name table must be aware of that
backtracking.

Chiba

From: Grzegorz Jakacki <ja...@ac...>
Subject: Re: [Opencxx-users] parse error on valid code
Date: Thu, 22 Jul 2004 07:41:13 +0800

> > So now I must say that I cannot remember why I chose the approach
> > currently used in OpenC++. However, there should be some examples in
> > which the lexical analyzer must help the parser to record all the
> > declared type names etc. Thus, as far as I remember, the code of the
> > lexical analyzer of gcc 2.x(?) was quite tangled with the code of the
> > parser. I didn't like to make the OpenC++ code tangled. Hope this info
> > helps you guys. I think the parser needs to maintain the type name
> > table for completely supporting templates.
>
> I think we can keep the OpenC++ parser clean with this design:
>
>                    <<iface>>
>   +--------+     +--------------+
>   | Parser |---->| TypeResolver |
>   +--------+     +--------------+
>
> Where TypeResolver instance provides all the functionality requiring
> maintaining the type name table. Perhaps ClassWalker or Environment can
> be made an implementation of TypeResolver. That way code would stay not
> tangled (i.e. Parser still does not implement type name table, nor does
> it depend on concrete classes from the next layers, like Environment or
> ClassWalker).
|