From: Stefan S. <sse...@ar...> - 2004-08-13 13:51:44
|
Hi Gilles,

Gilles J. Seguin wrote:
>On Thu, 2004-08-12 at 21:42, Stefan Seefeld wrote:
>>What are the semantics (goal in general, invariants, etc.)
>>of the following methods:
>>
>>* Encoding::GetBaseName()
>>* Encoding::ResolveTypedefName()
>>* TypeInfo::Reference()
>>* TypeInfo::Dereference()
>>* TypeInfo::Normalize()
>>* TypeInfo::SkipCv()
>>* TypeInfo::SkipName()
>>* TypeInfo::ResolveTypedef()
>>
>>Thanks for any clarification !
>
>Allo stephan,
>
>One word, mangling.

yeah, I know. I do understand the principles of type encoding and name
mangling in general. My problem is with the specifics of the above methods.

To explain my question a bit better: right now everything in opencxx is done
with garbage collected character buffers. That's almost as bad as passing
void pointers around. I'd like to replace 'char *' by 'const char *' where
ptrees point into the file buffer that is being parsed. For the encodings
I'm using std::basic_string<unsigned char>.

Now, there are public methods that generate new 'derived' encodings from old
ones (the same way you can derive types). However, some methods are merely
used to parse an encoding, i.e. they don't need to create a new encoding but
simply return a pointer *into* an existing encoding (I'm of course using
iterators for that).

Take the 'SkipCv' method: it sounds as if it returns the sub-encoding with
potential 'cv' tags stripped off. An ideal candidate for an iterator, one
would think. However, if you look into the code, you realize that there's
quite a bit more going on. And that's exactly what I'd like to understand
in detail... :-)

Regards,
Stefan
|
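The "return an iterator into the existing encoding" idea from the message
above can be sketched as follows. This is only an illustration under stated
assumptions: the tags 'C' (const) and 'V' (volatile) are assumed, the names
skip_cv_prefix and without_cv are hypothetical, and std::string stands in for
std::basic_string<unsigned char> to keep the sketch portable. It deliberately
leaves out the typedef handling that the real SkipCv() also performs.

```cpp
#include <cassert>
#include <string>

// Assumed encoding alphabet for this sketch: 'C' = const, 'V' = volatile.
using Encoding = std::string;

// Returns an iterator to the first byte that is not a cv tag.
// Nothing is allocated; the result points into `enc`.
inline Encoding::const_iterator skip_cv_prefix(const Encoding& enc)
{
    Encoding::const_iterator it = enc.begin();
    while (it != enc.end() && (*it == 'C' || *it == 'V'))
        ++it;
    return it;
}

// Copies the sub-encoding only when a caller really needs an owning value.
inline Encoding without_cv(const Encoding& enc)
{
    return Encoding(skip_cv_prefix(enc), enc.end());
}
```

A pure parsing helper like skip_cv_prefix can be const-correct and cheap,
whereas anything that must look up typedefs needs an environment and may have
to build a new encoding.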
From: Stefan S. <sse...@ar...> - 2004-08-13 17:22:52
|
Hi Grzegorz,

thanks for taking this up :-)

I'll try to rephrase the documentation. Let's iterate over it until it is
correct. Once everything is covered and correct, it should be put into the
code, so synopsis can extract it into a reference manual :-)

Grzegorz Jakacki wrote:
>> * Encoding::GetBaseName()
>
> This is now EncodingUtil::GetBaseName(). Comment says:
>
> // GetBaseName() returns "Foo" if ENCODE is "Q[2][1]X[3]Foo", for
> // example.
> // If an error occurs, the function returns 0.
>
> Let me know what is unclear.

Here is my rephrased doc:

The result of GetBaseName is a substring of the 'encoding' argument.
[but what is 'base' ?]
The Environment will point to the scope containing the declaration.

>> * Encoding::ResolveTypedefName()
>
> Also has been moved to EncodingUtil::ResolveTypedefName().
>
> Takes environment 'env' and typedef encoding ('name' & 'len'), returns
> environment in which the given typedef is defined.

so the precondition is that 'name' is a typedef ?

By the way, I wonder whether this method can be reimplemented without
TypeInfo. AFAICS it's the only place where Encoding depends on TypeInfo,
thus introducing a circular dependency.

> TypeInfo --- this class is intended to represent types, however the
> representation is not unique, because:
>
> - typedefs are expanded lazily (thus many operations need
>   an environment, to be able to look up typedef definitions),
> - dereferences ('*') are applied lazily.

yeah. Is this laziness really necessary ? Does it gain performance ?
I'm just wondering because without it (i.e. putting 'Normalize' into the
constructor), most methods could become 'const'...
Or alternatively the inner representation (encoding) could become 'mutable',
such that normalizing a TypeInfo keeps the invariants intact and thus can
semantically be considered a no-op.
>> * TypeInfo::Reference()
>
> Precondition: *this represents some type T
> Postcondition: *this represents type T*

fine

>> * TypeInfo::Dereference(t)
>
> Precondition: *this represents some type equivalent to T*
> Postcondition: *this represents type T

fine

>> * TypeInfo::Normalize()
>
> [I fail to understand this function clearly; my findings:]
> - strips top-level cv-qualifiers
> - if *this represents dereferenced function pointer changes *this
>   so that it represents function return type,
> - if *this represents dereferenced typedef, expands typedef and
>   tries to proceed with dereferencing
>
> Name suggests that the actual represented type should not change, although
> its representation may (e.g. typedefs are expanded if necessary). However
> some details are mysterious (member function pointers for instance).

This method is indeed the most puzzling: it only operates on temporary
variables, so what does it *really* do ? It looks like the lazy evaluation
of (de)reference and typedefs, but the result is never used, as the member
variables aren't reset to the values of the local variables...?!?

>> * TypeInfo::SkipCv()
>
> Strips all top-level cv-qualifiers from type represented by *this; if after
> stripping *this represents a typedef, expands typedef and proceeds.

good. (though here again: all the complexity stems from the fact that refs
and typedefs are handled lazily. If they weren't, the implementation would
become *much* simpler !)

>> * TypeInfo::SkipName(encode, e)
>
> Preconditions:
>  - encode points into cstring which encodes a typeinfo.
>  - encode points to the beginning of class or template name within
>    the encoded typeinfo cstring
> Postcondition:
>  - return value points immediately after the class or template name
>    in the encoded typeinfo cstring; if cstring contains typedef names,
>    they are expanded in environment e
>
>> * TypeInfo::ResolveTypedef(e, ptr, resolvable)
>
> Preconditions:
>  - ptr points to the beginning of a typedef name in an encoding
>  - e is an environment, in which the typedef occurs (lookup begins
>    in this environment)
> Postconditions:
>  - if typedef can be correctly looked up, *this represents the same
>    type, but with the typedef name expanded.
>  - otherwise *this is unchanged if 'resolvable == false' or
>    becomes unknown (special state) otherwise.

nice explanation !

> Hope this helps.

It does quite a lot. Thanks !

Best regards,
Stefan
|
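The Reference() / Dereference() contracts and the "applied lazily" remark
discussed above can be illustrated with a toy class. This is a sketch and not
the OpenC++ TypeInfo: the class name LazyType, the member RawEncoding(), the
pending counter and the 'P' (pointer) tag are all assumptions made for the
example, and typedef expansion is left out entirely.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy model of lazily applied (de)references; 'P' = "pointer to" (assumed).
class LazyType {
public:
    explicit LazyType(std::string encoding)
        : encoding_(std::move(encoding)), pending_(0) {}

    // Postcondition: represents T* if it represented T before.
    void Reference()   { ++pending_; }
    // Precondition: represents some T*. Postcondition: represents T.
    void Dereference() { --pending_; }

    // Applies the pending operations to the stored encoding.
    // Returns false if a dereference meets a non-pointer type.
    bool Normalize()
    {
        while (pending_ > 0) {
            encoding_.insert(encoding_.begin(), 'P');
            --pending_;
        }
        while (pending_ < 0) {
            if (encoding_.empty() || encoding_[0] != 'P') return false;
            encoding_.erase(encoding_.begin());
            ++pending_;
        }
        return true;
    }

    // The stored representation, which may not be normalized yet.
    const std::string& RawEncoding() const { return encoding_; }

private:
    std::string encoding_;
    int pending_; // > 0: references to add, < 0: dereferences to apply
};
```

In this toy the represented type is the pair (encoding, pending count), so
the same type has several representations until Normalize() runs, which is
the non-uniqueness mentioned in the quoted description.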
From: Grzegorz J. <ja...@ac...> - 2004-08-14 10:43:49
|
Hi,

Stefan Seefeld wrote:
> Hi Grzegorz,
>
> thanks for taking this up :-)

No problem.

> I'll try to rephrase the documentation. Let's
> iterate over it until it is correct.
>
> Once everything is covered and correct, it should
> be put into the code, so synopsis can extract it
> into a reference manual :-)

That's what I had in mind.

> Grzegorz Jakacki wrote:
>
>>>* Encoding::GetBaseName()
>>
>>This is now EncodingUtil::GetBaseName(). Comment says:
>>
>>// GetBaseName() returns "Foo" if ENCODE is "Q[2][1]X[3]Foo", for
>>// example.
>>// If an error occurs, the function returns 0.
>>
>>Let me know what is unclear.
>
> Here is my rephrased doc:
>
> The result of GetBaseName is a substring of the 'encoding' argument.
> [but what is 'base' ?]
> The Environment will point to the scope containing the declaration.

Preconditions:
 - (encode, len@pre) is encoding of type T

Postconditions:
 - U is type T with qualification stripped, assuming all qualifying
   typedefs and templates were looked up starting from environment
   'env'@pre
 - 'env'@post is environment in which U is declared.
 - (return_value, len@post) is encoding of U

Example:
   Assuming (encode, len@pre) is "Q[2][1]X[3]Foo" (i.e. 'X::Foo'),
   (return_value, len@post) is "[3]Foo" (i.e. 'Foo').

>>>* Encoding::ResolveTypedefName()
>>
>>Also has been moved to EncodingUtil::ResolveTypedefName().
>>
>>Takes environment 'env' and typedef encoding ('name' & 'len'), returns
>>environment in which the given typedef is defined.
>
> so the precondition is that 'name' is a typedef ?

I think the precondition is that (name, len) is a name (not sure what to
call it really), e.g. typedef name, class name, but not e.g. pointer type.

> By the way, I wonder whether this method can be reimplemented without
> TypeInfo.

Looks like so, but I am not 100% sure. Perhaps it is just enough to replace

   bind->GetType(tinfo, env);
   c = tinfo.ClassMetaobject();

with

   c = bind->ClassMetaobject()

???
['env' is anyway reset to 0 in a moment]

> AFAICS it's the only place where Encoding depends on TypeInfo, thus
> introducing a circular dependency.

Encoding no longer depends on TypeInfo, offending methods have been
promoted to EncodingUtil.

>>TypeInfo --- this class is intended to represent types, however the
>>representation is not unique, because:
>>
>> - typedefs are expanded lazily (thus many operations need
>>   an environment, to be able to look up typedef definitions),
>> - dereferences ('*') are applied lazily.
>
> yeah. Is this laziness really necessary ? Does it gain performance ?
> I'm just wondering because without it (i.e. putting 'Normalize' into
> the constructor), most methods could become 'const'...
> Or alternatively the inner representation (encoding) could become 'mutable',
> such that normalizing a TypeInfo keeps the invariants intact and thus can
> semantically be considered a no-op.

This is a profound question.

In theory this laziness is not necessary. However in practice it may be
very useful (for the users, not implementors). 'gcc' is very much praised
for expanding typedefs lazily, which results in much more readable error
messages (especially if one uses lots of templates). My suggestion is not
to remove it unless it is really really really troublesome to maintain it.

The solution with 'mutable' has a drawback: the const operations will
actually change the observable state of the object, because
'FullTypeName()' and 'MakePtree()' will not necessarily return the same
result after normalization.

A quick solution that keeps laziness, yet introduces constness (and
preserves correct behaviour of 'FullTypeName()' and co.) is e.g.:

  (1) Privatize mutating member function 'Foo()' and rename to,
      say, 'MutatingFoo()'

  (2) Introduce public

         bool TypeInfo::Foo() const
         {
             TypeInfo tmp(*this);
             return tmp.MutatingFoo();
         }

What do you think?

[...]
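A minimal, compilable sketch of the two-step proposal above, on a toy class
rather than the real TypeInfo. The class name Type, the query IsPointer() and
the 'C' / 'V' / 'P' tags are assumptions for illustration; only the
Foo() / MutatingFoo() pattern itself comes from the proposal.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for TypeInfo: the encoding is the only state.
class Type {
public:
    explicit Type(std::string encoding) : encoding_(encoding) {}

    // Step (2): public const query that runs the mutating code on a copy.
    bool IsPointer() const
    {
        Type tmp(*this);
        return tmp.MutatingIsPointer();
    }

    // Observable state; const queries must leave it untouched.
    const std::string& FullTypeName() const { return encoding_; }

private:
    // Step (1): the original mutating implementation, now private.
    // It strips leading cv tags ('C', 'V' assumed) before testing for 'P'.
    bool MutatingIsPointer()
    {
        while (!encoding_.empty() &&
               (encoding_[0] == 'C' || encoding_[0] == 'V'))
            encoding_.erase(0, 1);
        return !encoding_.empty() && encoding_[0] == 'P';
    }

    std::string encoding_;
};
```

The price is one copy per query. In exchange, FullTypeName() returns the same
un-normalized text before and after the call, which is exactly what the
'mutable' variant cannot guarantee.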
>>>* TypeInfo::Normalize()
>>
>>[I fail to understand this function clearly; my findings:]
>>- strips top-level cv-qualifiers
>>- if *this represents dereferenced function pointer changes *this
>>  so that it represents function return type,
>>- if *this represents dereferenced typedef, expands typedef and
>>  tries to proceed with dereferencing
>>
>>Name suggests that the actual represented type should not change, although
>>its representation may (e.g. typedefs are expanded if necessary). However
>>some details are mysterious (member function pointers for instance).

Another mysterious detail is why cv-qualifiers are stripped.

> This method is indeed the most puzzling: it only operates
> on temporary variables, so what does it *really* do ?
> It looks like the lazy evaluation of (de)reference and typedefs,
> but the result is never used, as the member variables aren't reset
> to the values of the local variables...?!?

It calls ResolveTypedef(), and this is a mutating member function.

>>>* TypeInfo::SkipCv()
>>
>>Strips all top-level cv-qualifiers from type represented by *this; if
>>after stripping *this represents a typedef, expands typedef and proceeds.
>
> good. (though here again: all the complexity stems from the fact that refs
> and typedefs are handled lazily. If they weren't, the implementation would
> become *much* simpler !)

Hm... Maybe we really should not bother with this laziness for now...
However we may have problems later, since e.g. in refactoring when using
some existing type to produce new declarations it is very desirable to
have it in least expanded form. Could you briefly show what exactly
would be substantially easier?

Best regards
Grzegorz

> _______________________________________________
> Opencxx-users mailing list
> Ope...@li...
> https://lists.sourceforge.net/lists/listinfo/opencxx-users
|
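The GetBaseName example from this message ("Q[2][1]X[3]Foo" yields "[3]Foo")
can be reproduced on a toy encoding. The byte layout is an assumption inferred
from that example: [n] is taken to be the single byte 0x80 + n, and 'Q'
introduces a qualified name followed by its component count. The helper names
encode_name, decode_length and base_name are hypothetical. Templates,
typedef lookup and the 'env' out-parameter are deliberately ignored, and the
result is returned as a copy rather than as a pointer into the input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encodes a simple name as a length byte (0x80 + length) plus characters.
inline std::string encode_name(const std::string& id)
{
    return std::string(1, static_cast<char>(0x80 + id.size())) + id;
}

// Decodes a length byte; returns 0 if `c` is not a valid length byte.
inline std::size_t decode_length(char c)
{
    unsigned char u = static_cast<unsigned char>(c);
    return u > 0x80 ? u - 0x80 : 0;
}

// Returns the encoding of the last component of a (possibly qualified)
// name, or an empty string on malformed input.
inline std::string base_name(const std::string& enc)
{
    std::size_t pos = 0, count = 1;
    if (!enc.empty() && enc[0] == 'Q') {
        if (enc.size() < 2) return std::string();
        count = decode_length(enc[1]);
        pos = 2;
    }
    if (count == 0) return std::string();
    for (std::size_t i = 0; i < count; ++i) {
        if (pos >= enc.size()) return std::string();
        std::size_t len = decode_length(enc[pos]);
        if (len == 0 || pos + 1 + len > enc.size()) return std::string();
        if (i + 1 == count) return enc.substr(pos, 1 + len);
        pos += 1 + len;
    }
    return std::string();
}
```

With these helpers, 'X::Foo' is built as "Q", the count byte for 2, then
encode_name("X") and encode_name("Foo"), and base_name() returns
encode_name("Foo").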
From: Stefan S. <se...@sy...> - 2004-08-16 01:12:37
|
Grzegorz Jakacki wrote:
> Hm... Maybe we really should not bother with this laziness for now...
> However we may have problems later, since e.g. in refactoring when using
> some existing type to produce new declarations it is very desirable to
> have it in least expanded form. Could you briefly show what exactly
> would be substantially easier?

well, right now it is for me still a matter of understanding, not
implementing. Your hint at intelligible error messages using the
un-normalized name is very valuable. It suggests that normalizing is not
just an internal optimization detail, but rather that it may be useful to
make this more explicit, i.e. make it possible for the user to access a
typeinfo both in its expanded and non-expanded form.

Hmm, but to understand the matter better I have to see encodings and
typeinfos in use. This may even be a good candidate for some unit tests:

* parse some declarations
* inspect (dump) the generated encodings
* regenerate new ptrees from these encodings
* compare these with the original declarations

That would be a very good demonstration / documentation as well as a good
test case.

Regards,
Stefan
|
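The four test steps listed above could take roughly this shape. Everything in
the sketch is a toy stand-in: parse() and render() are hypothetical
replacements for the OpenC++ parser and for regenerating a ptree, and the
grammar ('P' = pointer to, 'C' = const, 'i' = int, 'c' = char, with the
qualifier written after the type) is assumed for the example only.

```cpp
#include <cassert>
#include <string>

// True if `s` ends with `suffix`.
inline bool ends_with(const std::string& s, const std::string& suffix)
{
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Step 1 and 2: turn a declaration's type text into a toy encoding.
inline std::string parse(const std::string& text)
{
    if (ends_with(text, "*"))
        return "P" + parse(text.substr(0, text.size() - 1));
    if (ends_with(text, " const"))
        return "C" + parse(text.substr(0, text.size() - 6));
    if (text == "int")  return "i";
    if (text == "char") return "c";
    return "?"; // unknown to this toy grammar
}

// Step 3: regenerate type text from a toy encoding.
inline std::string render(const std::string& enc)
{
    if (enc.empty()) return "";
    switch (enc[0]) {
    case 'P': return render(enc.substr(1)) + "*";
    case 'C': return render(enc.substr(1)) + " const";
    case 'i': return "int";
    case 'c': return "char";
    default:  return "";
    }
}

// Step 4: compare the regenerated text with the original declaration.
inline bool round_trips(const std::string& decl)
{
    return render(parse(decl)) == decl;
}
```

A real test would dump the intermediate encoding as well, since a readable
dump doubles as documentation of the encoding format.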
From: Grzegorz J. <ja...@ac...> - 2004-08-16 15:01:04
|
Stefan Seefeld wrote:

[...]

> Hmm, but to understand the matter better I have to see encodings and
> typeinfos in use. This may even be a good candidate for some unit tests:
>
> * parse some declarations
> * inspect (dump) the generated encodings
> * regenerate new ptrees from these encodings
> * compare these with the original declarations
>
> That would be a very good demonstration / documentation as well as a good
> test case.

Definitely. I can try to tackle it, at least to build several model unit
tests (perhaps with QMTest), however I would like to get 2.8 out first. I
will try to get down to these unit tests within three weeks from now.

BR
Grzegorz
|
From: Stefan S. <se...@sy...> - 2004-08-17 04:47:45
|
Grzegorz Jakacki wrote:
> Encoding no longer depends on TypeInfo, offending methods have been
> promoted to EncodingUtil.

yeah, I noticed. But is that really a solution ? The Encoding class should
encapsulate the representation of declarator type/names, and moving half of
it out into 'EncodingUtil' breaks that encapsulation, i.e. even though
EncodingUtil is now a separate piece of code, it very much depends on the
intrinsics of the Encoding class. In fact even the TypeInfo class depends
to a large degree on Encoding, so I wonder whether we shouldn't try to
refactor these classes. Another issue is that of TypeInfo's expanded vs.
non-expanded form...

>>> TypeInfo --- this class is intended to represent types, however the
>>> representation is not unique, because:
>>>
>>> - typedefs are expanded lazily (thus many operations need
>>>   an environment, to be able to look up typedef definitions),
>>> - dereferences ('*') are applied lazily.
>>
>> yeah. Is this laziness really necessary ? Does it gain performance ?
>> I'm just wondering because without it (i.e. putting 'Normalize' into
>> the constructor), most methods could become 'const'...
>> Or alternatively the inner representation (encoding) could become
>> 'mutable', such that normalizing a TypeInfo keeps the invariants intact
>> and thus can semantically be considered a no-op.
>
> This is a profound question.
>
> In theory this laziness is not necessary. However in practice it may be
> very useful (for the users, not implementors). 'gcc' is very much
> praised for expanding typedefs lazily, which results in much more
> readable error messages (especially if one uses lots of templates). My
> suggestion is not to remove it unless it is really really really
> troublesome to maintain it.
>
> The solution with 'mutable' has a drawback: the const operations will
> actually change the observable state of the object, because
> 'FullTypeName()' and 'MakePtree()' will not necessarily return the same
> result after normalization.

Speaking of which, what does 'MakePtree' actually generate ? Or, to push
the question further, what do the two encodings of a declarator contain ?
What is a 'type encoding', and what is a 'name encoding' ? Why isn't one
enough ? A declarator either declares a type or an instance (let's count
a function as an instance for now), and in the first case I need the type
encoding, in the second the name encoding, no ? What am I missing ?

Thanks for any clarifications,
Stefan
|
From: Grzegorz J. <ja...@ac...> - 2004-08-17 23:44:52
|
Stefan Seefeld wrote:
> Grzegorz Jakacki wrote:
>
>> Encoding no longer depends on TypeInfo, offending methods have been
>> promoted to EncodingUtil.
>
> yeah, I noticed. But is that really a solution ?

Strictly speaking not. However the details of encoding were not
encapsulated even before EncodingUtil, since e.g. TypeInfo knew about it.

> The Encoding
> class should encapsulate the representation of declarator type/names,
> and moving half of it out into 'EncodingUtil' breaks that encapsulation,
> i.e. even though EncodingUtil is now a separate piece of code, it
> very much depends on the intrinsics of the Encoding class. In fact
> even the TypeInfo class depends to a large degree on Encoding, so I
> wonder whether we shouldn't try to refactor these classes.

Yes.

> Another
> issue is that of TypeInfo's expanded vs. non-expanded form...

What's unclear about it?

Best regards
Grzegorz
|