From: Stefan S. <se...@sy...> - 2004-09-14 00:12:05
Grzegorz Jakacki wrote:

>> I'd generally much prefer to start with an assessment (i.e. documentation
>> in terms of actual text, unit tests, and examples) of the existing design.
>
> I think it is suboptimal in this case for two reasons. First, there is
> very little documentation of the existing system, so almost all documentation
> would need to be reverse engineered (I already did some of it, as you may
> remember).

Yes. But don't we need to reverse engineer if we want to apply changes in a
controlled manner ? I can tell you how many headaches I have these days trying
to understand why a given method is implemented the way it is. The fact that
the old code wasn't const-correct obviously doesn't help, i.e. I have to guess
whether a given argument is going to be modified in place or whether it's only
an 'in' parameter. So reverse engineering is what takes 90% of my time. Once I
understand a given design it's quite easy to see whether it can be enhanced.

> Second, we cannot provide accurate unit tests post factum, because we are
> not sure what the code should really do.

Granted. Though they would serve as regression tests, i.e. we could measure
what kind of changes we incur in the data handling while we work on the
redesign. In fact, there are, as we discussed, a couple of cases that are
known to fail. Those known failures should be accompanied by (annotated) unit
tests, if only to document that we know about them.

> I think that the way to go is to read and clean the code step by step,
> adding unit tests as the knowledge about the code grows.
>
>> Some months ago I proposed a rigorous separation between 'parser'
>> (including ptree API) and higher level classes (Environment, TypeInfo,
>> Class, etc.).
>
> In fact parser separation started several months before you joined in April.
> (http://sourceforge.net/mailarchive/message.php?msg_id=6971830)
>
>> Now that my understanding of occ internals is a little better, I would
>> revise the suggestions I then made.
> Go ahead.

Nothing dramatic :-) I simply concluded that the boundary between 'low' and
'high' level is a little blurry, or not where I originally expected it to be.
That's mainly because I didn't realize at the time that what we called
'parsing' was only the first half, the second half including the use of
'ClassWalker' and co. As we now agree to redesign the parser such that it does
the declarator recording (Environment...) itself instead of letting it be done
by a Walker, it could mean that the Walker could be part of the high level.
Right now it can't, because parsing isn't complete (i.e. self-contained)
without ClassWalker.

>> and so I believe in order to be able
>> to make 'educated suggestions' we should start by discussing the existing
>> design.
>> In particular, how does parsing work in detail ?
>
> More detailed question solicited.

Details follow:

>> What is the Walker's role ?
>
> I believe that:
>
> Walker --- does pretty much nothing, just traverses the subtrees. It is a
> "skeletal implementation".
>
> ClassWalker --- traverses the AST and creates metaobjects (objects
> representing semantic entities of C++, e.g. objects of classes Class, Member
> etc.)

Not quite. That's precisely one of the spots I have problems with: the
ClassWalker plays a role in the low level (parsing) as well as in the high
level (mop). It records declarations to the Environments, and these are then
necessary to complete the parsing, no ? Remember when we discussed ambiguous
C++ statements, i.e. statements that are ambiguous as long as you can't check
whether a given symbol refers to a type or not...

> ClassBodyWalker --- traverses the AST of classes and also creates
> metaobjects (perhaps the difference between ClassWalker and ClassBodyWalker
> stems from the fact that in C++, binding in class scopes works differently
> than in other scopes).
>
>> How are Encoding and TypeInfo related ?
> It seems to me that Encoding is always an "expanded" type, while TypeInfo
> is a "lazy" type, e.g. it can buffer referencing/dereferencing
> and also it can keep typedefs unexpanded (that's why it needs to have a
> link to Environment).
>
> There is also another point to it: if I am not mistaken, Encoding can encode
> not only a type, but also a scoped identifier.

Yes, it encodes names (function names, in particular, which include the
signature, i.e. parameter types !), as well as types. I think that your
separation of Encoding and EncodingUtil is a good start. 'Encoding' is a
'passive' data holder (similar to std::string). In this role I placed it into
PTree nodes, to make them more type-safe. All the active parts (name lookup,
code generation, etc.) should be kept outside. By the way, why isn't the
EncodingUtil part of Environment or TypeInfo ? Wouldn't that make things
slightly simpler ?

>> PS: I consider the redesign I'm doing in Synopsis as an iterative
>> process.
>> I work my way up starting from the lowest layers, and while I get a
>> better understanding of the individual classes I'm able to improve their
>> design.
>
> How does it relate to OpenC++? What would you suggest for OpenC++?

Oh, the process is simply the only way for me to understand OpenC++. The
design itself is what I would suggest OpenC++ use for its core. But as I said,
it's an iterative process, so I'm far from done with it, and I expect more
structural changes to the repository (thank svn this is easy ;-)

Regards,
Stefan