Re: [Opencxx-users] OpenC++ Core Lib -- prototype


Hi,

On Mon, 13 Sep 2004, Stefan Seefeld wrote:

> Grzegorz Jakacki wrote:
>
>  >> I'd generally much prefer to start with an assessment (i.e. documentation
>  >> in terms of actual text, unit tests, and examples) of the existing design.
>  >
>  >
>  > I think it is suboptimal in this case for two reasons. First, there is
>  > very little documentation of existing system, so almost all documentation
>  > would need to be reverse engineered (I already did some of it, as you may
>  > remember).
>
> Yes. But don't we need to reverse engineer if we want to apply changes in a
> controlled manner ?
>
> I can tell you how much head-aches I have these days trying to understand
> why a given method is implemented the way it is. The fact that the old code
> wasn't const-correct obviously doesn't help, i.e. I have to guess whether a
> given argument is going to be modified in-place or whether it's only
> an 'in' parameter.
>
> So reverse-engineering is what takes 90% of my time. Once I understand a
> given design it's quite easy to see whether it can be enhanced.

My take is that 100% understanding in these circumstances is not a
precondition for introducing changes. But this is MHO.

>  > Second, we cannot provide accurate unit tests post factum, because we are
>  > not sure what the code should really do.
>
> granted. Though they would serve as regression tests, i.e. we could measure
> what kind of changes we incur to the data handling while we work on the
> redesign.

I have been trying to take this path for the past 3 years, and for me it
did not work out with OpenC++. Writing reasonable regression tests is a lot
of hard work and is not very exciting for prospective developers. I took
that road at work, where I inherited an undocumented system without tests,
but with an open-source, volunteer-based project things look different.

Thus my proposal to focus on the core functionality and put "Greater
OpenC++" aside for some time. In particular I no longer think that it
makes sense to add regression tests to "Greater OpenC++" now.

> In fact, there are, as we discussed, a couple of cases that are known
> to fail. Those known failures should be accompanied by (annotated) unit
> tests, if only to document that we know about them.

Agreed, but I vote for mounting those unit tests as low as possible ---
not on top of the 'occ' interface, but e.g. on top of the parser library.

>  > I think that the way to go is to read and clean the code step by step,
>  > adding unit tests as the knowledge about the code grows.
>  >
>  >> Some months ago I proposed a rigorous separation between 'parser'
>  >> (including ptree API) and higher level classes (Environment, TypeInfo,
>  >> Class, etc.).
>  >
>  >
>  > In fact parser separation started several months before you joined in April.
>  > (http://sourceforge.net/mailarchive/message.php?msg_id=6971830)
>  >
>  >> Now that my understanding of occ internals is a little better, I would
>  >> revise my suggestions I then made.
>  >
>  >
>  > Go ahead.
>
> nothing dramatic :-) I simply concluded that the boundary between 'low' and
> 'high' level is a little blurry, or not where I originally expected it to be.
> That's mainly because I didn't realize at the time that what we called 'parsing'
> was only the first half, the second half including the use of 'ClassWalker' and co.
>
> As we now agree to redesign the parser such that it does the declarator
> recording (Environment...) itself instead of letting it be done by a Walker, it
> could mean that the Walker could be part of the high level.
> Right now it can't because parsing isn't complete (i.e. self-contained)
> without ClassWalker.

Wait a moment. I don't think it is *that* easy with C++. As I already
mentioned here several weeks ago, I don't see a way to perform type
elaboration in one pass. This is because

    The potential scope of a name declared in a class consists not only of
    the declarative region following the name's declarator, but also of all
    function bodies, default arguments, and constructor ctor-initializers in
    that class. [3.3.6p1]

which in particular makes this code legal:

    class A
    {
        void f() { g(); };
        void g() {};
    };

The more tricky example:

    class A
    {
        void f() { B < 1 > ( 2 ) ; }
        template <int n> void B(int);
    };

Here we simply cannot rely on a single pass. It seems to me that a
reasonable solution is for the parser to store the function bodies, default
arguments and ctor-initializers and to parse them after the class is parsed.
At that time the symbol table can meaningfully answer whether a given name
represents a template.
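
A minimal sketch of that deferral, only to make the idea concrete. It is a
toy model and not OpenC++ code: ClassParser, Member, DeferredBody, declare(),
finish() and parse_example() are all hypothetical names, and bodies are
pre-split into string tokens. Pass 1 records member names and stashes the
bodies; pass 2 runs at the closing brace and only then decides whether
"B <" opens a template argument list or is a less-than comparison.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model; none of these names come from OpenC++.
enum class NameKind { Function, Template };

struct Member {
    std::string name;
    NameKind kind;
    std::vector<std::string> body;  // raw tokens of an inline body, empty if none
};

struct DeferredBody {
    std::string owner;                // member the body belongs to
    std::vector<std::string> tokens;  // stored unparsed during pass 1
};

class ClassParser {
public:
    // Pass 1: record the declaration; stash the body instead of parsing it.
    void declare(const Member& m) {
        symbols_[m.name] = m.kind;
        if (!m.body.empty()) deferred_.push_back({m.name, m.body});
    }

    // True only if the name has already been declared as a template.
    bool names_template(const std::string& name) const {
        auto it = symbols_.find(name);
        return it != symbols_.end() && it->second == NameKind::Template;
    }

    // Pass 2: called at the closing brace, when all members are known.
    // Classifies every "<name> <" pair found in the stored bodies.
    std::vector<std::string> finish() const {
        std::vector<std::string> log;
        for (const auto& d : deferred_) {
            for (std::size_t i = 0; i + 1 < d.tokens.size(); ++i) {
                if (d.tokens[i + 1] != "<") continue;
                const char* verdict =
                    names_template(d.tokens[i]) ? "template-id" : "less-than";
                log.push_back(d.owner + ": '" + d.tokens[i] + " <' is " + verdict);
            }
        }
        return log;
    }

private:
    std::map<std::string, NameKind> symbols_;
    std::vector<DeferredBody> deferred_;
};

// Models:  class A { void f() { B < 1 > ( 2 ) ; }  template <int n> void B(int); };
std::vector<std::string> parse_example() {
    ClassParser p;
    p.declare({"f", NameKind::Function, {"B", "<", "1", ">", "(", "2", ")", ";"}});
    p.declare({"B", NameKind::Template, {}});
    return p.finish();
}
```

In this model, asking names_template("B") right after declaring f (that is,
at the point where a one-pass parser would meet the body of f) still answers
false, so the body would be misread as the comparison (B < 1) > (2). After
the whole class has been seen, finish() classifies the same tokens as a
template-id.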

>  >> and so I believe in order to be able
>  >> to make 'educated suggestions' we should start by discussing the existing
>  >> design.
>  >> In particular, how does parsing work in detail ?
>  >
>  >
>  > More detailed question solicited.
>
> details follow:
>
>  >> What is the Walker's role ?
>  >
>  >
>  > I believe that:
>  >
>  > Walker -- does pretty much nothing, just traverses the subtrees. It is a
>  > "skeletal implementation".
>  >
>  > ClassWalker --- traverses the AST and creates metaobjects (objects
>  > representing semantic entities of C++, e.g. objects of classes Class, Member
>  > etc.)
>
> not quite. That's precisely one of the spots I have problems with: the
> ClassWalker plays a role in the low level (parsing) as well as in the
> high level (mop).
> It records declarations to the Environments, and these are then necessary to
> complete the parsing, no ?

I don't think OpenC++ does it in such a manner now. AFAIK the parser does
not use type information to resolve any syntax ambiguities (i.e. at template
instantiations).

> Remember when we discussed ambiguous C++ statements, i.e.
> statements that are ambiguous as long as you can't check whether a given
> symbol refers to a type or not...
>
>  > ClassBodyWalker --- traverses the AST of classes and also creates
>  > metaobjects (perhaps difference between ClassWalker and ClassBodyWalker
>  > stems from the fact that in C++ binding in class scopes works differently
>  > than in other scopes).
>  >
>  >> How are Encoding and TypeInfo related ?
>  >
>  >
>  > It seems to me that Encoding is always an "expanded" type, while TypeInfo
>  > is "lazy" type, e.g. it can buffer referencing/dereferencing
>  > and also it can keep typedefs unexpanded (that's why it needs to have a
>  > link to Environment).
>  >
>  > There is also another point to it: if I am not mistaken, Encoding can
>  > encode not only a type, but also a scoped identifier.
>
> yes, it encodes names (function names, in particular, which include the
> signature, i.e. parameter types !), as well as types.
> I think that your separation of Encoding and EncodingUtil is a good start.

It is the original design; I have not touched it.

> 'Encoding' is a 'passive' data holder (similar to std::string). In this role
> I placed it into PTree nodes, to make them more type safe.

Personally I don't like it, because it forces the client's data structure
on the parser library.

> All the active parts (name lookup, code generation, etc.) should be kept
> outside. By the way, why isn't the EncodingUtil part of Environment or TypeInfo ?
> Wouldn't that make things slightly simpler ?

Putting functions from EncodingUtil into Encoding makes the parser library
depend on Environment, and Walker, and Bind, and TypeInfo, and Class,
so all the parser isolation collapses.

Putting functions from EncodingUtil into Environment makes sense only if
they need private access to Environment. They don't.

If you feel that 4 functions in the EncodingUtil namespace do not warrant
its existence, then I would suggest putting them in Environment.h (but not
as member functions), according to Sutter's "interface principle".
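
To illustrate that layout, here is a rough sketch under stated assumptions.
The classes below are simplified stand-ins and not the real OpenC++
declarations: the members of Encoding and Environment, as well as
record_typedef(), lookup_typedef() and expand_typedef(), are hypothetical.
The point is only that the helper is a non-member, non-friend function
shipped alongside Environment, while Encoding stays a passive holder that
knows nothing about Environment.

```cpp
#include <map>
#include <string>
#include <utility>

// Parser-side type: a passive data holder with no knowledge of Environment.
class Encoding {
public:
    explicit Encoding(std::string s) : text_(std::move(s)) {}
    const std::string& str() const { return text_; }
private:
    std::string text_;
};

// Client-side type (imagine this part living in Environment.h).
class Environment {
public:
    void record_typedef(const std::string& name, const std::string& expanded) {
        typedefs_[name] = expanded;
    }
    // Public lookup; returns nullptr when the name is not a known typedef.
    const std::string* lookup_typedef(const std::string& name) const {
        auto it = typedefs_.find(name);
        return it == typedefs_.end() ? nullptr : &it->second;
    }
private:
    std::map<std::string, std::string> typedefs_;
};

// Non-member, non-friend helper declared next to Environment. Under the
// interface principle it is logically part of Environment's interface,
// yet it needs only public members, and Encoding does not depend on it.
inline Encoding expand_typedef(const Encoding& e, const Environment& env) {
    const std::string* expanded = env.lookup_typedef(e.str());
    return expanded ? Encoding(*expanded) : e;
}
```

With this arrangement the dependency points one way only: the header holding
Environment and the helper includes Encoding, never the reverse, so the
parser library can still be built without Environment.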

>  >> PS: I consider the redesign I'm doing in Synopsis as an iterative
>  >>     process.
>  >>     I work my way up starting from the lowest layers, and while I get a
>  >>     better understanding on the individual classes I'm able to improve their
>  >>     design.
>  >
>  >
>  > How does it relate to OpenC++? What would you suggest for OpenC++?
>
> oh, the process is simply the only way for me to understand OpenC++. The
> design itself is what I would suggest OpenC++ to use for its core. But as I said,
> it's an iterative process, so I'm far from done with it, so I expect more structural
> changes to the repository (thank svn this is easy ;-)

I think that your contributions to the original OpenC++ that currently
constitute the C++ submodule of Synopsis are very valuable and I would be
glad to leverage them in OpenC++ Core. As I understand it, the only changes
that were made under LGPL to OpenC++ 2.5.12 under Synopsis are yours, right?
If so, would you consider donating them to OpenC++ under the OpenC++ license?

BR
Grzegorz

##################################################################
# Grzegorz Jakacki                       Huada Electronic Design #
# Senior Engineer, CAD Dept.              1 Gaojiayuan, Chaoyang #
# tel. +86-10-64365577 x2074               Beijing 100015, China #
# Copyright (C) 2004 Grzegorz Jakacki, HED. All Rights Reserved. #
##################################################################