[Opencxx-users] Re: [opencxx] Let's break OpenC++ up

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Hi *.*,

On Wed, Mar 19, 2003 at 01:38:26PM +0800, Grzegorz Jakacki wrote:
> On Tue, 18 Mar 2003, Stefan Reuther wrote:
> > I don't think it can be done much different. Ptree and Walker
> > have to "know each other".
> 
> I believe that it is possible to introduce a layer in between,
> which would encapsualte the actual implementation of Ptree.

Another possibility would be to put that knowledge into the
Walker interface. Here, we have such a thing for
PtreeDeclaration. Depending upon whether a PtreeDeclaration is a
class declaration ("class foo;"), a name declaration ("int x;")
or a function implementation ("void x() { }"), it nicely
dissects the tree node and calls different functions. "Class"
and "ClassWalker" already seem to do some other parts of this.

Backwards-compatibility is not too hard with that (it can be
implemented atop Walker). Forward-compatibility works, too: when
the tree is changed, we add a new function which dissects the
new node, and translates it into a call to old functions.

I'm unsure which ("mine" or "yours") I like better.

> > Another nice thing to have would be a Walker returning other
> > things than Ptrees. I'm still unsure how that could work.
> 
> In fact Walker is implementation of pattern called Visitor.
> I am not sure if I have already mentioner "Design Patterns"
> by Gamma, Johnson, Ralph & Vlissides.

Damn, I wanted to read that for ages, but didn't get around
fetching it from the library :-) But it is more the
implementation than the understanding which worries me.

> Walker (or more generally visitor) can return arbitrary thing. I assume
> that what you have in mind is that it returns nodes of another tree
> (or rather handles/pointers to such nodes), thus implementing some
> homomoprhic tree transformation. It that it?

I even want to return complex things. For example, when
"compiling" an expression, I want to return
  class Expr_result {
      Ptree*            tree;     // result tree
      Type              type;     // result type
      Function_symbol*  symbol;   // function symbol, if result is an ambiguous function name
      Kind              kind;     // whether it's l- or r-value,
                                  // function, bound member function, ...
      Ptree*            obj_tree; // associated object for a member function call
      // methods omitted
  };

Michael Hohmuth solved that by making a template class
Ptree_visitor<T> which inherits from Walker. Instead of using
the Ptree* return value somehow, it saves the return value:
    Ptree* TranslateClassSpec(Ptree* p) {
       _saved = visit_classspec(dynamic_cast<PtreeClassSpec*>(p));
       return 0;
    }
and returns it later, approximately like this:
    T visit(Ptree* p) {
       Walker::Translate(p);
       return _saved;
    }

In full epic detail: <http://os.inf.tu-dresden.de/~sr21/c++/> That's
all my code. It's still incomplete. Michael's visitor is
<http://os.inf.tu-dresden.de/~sr21/cgi-bin/zipserv/c++/c++annotator-0.1.zip/c++annotator-0.1/comp/ptree_visitor.cpp>

> > For example, the recent addition of the "typeid" operator
> > caused our code to silently translate
> >    typeinfo ti = typeid(p);
> > as
> >    typeinfo ti = p;
> 
> Why was that (I could go figure out myself, but probably you still
> remember)?

Outline: a Walker to translate expressions, with the semantics:
to translate an expression, call "Translate()" and it will
deposit code to do that expression somewhere.

So "- x" is translated as "translate x, then negate". Because the
default action for "typeid" is to recurse (instead of generating
code), only the expression "p" is translated, and we have no way
of knowing that.

> > > > 2) The bigger problem is with the content of the nodes.
> > > > [...]
> > > "ifthenelse->Car()->Car()->Cdr()", which potentially can coredump
> > > at each arrow, one would conveniently write "ifthenelse->GetThen()".
> >
> > Sir, where do I sign? :-)
> 
> You mean NDA ?

Sign your petition for a better interface :-) Got tired of just
saying "Yes".

> > Still, in the *current* code, TranslateIfStatement takes a
> > Ptree* and not a PtreeIfStatement*, although the latter would be
> > a safe assumption for most cases.
> 
> There is a little bit of confusion, since lispy data structure is mixed up
> with 'PtreeIfStatement'-like classes, which on the other hand do not
> provide methods like "GetThen()". Why cannot we change
> TranslateIfStatement so that it takes "Ptree*" argument?

Currently, TranslateIfStatement takes a Ptree* argument. As far
as I can tell, it is only called with PtreeIfStatement* nodes,
so I would opt for changing that. Type safety, nothing more.

It can hurt on some pathological cases, but I don't think that's
really a problem: I had one place or two where I wanted to put a
simple Ptree* into TranslateName (which should, following my
logics, take a PtreeName*), but that's easily solvable by adding
a private third method.

> > One problem we'd still have to solve is that people can still
> > construct invalid trees. For example, if I have a Ptree
> > representing "if (cond) statement;", someone could Snoc a "&" at
> > the end, although that'd yield invalid syntax, and nothing could
> > prevent it (although adding an "else" and another statement
> > still must be valid). That's what I meant with "Lispy" data
> > structure: in Lisp, you have structured lists, too, but no-one
> > can prevent you from mangling them somehow.
> 
> This cannot be avoided if there is lispy interface to create the tree
> nodes available to the client. I think we should just assert or throw once
> such a situation (invalid tree) is identified.

Alternatively, allow to construct valid trees only. Things like
  class PtreeIfStatement : public PtreeStatement {
      PtreeIfStatement(Ptree* keyword, Ptree* lparen,
                       PtreeExprOrDecl* decl, Ptree* rparen,
                       PtreeStatement* statement);
      void AddElseClause(Ptree* keyword, PtreeStatement* statement);
  }
and deprecating (or "protected"ing) "Snoc", "Nconc" and friends.

I think having the "lispy" Ptree representation is nice for
traversal, output, GetLineNumber, etc, but allowing everyone to 
manipulate them might not be good.

> > I think adding those accessor functions ("GetCondition",
> > "GetThen", "GetIf") would be a good start, although it doesn't
> > work nicely for everything: for example, PtreeDeleteExpr can
> > have the following forms:
[...]
> Accessor functions can be refactored in the future, if the tree
> implementation is changed/normalized.
> 
> (BTW: I do not think that accessors should be free functions; see the
> write-up).

I didn't mean that, sorry. Accessors would be members of
PtreeDeleteExpr.

> > Another nice thing to have would be a "better" class hierarchy.
> > Improvements I see so far are:
> > - all PtreeFooExpr should share a common base class,
> >   PtreeExpression
> 
> What would this class contain? (Not to discourage you, I am just looking
> for arguments).

It would just be a base class for type safety. When I write
something which deals with expressions, I want to specify that
in the interface:
   void DoSomethingWith(PtreeExpression* p) { ... }
If I take a raw "Ptree", people can give me a PtreeIfStatement
or something, and I'll rightfully choke on it.

It's not a pressing problem, though. AbstractWalker would help
more for the instant.

> Where would this help? In function signatures, I guess, but at the moment
> I cannot see where exactly.

I guess we mean the same thing.

> > - all PtreeFooStatement should share a common base class,
> >   PtreeStatement. Hmmm, this seems to collide with PtreeBrace
> >   and PtreeDeclaration, so maybe leave it out.
> 
> Could you elaborate on where is the problem exactly?

Right now, PtreeBlock derives from PtreeBrace, which is IMHO a
sensible relation.

Since PtreeBlock can be a statement, it would have to derive
from PtreeStatement somehow. So we'd have to either cut the
relation between PtreeBlock and PtreeBrace, or make PtreeBrace
derive from PtreeStatement which is not good.

> > > > [1] Since the CVS code currently can't be built, I'll wait with
> > > > checking it in.
> > >
> > > Ok.
> >
> > It's in now.
> 
> Great. However could I ask you to add the testcases to verify this
> functionality? There are instructions somewhere under ./testsuite, perhaps
> those tests should go into ./testsuite/comp.

I'm working on it. I'll probably have to learn a bit more about
the meta-object protocol.

Essentially I want to call a Walker on some code. Any hints how
to do that most easily? Put it into a class, and override
TranslateFunctionImplementation, or is there a simpler way?

> > > But this requires adding *some* support of new nodes to the translator
> > > anyway, right?
> >
> > Yes, but I can't seem to find a way around that, because there
> > is no universal fallback. Some want just to recurse into the new
> > nodes, some do not want to miss them.
> 
> I am thinking about adding the code into the "view" layer (between tree
> and walker), what would express the new node in terms of old nodes
> (or at least abort).

Agreed.

> As promised, this is a write-up of thoughts I had wrt. improvind the
> design of OpenC++ in evolutionary way.
> 
> As I already wrote, there is too much coupling between Walker and Ptree
> hierarchy (this is usual problem with Visitor pattern).
> 
> I was thinking about providing more isolation by introducing a level of
> indirection. In particular I would like Ptree* to be replaced by some
> proxy type (a.k.a. "smart pointer", "handler" or "wrapper").

Short question: why? I would just put the accessor methods:
> template <>
> class PtreePtr<PtreeIfStmt>
> {
>     ...
>     PtreePtr<Ptree> GetThen()
>     {
>         return PtreePtr<Ptree>(ptr_->Cdr()->Car()->Cdr());
>     }
> };
into the respective Ptree nodes. It's all the same to me whether
I write p->GetThen() or p.GetThen().

Maybe there are subtle implications for forward- and backward
compatibility, but I'm currently unable to see them.

Anyway, if someone wants to add a new form of If, he can still
make PtreeIfStatement abstract (maybe with virtual GetThen())
and derive a PtreeNewIfStatement and PtreeOldIfStatement. Or am
I missing something here?

> The next step would be to get rid of 'new Ptree...' operators in the code.
> It should be gradually replaced by something like 'create<Ptree...>()',
> which is supposed to give back "something that behaves like Ptree",

Okay. Maybe constructors already suffice? But right, if the
returned type changes (PtreeIfStatement ->
PtreeNewFunkyIfStatement), constructors are problematic.

> As for visitors (aka. walkers): currently the "view" of the tree that
> visitor "sees" is fixed by the tree implementation. However, not every
> visitor wants to see the same --- e.g. sometimes you want to see comments
> or whitespace as nodes, sometimes not. It is especially true when you have
> deveopled several visitors and you add some new functionality to the
> parser (e.g. "explicit" keyword), so you need to add syntax node, but you
> would like to keep your old visitors running (and e.g. ignore the new
> node). Another example are parenthesis --- if you have visitor that
> simplifies arithmetic expressions, you most likely are not interested in
> it seeing the nodes representing parenthesis.
> 
> This can be taken care of by the use of tag classes. One can define a set
> of tag classes (which represent a "view" for visitor) and trait class that
> maps node implementation type to a tag (in a given view), e.g.:
> 
>     class BinaryNode {};
>     ...
>     class BasicViev;
> 
>     template <class View, class Node> class tree_traits;
> 
>     template<> struct tree_traits<BasicView,PtreePtr<PtreeAssign> >
>     {
>         typedef BinaryNode type;
>     }
>     ...
> 
>     tree_traits<BasicView,Node>::tag    <--- what 'Node' means in 'BasicView'
>                 ^^^^^^^^^ ^^^^
>              view          implementation type
> 
> To make the mechanism complete, one needs also a "dispatcher" or
> "destructor" --- function or functor, that for given view, visitor and node
> determines the node "nature" --- e.g. given Ptree* determines if it is
> "if statment", "assignment" etc., casts the node representation to
> appropriate type (or wraps it in appropriate proxy) and calls the visitor
> on it.

That's what Walker does, right? I think we can just take Walker
and derive a number of views from it. If someone wants to skip
over whitespace, their view would override TranslateWhitespace
to do nothing.

Maybe I'm too inexperienced to see the great advantages. I
usually aim to minimize work, and "your" approach sounds like a
lot of work :)

  Stefan