From: Grzegorz J. <ja...@he...> - 2004-06-09 09:08:03
|
On Tue, 8 Jun 2004, Stefan Seefeld wrote: > Grzegorz Jakacki wrote: > > >>>I did not mean what you understood. In fact I want Node<Definition> to be > >>>"abstract". The second part of example was a miss, let me fix it: > >>> > >>> Node<Definition> d = ParseDefinition("int main() {}"); > >>> SomeVisitor v; > >>> v(d); //<--- calls v.Visit(Node<FunctionDefinition>) > >> > >>Ah, so 'ParseDefinition()' is an 'abstract factory' ? > > > > > > Exactly. > > > > (Observe, that rParseDefition() in current code is an Abstract > > Factory for Ptree hierarchy.) > > Indeed. But Then you can't simply assign to a Node<Definition> without > a downcast. That was what I stumbled over :-) > > >>That means > >>that all those 'Node<>' classes derive from an abstract 'NodeBase' class, > > > > > > Nope. Read "Node<>", think "smart pointer". > > ok. But that's a different issue. Whether we use smart pointers or not, > somewhere we need a class hierarchy with a type system that covers *all* > the C++ grammar. Hm, looks like I did not convey what I ment to. Perhaps a piece of code will speak more clearly than myself (attached at the end of e-mail). > > > > Node<> is intended to be a smart pointer with shallow copy. > > > > Node<FunctionDefinition> will have default conversion to Node<Definition>. > > > > Visitor for Node<> hierarchy will dispatch Node<Definition> to > > Visit(Node<FunctionDefition>), Visit(Node<ClassDefinition>) etc. > > that's possible but sounds a bit complicated, since even though > 'FunctionDefinition' IsA 'Definition', but 'Node<FunctionDefinition>' > is *not* a 'Node<Definition>'. You restrict yourself to modelling IsA by derived-to-base conversion. > I.e. in the former case the compiler > and the C++ type system would do the proper dispatching for us, but > when using the indirection over Node<> smart pointers we'd have to > do that manually. Excatly. See the code. > That's why I'd like to keep the discussion to the AST type hierarchy > separate from the issue of whether or not to use Node<> smart pointers. But Node<> pointers are meant to "simulate" AST hierarchy. > >>at which point I'm wondering what the advantage of such a templated > >>class hierarchy is as opposed to a simple traditional one > >>(i.e. instead of 'Node<Foo>' just using 'Foo', as 'Foo' would need to > >>be defined anyways) > > > > > > Foo does not need to be defined, declaration alone is sufficient. But we > > can also reuse PtreeXxxx classes for Foo (that would make sense that in > > general Node<PtreeIfStatement> is a wrapper for PtreeIfStatement*). > > When is a declaration sufficient ? Pls. let me know if the code makes it clear to you. > > The advantage of using Node<Foo> over using Foo* is that > > Node<> wrappers do not expose Car/Cdr. > > yeah, I see your point. Well, what about deriving privately from Ptree > and then raising the const methods into the public API via 'using' directives ? Hm, I don't think I meant it. > That has the additional benefit of preserving the type system for us to play with. Not sure if i get you. > > > Moreover, in some cases determining the nature of a Ptree (e.g. if it is > > if-then, or if-then-else) requires some code (e.g. go to > > Car()->Car()->Car()->Cdr() and check if it is NULL). Ptree hierarchy has > > only PtreeIfStatement, which covers both if-then and if-then-else. Having > > Node<> wrapper we can have Node<IfThen> and Node<IfThenElse>, both > > wrapping PtreeIfStatement*, but carrying additionial information in the > > wrapper type. > > hmm, the 'ctool' backend I was talking about earlier has an 'IfStatement' > that looks about so: > > struct IfStatement : Statement > { > Expression *condition; > Statement *then_block; > Statement *else_block; > }; > > where the 'else_block' can be empty. I like the simplicity of this, even > though I'd encapsulate it a little more (especially if this is just a > ptree wrapper). That's OK, it just a matter of how you want the wrappers to behave. > > Yet another argument is that this kind of API is trurly and *interface* to > > the tree data. Node<> scheme allows to have many interfaces. In particular, > > recall my example of how people want "+" node to be exposed in API --- some > > want to see "+" as a binary operator, others as a multiary one. Assuming > > that Node<> shows binary plus, client can write/generate another API, that > > will expose "multiary" plus. Clients using different APIs can still exchange > > the underlying Ptree datastructure, they just see different views. > > yeah, I agree in general that delegation is often better than derivation. > My real issue here is, as I said, the lost of the type system. Could you > demonstrate how a Node<> based visitor would be implemented (i.e. how it > would resolve the correct type) ? As that's the central point I believe, > I'll wait before I continue arguing about the other points until my > understanding of this Node<> visitor as you see it is more complete. Voila: ================================================================== #include <string> using std::string; // ----- low-level tree implementation (a la existing Ptree) ----- struct Ptree { virtual ~Ptree() {} }; struct Leaf : public Ptree { Leaf(const string& txt) : txt_(txt) {} string txt_; }; struct NonLeaf : public Ptree { NonLeaf(Ptree* l, Ptree* r) : l_(l), r_(r) {} Ptree* l_; Ptree* r_; }; struct PtreeBinaryOp : public NonLeaf { PtreeBinaryOp(Ptree* l, Ptree* r) : NonLeaf(l, r) {} }; struct PtreeVar : public Leaf { PtreeVar(const string& id) : Leaf(id) {} }; /* The above AST is capable of representing the sentences derived from the following grammar (The Grammar): (plus) E -> E "+" E (uminus) E -> "-" E (var) E -> id using the following mappings: T( E1 "+" E2 ) = PtreeBinaryOp / \ T( E1 ) NonLeaf / \ Leaf T( E2 ) "+" T( "-" E ) = NonLeaf / \ Leaf T( E ) "-" T( id ) = PtreeVar "id" This mapping is quite irregular. My point is to demonstrate how to deal with similar irregularities in existing Ptree. */ // ------ high-level API ------ /* High level API defined below presents The Grammar in a regular, canonical way. */ // high-level API declares names of nonterminals... struct Expr; // ...and productions. struct Plus; struct UMinus; struct Var; // Moreover, it adds extra "error" production, to represent // Ptree subtrees that do not encode any legal tree. struct Invalid; // First part of API is a set of wrapper classes // (think about them as non-owning smart pointers // or handles) // // It is not difficult to automate creation of this code // from the grammar description. template <class T> class Node; class AbstractNodeVisitor; template <> class Node<Expr> { public: Node() : p_(0) {} private: Node(Ptree* p) : p_(p) {} template <class U> friend class Node; friend void Dispatch(AbstractNodeVisitor&, Node<Expr>); Ptree* p_; }; template <> class Node<Plus> { public: static Node New(Node<Expr> l, Node<Expr> r) { return new PtreeBinaryOp(l.p_, new NonLeaf(new Leaf("+"), r.p_)); } Node() : p_(0) {} Node<Expr> GetLeft() const { return Node<Expr>(p_->l_); } void SetLeft(Node<Expr> e) const { p_->l_ = e.p_; } Node<Expr> GetRight() const { return Node<Expr>(static_cast<NonLeaf*>(p_->r_)->r_); } void SetRight(Node<Expr> e) const { static_cast<NonLeaf*>(p_->r_)->r_ = e.p_; } operator Node<Expr>() const { return p_; } private: Node(PtreeBinaryOp* p) : p_(p) {} template <class U> friend class Node; friend void Dispatch(AbstractNodeVisitor&, Node<Expr>); PtreeBinaryOp* p_; }; template <> class Node<UMinus> { public: static Node New(Node<Expr> c) { return new NonLeaf(new Leaf("-"), c.p_); } Node() : p_(0) {} Node<Expr> GetChild() const { return Node<Expr>(p_->r_); } void SetChild(Node<Expr> e) const { p_->r_ = e.p_; } operator Node<Expr>() const { return p_; } private: Node(NonLeaf* p) : p_(p) {} template <class U> friend class Node; friend void Dispatch(AbstractNodeVisitor&, Node<Expr>); NonLeaf* p_; }; template <> class Node<Var> { public: static Node<Var> New(const string& id) { return new PtreeVar(id); } Node() : p_(0) {} string GetId() const { return p_->txt_; } void SetId(const string& id) const { p_->txt_ = id; } operator Node<Expr>() const { return p_; } private: Node(PtreeVar* p) : p_(p) {} template <class U> friend class Node; friend void Dispatch(AbstractNodeVisitor&, Node<Expr>); PtreeVar* p_; }; template <> class Node<Invalid> { public: Node() : p_(0) {} operator Node<Expr>() const { return p_; } private: Node(Ptree* p) : p_(p) {} template <class U> friend class Node; friend void Dispatch(AbstractNodeVisitor&, Node<Expr>); Ptree* p_; }; // Third part of high-level API is an abstract visitor for Node<>'s class AbstractNodeVisitor { public: virtual void Visit(Node<Plus>) = 0; virtual void Visit(Node<Var>) = 0; virtual void Visit(Node<UMinus>) = 0; virtual void Visit(Node<Invalid>) = 0; }; // Fourth part of high-level API is a dispatcher -- // a mechanism that lets you find out more detailed // type information about node (e.g. you pass // Node<Expr>, and dispatcher calls, say, // v.Visit(Node<Plus>), if your Node<Expr> was indeed // a plus expression). // // There are many ways to implement Dispatch(). Usually // it will use some technique to find out // actual (dynamic) type of Ptree* wrapped in Node<>'s, // plus a little bit of peeking into the tree, if // dynamic type information itself is not enough // to find out what high-level node should be reported. // // Note1: This implementation uses dynamic_cast in Ptree* // hierarchy to learn about dynamic type; other implementations // are possible, e.g. type tags ('p_->What()'), type querries // ('p_->IsA(...)') or visitation in Ptree hierarchy. // // Note2: If the syntax presented by high-level API matches closely // the ihneritance hierarchy in Ptree, then it is just enough // to find out the dynamic type of Ptree* to be able to build // Node<> wrapper; in particular, no further "peeking" is necessary // (in fact some peeking may be administered to check if the // Ptree tree has the correct topolofy, so that later, say, // Node<>::GetRight() does not hit a NULL pointer somewhere // along Cdr/Car path). // // Note3: I think that for many AST mappings a workable implementation // of Dispatch can be generated automatically (although there may // be a quagmire somewhere here; e.g. finding if an AST mapping is // ambiguous seems undecidable in general). // // Note4: Dispatch() is effectively a "tree parser". void Dispatch(AbstractNodeVisitor& v, Node<Expr> e) { if (PtreeBinaryOp* b = dynamic_cast<PtreeBinaryOp*>(e.p_)) { if (NonLeaf* br = dynamic_cast<NonLeaf*>(b->r_)) { if (Leaf* brl = dynamic_cast<Leaf*>(br->l_)) { if (brl->txt_ == "+") { if (b->l_ && br->r_) { return v.Visit(Node<Plus>(b)); } } } } } if (NonLeaf* b = dynamic_cast<NonLeaf*>(e.p_)) { if (Leaf* bl = dynamic_cast<Leaf*>(b->l_)) { if (bl->txt_ == "-") { if (b->r_) { return v.Visit(Node<UMinus>(b)); } } } } if (PtreeVar* b = dynamic_cast<PtreeVar*>(e.p_)) { return v.Visit(Node<Var>(b)); } v.Visit(Node<Invalid>(e.p_)); } // // General observations: // // * There is no inheritance relation whatsoever between // instantiations of Node<>. // // * Node<> behaves like polymorphic pointer. // // * Internally Node<> can use raw, smart or gc-managed pointers. // // * Node<T>::New is analogon of "new T" // // * Node<T>() is analog of default pointer constructor; // Note<T>() creates "null pointer" // // * Node<Expr> is "abstract" -- it does have New() member function, // you can either create null Node<Expr> or assign/initialize it // from another, non-abstract Node<T> // // * There is a user-defined conversion from "concrete" Node<>s to // "abstract" Node<Expr> (this is analogon of derived-to-base conversion) // // * Dispatch() is an analogon of virtual dispatch. // // * Dispatch() dispatches to a classic, polymorphic visitor, i.e. its // first argument must be derived from AbstractNodeVisitor; it is also // possible to arrange for generic visitation, i.e. templatize the first // argument of Dispatch(). // // * AFAICS code pertaining to Node<> wrappers is inlineable on modern // compilers. // // * It is possible to model "multiple inheritance" on Node<>s, // which corresponds to grammar in which many nonterminals produce // the same RHS. Observe, that unlike standard implementation // in C++ typesystem, such "multiple inheritance" does not involve // memory overhead. // -------- Example --------- class MyVisitor : public AbstractNodeVisitor { public: MyVisitor(ostream& os) : os_(os) {} void Visit(Node<Plus> p) { os_ << "Plus("; Dispatch(*this, p.GetLeft()); os_ << ","; Dispatch(*this, p.GetRight()); os_ << ")"; } virtual void Visit(Node<Var> v) { os_ << "Var(" << v.GetId() << ")"; } virtual void Visit(Node<UMinus> u) { os_ << "UMinus("; Dispatch(*this, u.GetChild()); os_ << ")"; } virtual void Visit(Node<Invalid>) { os_ << "Invalid()"; } private: ostream& os_; }; int main() { Node<Plus> p = Node<Plus>::New( Node<UMinus>::New( Node<Var>::New("x") ) , Node<Var>::New("y") ); Node<Expr> e = p; // "derived-to-base" MyVisitor v(std::cout); Dispatch(v, e); cout << endl; Node<Expr> f = Node<UMinus>::New(Node<Var>::New("z")); p.SetRight(f); Dispatch(v, e); cout << endl; } ================================================================== Best regards Grzegorz ################################################################## # Grzegorz Jakacki Huada Electronic Design # # Senior Engineer, CAD Dept. 1 Gaojiayuan, Chaoyang # # tel. +86-10-64365577 x2074 Beijing 100015, China # # Copyright (C) 2004 Grzegorz Jakacki, HED. All Rights Reserved. # ################################################################## |