From: Grzegorz J. <ja...@he...> - 2004-06-08 06:36:13
On Mon, 7 Jun 2004, Stefan Seefeld wrote:

> Grzegorz Jakacki wrote:

[...]

> > I did not mean what you understood. In fact I want Node<Definition> to be
> > "abstract". The second part of example was a miss, let me fix it:
> >
> >     Node<Definition> d = ParseDefinition("int main() {}");
> >     SomeVisitor v;
> >     v(d); //<--- calls v.Visit(Node<FunctionDefinition>)
>
> Ah, so 'ParseDefinition()' is an 'abstract factory' ?

Exactly. (Observe that rParseDefition() in the current code is an
Abstract Factory for the Ptree hierarchy.)

> That means
> that all those 'Node<>' classes derive from an abstract 'NodeBase' class,

Nope. Read "Node<>", think "smart pointer". Node<> is intended to be a
smart pointer with shallow copy. Node<FunctionDefinition> will have a
default conversion to Node<Definition>. A visitor for the Node<>
hierarchy will dispatch Node<Definition> to
Visit(Node<FunctionDefinition>), Visit(Node<ClassDefinition>) etc.

> at which point I'm wondering what the advantage of such a templated
> class hierarchy is as opposed to a simple traditional one
> (i.e. instead of 'Node<Foo>' just using 'Foo', as 'Foo' would need to
> be defined anyways)

Foo does not need to be defined; a declaration alone is sufficient. But
we can also reuse the PtreeXxxx classes for Foo (it would make sense
that in general Node<PtreeIfStatement> is a wrapper for
PtreeIfStatement*).

The advantage of using Node<Foo> over using Foo* is that Node<> wrappers
do not expose Car/Cdr. Moreover, in some cases determining the nature of
a Ptree (e.g. whether it is if-then or if-then-else) requires some code
(e.g. go to Car()->Car()->Car()->Cdr() and check if it is NULL). The
Ptree hierarchy has only PtreeIfStatement, which covers both if-then and
if-then-else. With a Node<> wrapper we can have Node<IfThen> and
Node<IfThenElse>, both wrapping PtreeIfStatement*, but carrying
additional information in the wrapper type.

Yet another argument is that this kind of API is truly an *interface* to
the tree data.
The Node<> scheme allows many interfaces. In particular, recall my
example of how people want the "+" node to be exposed in the API: some
want to see "+" as a binary operator, others as a multiary one. Assuming
that Node<> shows binary plus, a client can write or generate another
API that exposes "multiary" plus. Clients using different APIs can still
exchange the underlying Ptree data structure; they just see different
views.

> >>In particular, if I want to expose this ast to a scripting frontend
> >>such as python, it is impractical to have these wrapper classes be
> >>temporary objects, as that would make the binding quite complex and
> >>slow.
> >
> > (1) Why? (I have never seen how you create a binding, so I don't have
> > an idea what happens.)
>
> In general, the idea for this particular binding would be to allow
> users to define 'Walker' classes in both, C++, as well as python.
> If I'm in python and I get hold of a 'Declaration' object, calling
> a method (or attribute or property etc.) will result in the invocation
> of the associated C/C++ method. But since python has its own idea about
> function invocation, parameter passing, etc., each C++ method needs to
> be wrapped by a C function that deals with parameter / return value
> conversion / wrapping. So if a method returns a reference to another
> C++ object, that has to be wrapped in its respective python object.
> If these objects are returned by value, you get into a lot of trouble
> because it's hard to track dependencies (i.e. reference counts) as
> nodes refer to and depend on each other in a parse tree. It would
> be far more easy to manage child / parent links internally, so the
> python binding wouldn't need to care as long as the referer is still
> alive.

I think I need an example. Also, I believe that you are referring to the
situation where Node<> wrappers constitute a polymorphic hierarchy,
which is not what I had in mind.
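To make the "smart pointer, not polymorphic hierarchy" point concrete,
here is a minimal sketch of what such a Node<> could look like. It is
only an illustration under assumptions: the Ptree stand-in, the Kind
tag, the tag types and the KindRecorder visitor are all hypothetical and
are not the existing OpenC++ classes. The tag types are defined (rather
than merely declared) only because this sketch uses std::is_base_of to
express which conversions are allowed.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the real Ptree: a cons cell plus a tag that
// says what kind of definition the subtree encodes.
enum class Kind { FunctionDefinition, ClassDefinition };

struct Ptree {
    Kind kind;
    Ptree* car;
    Ptree* cdr;
};

// Hypothetical tag types; inheritance here only encodes "is convertible to".
struct Definition {};
struct FunctionDefinition : Definition {};
struct ClassDefinition : Definition {};

class Visitor;

// Node<T> is a value type holding a Ptree*. Copying is shallow, and
// Car/Cdr are not reachable through its public interface.
template <class T>
class Node {
public:
    explicit Node(Ptree* p) : p_(p) {}

    // Implicit conversion Node<U> -> Node<T> whenever T is a base of U,
    // e.g. Node<FunctionDefinition> -> Node<Definition>.
    template <class U, class = typename std::enable_if<
                           std::is_base_of<T, U>::value>::type>
    Node(const Node<U>& other) : p_(other.p_) {}

    // True if both wrappers refer to the same underlying tree.
    bool SameTree(const Node& other) const { return p_ == other.p_; }

private:
    Ptree* p_;
    template <class U> friend class Node;
    friend class Visitor;
};

// The visitor inspects the underlying tree and re-wraps it in the most
// specific Node<> type before calling the matching Visit overload.
class Visitor {
public:
    virtual ~Visitor() = default;

    void operator()(Node<Definition> d) {
        switch (d.p_->kind) {
        case Kind::FunctionDefinition:
            Visit(Node<FunctionDefinition>(d.p_));
            break;
        case Kind::ClassDefinition:
            Visit(Node<ClassDefinition>(d.p_));
            break;
        }
    }

    virtual void Visit(Node<FunctionDefinition>) {}
    virtual void Visit(Node<ClassDefinition>) {}
};

// Example client: records which overload was selected.
struct KindRecorder : Visitor {
    std::string seen;
    void Visit(Node<FunctionDefinition>) override { seen = "function"; }
    void Visit(Node<ClassDefinition>) override { seen = "class"; }
};
```

In this sketch the only virtual functions live in the visitor; Node<>
itself has no vtable and can be passed around by value, which is the
property the discussion above relies on.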
> > (2) What if instead of coding these wrappers, we generate them
> > based on the "Cdr/Car Ptree -> highlevel Ptree" mapping?
> > We could generate "C++ wrappers" and "Python wrappers"
>
> How / where would that mapping be defined ?

As a text file, e.g.:

    IfPtree : IfThenElse
        Cond    Cdr()->Car()
        Then    Cdr()->Cdr()->Cdr()->Car()
        Else    Cdr()->Cdr()->Cdr()->Cdr()->Cdr()->Car()

or as Python data, e.g.:

    {("IfPtree", "IfThenElse"): {
        "Cond" : "Cdr()->Car()",
        "Then" : "Cdr()->Cdr()->Cdr()->Car()",
        "Else" : "Cdr()->Cdr()->Cdr()->Cdr()->Cdr()->Car()"
        }
     ...
    }

> > My point is to avoid making intrusive changes to Ptree hierarchy, as it
> > breaks essentially all OpenC++, so compensating for such changes is both
> > expensive and error-prone.
>
> I understand. I don't have a good understanding how widespread opencxx'
> use is these days, i.e. how disruptive a change like this would be to its
> users.

In fact I had in mind just the damage in the OpenC++ backend (and of
course the bill is higher when you keep external clients in mind).

> > You decide to store more information, say type annotation, at declaration
> > node. I mean as a design decision, not run-time decision. There are
> > several kinds of declarations, each is encoded with some Ptree shape. You
> > decide, that the type annotation will be added atop the declaration tree,
> > so where up to now you had
> >
> >        NonLeaf(Decl)
> >         /       \
> >     NonLeaf   NonLeaf
> >       ...       ...
> >
> > you want to have
> >
> >            NonLeaf(Decl)
> >             /        \
> >    [annotation]     NonLeaf   <--- the old tree
> >                     /     \
> >                 NonLeaf  NonLeaf
> >                   ...      ...
>
> No. I didn't mean to suggest a topological change. Rather, I suggest
> that instead of using raw 'NonLeaf' (say) objects, we use a richer type
> system with types such as 'Declaration', 'Statement', etc. that all *derive from*
> NonLeaf.

That's how the code works today.
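Coming back to the Car/Cdr path mapping proposed above: a rough sketch
of how such path strings could be interpreted over a cons-cell tree is
given below. All names here (Cell, Arena, Follow, kIfThenElse) are
hypothetical, and the toy list layout used with it is chosen to match
the illustrative paths from the mapping example, not the actual shape of
PtreeIfStatement. A real generator would more likely emit accessor code
from the mapping than interpret it at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical cons cell standing in for Ptree.
struct Cell {
    std::string atom;  // token text for leaves, empty for list cells
    Cell* car;
    Cell* cdr;
};

// Owns all cells; std::deque keeps earlier elements at stable addresses
// when new ones are appended.
struct Arena {
    std::deque<Cell> cells;

    Cell* Leaf(const std::string& text) {
        cells.push_back(Cell{text, nullptr, nullptr});
        return &cells.back();
    }
    Cell* Cons(Cell* car, Cell* cdr) {
        cells.push_back(Cell{std::string(), car, cdr});
        return &cells.back();
    }
    // Builds a proper list (a1 a2 ... an) out of cons cells.
    Cell* List(const std::vector<Cell*>& items) {
        Cell* tail = nullptr;
        for (auto it = items.rbegin(); it != items.rend(); ++it)
            tail = Cons(*it, tail);
        return tail;
    }
};

// Follows a path such as "Cdr()->Cdr()->Car()" starting at node.
// Returns nullptr if the tree is too short or a step is not recognised.
Cell* Follow(Cell* node, const std::string& path) {
    std::size_t pos = 0;
    while (node != nullptr && pos < path.size()) {
        const std::size_t end = path.find("->", pos);
        const std::string step = path.substr(
            pos, end == std::string::npos ? std::string::npos : end - pos);
        if (step == "Car()")
            node = node->car;
        else if (step == "Cdr()")
            node = node->cdr;
        else
            return nullptr;
        if (end == std::string::npos)
            break;
        pos = end + 2;  // skip past "->"
    }
    return node;
}

// The IfThenElse entry of the mapping, copied from the example above.
const std::map<std::string, std::string> kIfThenElse = {
    {"Cond", "Cdr()->Car()"},
    {"Then", "Cdr()->Cdr()->Cdr()->Car()"},
    {"Else", "Cdr()->Cdr()->Cdr()->Cdr()->Cdr()->Car()"},
};
```

With a six-element toy list (if, cond, then, stmt, else, stmt), Follow
returns the expected leaves for all three fields; with a four-element
list the "Else" path runs off the end and yields a null pointer, which
is the kind of check that could tell an IfThen view from an IfThenElse
view.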
> And, as these types know the topology of the sub-trees they are composed
> of, they could provide typed access to the subtrees:
>
> struct Declaration : NonLeaf
> {
>   Type *type() { return static_cast<Type *>(Car()->Car());}
>   ...
> };
>
> which is technically nothing else but what you've been describing above with
> '[annotation]',

OK, I see. I think I had yet another use case where extending the
topology is useful, but I am unable to recall it now. (I suppose I was
thinking along these lines when I was trying to find a way for clients
to put their typed data in the ptree; e.g. the OpenC++ backend needs to
store type encodings in some ptrees, but in general not all clients need
to.)

> but these metadata are not stuffed into the ptree by the user,
> but by the compiler, i.e. API compatibility is preserved.

I think I don't understand.

> > Clients should be
> > safe when they commit exclusively to high-level API. This warranty is void
> > once they start tampering with tree using Ptree API, as clearly Ptree API
> > lets you create a structure, that does not map onto any type-correct tree
> > in the sense of Node<> API. I would be happy with Node<> API coredumping
> > or throwing as soon as it finds out that somebody put any kind of rubbish
> > into underlying Ptree tree. Alternatively Node<> API could contain
> > something like "Invalid" node type that would be exposed in places where
> > underlying Ptree structure is broken in the sense of Node<> API.
>
> I see your point and I agree to a certain degree. However, I believe that
> we could enforce validity constraints by making the ptree access const
> through the 'Node<>' API.
> In other words, if you want the freedom to manipulate the ptree disregarding
> the C++ syntax, you'd have to get hold of a (non-const) pointer to a ptree.
> Hmm, that could mean that we provide two separate parsers, one generating
> a ptree, the other generating a 'Node<>' tree.
> But then it may be simpler to have a single parser generating the ptree
> as before, and provide a Walker that maps that to a 'Node<>' tree (would
> that be an 'AST' ??)

I think we are converging, but:

  * Why do you think that the Node<> API would need to be const?

  * Even with two APIs (const/non-const), why would we need two parsers?
    We have one parser now with a non-const API. We can create a wrapper
    that wraps this parser in a const API, period. (Maybe this is what
    you have in mind when writing about a Walker that maps ptree to
    Node<>?)

> [...]
>
> > Oh, I see now. So to restate my point, I think we should not invest time
> > in validating the OpenC++ input. I would say that we should assume that
> > input source code is valid C++.
>
> But the lexer and parser already look for the 'class' token (and so
> some walker may already recognize 'class Foo;' to be a forward declaration, say).

Sorry, I was not clear enough. I understand that we cannot use the C++
parser as is to parse C code.

> All I'm pondering about now is whether optionally removing the C++ keywords
> would get us closer to a C parser, and if so, what else needs to be done to
> complete the step so opencxx could be used for both languages.

This is an interesting question, i.e. can the "common factor" of the C
and C++ parsers be factored out, and how. Switching between C/C++
keywords should be easy with the existing code. Moreover, the lexer is
encapsulated, so this is not an issue either. The fun begins in the
parser.

> > My concern is that we are trying to go into too many directions:
> >
> >   * making type elaborator and program object model into library
> >   * typesafe API
> >   * Python bindings
> >   * C compatibility
> >
> > (Not to mention areas where we need quality improvements as templates and
> > overloading.)
>
> yeah, that's too much to be worked on at the same time. I started
> this whole thread to get feedback about possible use cases and to
> have a discussion about how to support them, at some point in the future.
> I'm not working on all these fronts in parallel. I think the first and
> third point (making opencxx a library and providing scripting access to it)
> is the easiest and most useful one in the short run.

This is exactly what I think.

> C compatibility is
> something quite apart, i.e. I don't expect this to have much (if any)
> impact on the rest. Providing a type-safe ptree / AST API is probably
> the hardest part of this all.

I think it depends. There are many possible AST object models. In
particular, the existing Ptree hierarchy gives rise to one of them.
Having a read-only type-safe API along this model is quite easy; it is
just a matter of determining the Car/Cdr paths. Whether this API is
useful and convenient is another question.

> Regards,
> Stefan

##################################################################
# Grzegorz Jakacki                Huada Electronic Design        #
# Senior Engineer, CAD Dept.      1 Gaojiayuan, Chaoyang         #
# tel. +86-10-64365577 x2074      Beijing 100015, China          #
# Copyright (C) 2004 Grzegorz Jakacki, HED. All Rights Reserved. #
##################################################################