Re: [Opencxx-users] future directions

Grzegorz Jakacki wrote:

>>* provide a much better regression test coverage on different levels so we can
>>   measure to what degree changes break compatibility (some will be unavoidable)
> 
> 
> This seems to be a lot of work.

but it's worth it, I believe.

[...ptree -> AST abstraction...]

> I can see potential problems where existing code keeps e.g. definitions and
> expressions on the same list, which now is a list of Ptree*, but could be no
> more if we want to inject more type information.

agreed. On the other hand, I'm not suggesting that we change the ptree structure,
just that we make the ptree class tree richer (type- and API-wise) so that
users *can* use the high-level type information and API. Walking the ptree via
the 'untyped' Ptree nodes will still be possible.

>>* I'm looking into the 'ctool' code I imported into synopsis.
>>   http://synopsis.fresco.org/viewsvn/synopsis-Synopsis/trunk/Synopsis/Parsers/C/
>>   contains a class hierarchy for C, which shouldn't be that different from a C++ grammar,
>>   so it may serve as inspiration.
> 
> 
> Observe that existing Ptree classes already constitute a hierarchy
> (however majority of the code flattens it by upcasting to Leaf/NonLeaf).

exactly. I can see the attempt to get more type info into the ptree, but it doesn't
look complete. I think completeness should be defined by this criterion:
I can write a Visitor that *completely* traverses the ptree, recovering
all the type info, without ever touching the methods 'Cdr()' and 'Car()'.
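
To make that criterion concrete, here is a rough sketch of the kind of
traversal I have in mind. All class names (Node, Identifier, Block,
IfStatement, IdentifierCollector) are made up for illustration and are not
existing opencxx classes; the point is only that the visitor reaches every
child through a named member.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical; none are existing opencxx classes.
struct Identifier;
struct IfStatement;
struct Block;

// One visit() overload per concrete node type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Identifier&) = 0;
    virtual void visit(const IfStatement&) = 0;
    virtual void visit(const Block&) = 0;
};

struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor& v) const = 0;
};

struct Identifier : Node {
    std::string name;
    explicit Identifier(std::string n) : name(std::move(n)) {}
    void accept(Visitor& v) const override { v.visit(*this); }
};

struct Block : Node {
    std::vector<std::unique_ptr<Node>> statements;
    void accept(Visitor& v) const override { v.visit(*this); }
};

struct IfStatement : Node {
    std::unique_ptr<Node> condition;
    std::unique_ptr<Node> body;
    IfStatement(std::unique_ptr<Node> c, std::unique_ptr<Node> b)
        : condition(std::move(c)), body(std::move(b)) {
        assert(condition && body);  // both children are required
    }
    void accept(Visitor& v) const override { v.visit(*this); }
};

// Walks the whole tree through named members only; no Car()/Cdr().
struct IdentifierCollector : Visitor {
    std::vector<std::string> names;
    void visit(const Identifier& id) override { names.push_back(id.name); }
    void visit(const IfStatement& s) override {
        s.condition->accept(*this);
        s.body->accept(*this);
    }
    void visit(const Block& b) override {
        for (const auto& s : b.statements) s->accept(*this);
    }
};

// Builds the tree for "if (x) { y; z; }" by hand and collects its identifiers.
std::vector<std::string> collect_sample() {
    std::unique_ptr<Block> block(new Block);
    block->statements.emplace_back(new Identifier("y"));
    block->statements.emplace_back(new Identifier("z"));
    IfStatement stmt(std::unique_ptr<Node>(new Identifier("x")), std::move(block));
    IdentifierCollector collector;
    stmt.accept(collector);
    return collector.names;
}
```

A real hierarchy would of course have many more node types; the sketch only
shows that the traversal never needs to know the list layout underneath.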

> Not at all. Node<IfStatement> is meant to work as a smart pointer to
> Ptree. It is to be used by value and not stored anywhere, e.g.:
> 
>   Node<Definition> d = ParseDefinition("int main() {}");
> 
>   string id = d.GetIdentifier(); // instead of d->Cdr()->Car()-> etc.

hmm, but then the user already knows that he is parsing a definition,
so the type info isn't adding much value. Read: I want an *abstract*
syntax tree, such that I can run 'StatementList *ast = parse(my_file);'
and then inspect the returned 'ast' by some custom visitors.

In particular, if I want to expose this ast to a scripting frontend
such as python, it is impractical to have these wrapper classes be
temporary objects, as that would make the binding quite complex and
slow.

> Example 1: You decide to change the structure of the Ptree nodes
> representing some C++ construct (e.g. to store more information in it).
> This will break all clients depending directly on Ptree, because
> they need to update Car/Cdr paths to reflect new structure. However,
> if there is Node<> iface in the middle, you update Car/Cdr paths
> only in Node<>.

But what about the Node's type? If my wrapped ptree is a declaration,
but by simply modifying the ptree I change that to be a function call,
the wrapper's type ('Node<Declaration>') would be wrong.

Of course, if we expose the two APIs in parallel we'll always have
this problem. Maybe the ptree API should not expose any modifiers,
i.e. exclusively operate on const ptrees.
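
Roughly, and only as a sketch: the Ptree struct below is a two-field stand-in
I made up (the real Ptree has no such 'kind' field), as are the tag types and
the wraps_as_declaration() helper. The wrapper checks the node kind once, on
construction, and then only hands out const access.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; the real Ptree looks nothing like this.
enum class Kind { Declaration, FunctionCall };

struct Ptree {
    Kind kind;
    std::string text;
};

struct DeclarationTag  { static constexpr Kind kind = Kind::Declaration; };
struct FunctionCallTag { static constexpr Kind kind = Kind::FunctionCall; };

// Typed, read-only view of a ptree node. The kind is verified when the
// wrapper is created, and no modifier is exposed afterwards.
template <typename Tag>
class Node {
public:
    explicit Node(const Ptree& p) : p_(&p) {
        if (p.kind != Tag::kind)
            throw std::invalid_argument("ptree kind does not match wrapper type");
    }
    const std::string& text() const { return p_->text; }

private:
    const Ptree* p_;
};

// Reports whether a node can be viewed as a declaration.
bool wraps_as_declaration(const Ptree& p) {
    try {
        Node<DeclarationTag> view(p);
        (void)view;  // only the constructor check matters here
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

This only protects against changes made through the wrapper. If someone else
still holds a non-const pointer to the same ptree, the wrapper's type can go
stale all the same, which is why I'd lean towards const ptrees throughout.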

> Example 2: You add a new kind of node, e.g. for a "using" declaration.
> All existing clients of Node<> will break if you just add a new type to
> the Node visitor. However, you can branch the Node<> iface into two versions:

That's a good point. The Visitor pattern really assumes the type hierarchy
of the visited objects to be stable.

[...]

>>Of course, such a move is not a simple decision. You really have to evaluate
>>what future directions you want to take, and whether that fits with the
>>synopsis framework. As I said, my interest in opencxx is in the context
>>of synopsis, so I'll probably do most of my work from there. Please take
>>this as an offer and invitation, not an attempt to fork your project.
> 
> 
> This requires some thought indeed.

There is no need for a quick decision. I'm just observing that right now
we are each working on a separate branch, so merging them would be practical,
all the more so if we are going to look into adopting qmtest as a
unit testing framework. (Right now I don't unit test the opencxx
backend; I only test the generated synopsis AST, which I dump.)

The most important thing I believe is that we define what we each expect
from opencxx (and synopsis) in the future, i.e. whether we are aiming at
the same things, and whether the common goals suggest that we both work
from a common code base, or whether the overlap is simply not large enough
to be worth a merge.

>>It shouldn't be hard, and it should be possible to do that incrementally.
>>The important step is some basic restructuring of the Parser class into
>>an interface and implementation of the statements that are common between
>>all flavours of C and C++, and put the rest into subclasses (K&R, ansi C, C89, etc.)
> 
> 
> An important thing to consider here is whether we want OpenC++ to be validating.
> If not, then we don't need to bother that "class" occurs in C source.
> Personally I think that a validating parser/elaborator is much harder to write
> and IMO this should not be our priority, since we have no chance of reaching
> the quality of validation that e.g. g++, MSVC or EDG present today.

I don't understand what you mean by 'validate'. opencxx (and ctool)
are well able to indicate 'parse errors', even though the error message
is not very high level, as that would indeed require more language-specific
semantics to be available to the parser (or to the object that is trying to
issue a meaningful error message).
What do you mean by 'bother that "class" occurs in C'? Right now the lexer
will return a 'CLASS' token if it runs into the 'class' string. That doesn't
make sense in C, where it would have to be an ordinary identifier. The same
goes for all the other C++-only keywords.
So removing those C++-specific keywords when scanning in C mode sounds
like the (easy) first step towards C compatibility.
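
As a minimal sketch of that first step (this is not the actual opencxx lexer;
the Dialect and TokenKind enums, the classify() function and the abbreviated
keyword lists are all made up for illustration), the keyword lookup would
simply depend on the language mode:

```cpp
#include <string>
#include <unordered_set>

// Hypothetical names; the keyword sets are deliberately incomplete.
enum class Dialect { C, Cxx };
enum class TokenKind { Identifier, Keyword };

// Classifies a scanned word, treating C++-only keywords as plain
// identifiers when the lexer runs in C mode.
TokenKind classify(const std::string& word, Dialect dialect) {
    static const std::unordered_set<std::string> common = {
        "int", "if", "else", "while", "return", "struct"};
    static const std::unordered_set<std::string> cxx_only = {
        "class", "namespace", "template", "virtual", "using"};

    if (common.count(word)) return TokenKind::Keyword;
    if (dialect == Dialect::Cxx && cxx_only.count(word)) return TokenKind::Keyword;
    return TokenKind::Identifier;
}
```

With this, 'class' comes back as an identifier in C mode and as a keyword in
C++ mode, while the parser itself is left untouched.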

> However, we do have a chance to do something new and useful in providing
> refactoring tool and frontend libraries, and I think this should be our
> focus now.

agreed. I'm just wondering how much effort it would be to extend opencxx's
scope to C, as I can see a tool like that being very useful, for example to
the GNOME folks.

Regards,
		Stefan