From: Stefan S. <se...@sy...> - 2004-08-31 06:13:55
|
hi there,

as I have been getting into the opencxx internals over the last weeks, I've started to refactor / redesign the code. Here are some thoughts about this effort.

I renamed some classes to make their goal clearer, and I changed the method names to be more consistent with the rest of synopsis' naming conventions. Anyways, here is the meat:

* reduce unnecessary abstraction: The 'Program' classes are gone and replaced by a single 'Buffer' class, which holds the data the ptree is constructed on top of. If you want to read from file, string, or console, just use the appropriate streambuf type and pass that to the Buffer's constructor.

* rewrite the Lexer: enhance Token and Lexer classes. In particular, consider the Buffer data const, i.e. make all methods take 'const char *' instead of 'char *'. Simplify the Lexer code to use std::map and std::queue instead of its own types. The performance penalty is tolerable (< 10%). The benefit is that individual keywords can be added/removed at runtime, which allows for 'cross-compilation' as well as support for both C++ and C.

* clean up the parse tree types: Some type renaming aside, I cleaned up the 'PTree::Node' (former 'Ptree') interface substantially: I introduced a generic 'Visitor' class and added 'accept' methods to all ptree node types. I then reimplemented the 'display' and 'write' methods as Visitor subclasses. I did the same for the 'type_of' and 'translate' methods, so now PTree nodes don't know about either TypeInfo or Walker any more. (Phew !!) Finally, I cleaned up the Encoding class such that PTree::Nodes now hold PTree::Encoding instances instead of 'char *'.

That's about where I am right now. The PTree namespace looks quite clean already. One issue I'm not quite satisfied with is the design of Encoding and TypeInfo (and Environment, just to name all classes that touch this design): I'd like to get back to this in the not-so-distant future.

Next comes the 'Environment' class. As I said in my last mail, that looks a bit monolithic, i.e. instead of putting specific cases into subclasses (class scope with lookup in base classes, to name one example), it's all contained in this single class.

Once these points are addressed, I believe it would be a good time to start thinking about the Parser and how the symbol lookup could be integrated right into the parse stage (instead of doing a second pass over the ptree via ClassWalker).

What do you think about these changes ? Whether we decide to work on the parser or keep it as it is, I believe the changes I have applied already provide an important cleanup of the design, and thus make it much easier to maintain and extend the existing functionality.

Comments are highly appreciated,

Stefan |
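The Visitor arrangement described in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: 'Atom', 'List' and 'Display' are hypothetical stand-ins, not the actual PTree classes, and the real 'accept' / 'visit' signatures may well differ. The point is that the node classes carry no display logic of their own.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node types; the real PTree hierarchy is much larger.
class Atom;
class List;

// Generic visitor: one overload per concrete node type.
class Visitor
{
public:
  virtual ~Visitor() = default;
  virtual void visit(const Atom &) = 0;
  virtual void visit(const List &) = 0;
};

// Base node: knows nothing about display, writing, or type analysis.
class Node
{
public:
  virtual ~Node() = default;
  virtual void accept(Visitor &) const = 0;
};

class Atom : public Node
{
public:
  explicit Atom(std::string text) : text_(std::move(text)) {}
  const std::string &text() const { return text_; }
  void accept(Visitor &v) const override { v.visit(*this); }
private:
  std::string text_;
};

class List : public Node
{
public:
  void append(std::unique_ptr<Node> child)
  {
    assert(child); // children must be non-null
    children_.push_back(std::move(child));
  }
  const std::vector<std::unique_ptr<Node>> &children() const { return children_; }
  void accept(Visitor &v) const override { v.visit(*this); }
private:
  std::vector<std::unique_ptr<Node>> children_;
};

// 'display' implemented outside the node classes, as a Visitor subclass.
class Display : public Visitor
{
public:
  void visit(const Atom &a) override { os_ << a.text(); }
  void visit(const List &l) override
  {
    os_ << '[';
    bool first = true;
    for (const auto &child : l.children())
    {
      if (!first) os_ << ' ';
      first = false;
      child->accept(*this); // double dispatch back into this visitor
    }
    os_ << ']';
  }
  std::string str() const { return os_.str(); }
private:
  std::ostringstream os_;
};

// Usage: build a two-level tree and render it with the Display visitor.
std::string display_example()
{
  List root;
  root.append(std::make_unique<Atom>("int"));
  auto inner = std::make_unique<List>();
  inner->append(std::make_unique<Atom>("x"));
  root.append(std::move(inner));
  Display d;
  root.accept(d);
  return d.str();
}
```

A 'write', 'type_of' or 'translate' operation would be another Visitor subclass in this scheme, so adding one does not touch the node classes.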
From: <se...@in...> - 2004-08-31 16:03:15
|
On Tue, 2004-08-31 at 02:10, Stefan Seefeld wrote:
> hi there,
>
> as I have been getting into the opencxx internals over the last weeks,
> I've started to refactor / redesign the code. Here are some thoughts
> about this effort.
>
> I renamed some classes to make their goal clearer, and I changed the
> method names to be more consistent with the rest of synopsis' naming
> conventions. Anyways, here is the meat:
[...]
> That's about where I am right now. The PTree namespace looks quite clean
> already. One issue I'm not quite satisfied with is the design of Encoding
> and TypeInfo (and Environment, just to name all classes that touch this
> design): I'd like to get back to this in the not-so-distant future.
> Next comes the 'Environment' class. As I said in my last mail, that looks
> a bit monolithic, i.e. instead of putting specific cases into subclasses
> (class scope with lookup in base classes, to name one example), it's all
> contained in this single class.
>
> Once these points are addressed, I believe it would be a good time to
> start thinking about the Parser and how the symbol lookup could be
> integrated right into the parse stage (instead of doing a second pass
> over the ptree via ClassWalker).
>
> What do you think about these changes ? Whether we decide to work on the
> parser or keep it as it is,

Templates make a C++ parser at least 'x' times harder. Can you explain how we can handle template specialization with the current parser?

What I understand:
- specializing templates requires that the tokens for a template declaration be saved so that the template can be reprocessed when the actual template arguments are specified.
- it requires the syntax phase to perform near-complete type, scope, and expression analysis, including function overloading.

The present design predates such changes to the language. In my vocabulary, the present design is "C-Front style".

So my belief is that we need to rejuvenate the design. But my question is how.

> I believe the changes I have applied already provide an important
> cleanup of the design, and thus make it much easier to maintain
> and extend the existing functionality. |
From: Stefan S. <se...@sy...> - 2004-09-01 00:27:35
|
Gilles J. Seguin wrote:
> Templates make a C++ parser at least 'x' times harder.
> Can you explain how we can handle template specialization with current
> parser.
>
> What I understand,
> - specializing templates requires that the tokens for a template
> declaration be saved so that the template can be reprocessed when the
> actual template arguments are specified.

I'm not sure whether that is necessary. It might be if you wanted to do some validation, such as making sure that the types on which a template should be instantiated fulfill the requirements of the template. I don't think this is all that useful at the moment.

Existing difficulties in resolving ambiguities during template parsing aside, I believe everything we need is already in place. The parser can parse both template declarations and template specializations. We only need to look up the first when we run into the second. All we need is an environment...

> - it requires the syntax phase to perform near-complete type, scope, and
> expression analysis, including function overloading.

Right, it would, if we tried to validate the code.

> So my belief is that we need to rejuvenate the design.
> But my question is how.

Functionality-wise I believe the big issue is with ambiguities the current parser can't resolve due to the two-phase parsing. We should try to merge these two phases into one, as we discussed in another thread.

Another issue is that of the code being quite monolithic. Now that I've had a first cut at a refactoring I'm more convinced than ever that it is possible to clear up the code substantially, and thus make it simpler to use and enhance (the old topic of a new AST API on top of the PTree stuff comes to mind !).

As far as synopsis is concerned, I'm looking into exposing these APIs to python and then being able to script on top of that. That would remove the need for a complete, portable C++ application frontend, i.e. I believe it's far easier to write an 'occ' application in python, where running subprocesses and loading plugins is far simpler and more portable than doing the same in C/C++. Just imagine, we could completely drop all this libtool / lddl business...!

Regards,

Stefan |
From: Grzegorz J. <ja...@ac...> - 2004-09-01 14:18:49
|
Stefan Seefeld wrote:
> Gilles J. Seguin wrote:
>
>> Templates make a C++ parser at least 'x' times harder.
>> Can you explain how we can handle template specialization with current
>> parser.
>>
>> What I understand,
>> - specializing templates requires that the tokens for a template
>> declaration be saved so that the template can be reprocessed when the
>> actual template arguments are specified.
>
> I'm not sure whether that is necessary. It might be if you wanted to
> do some validation such as make sure that the types on which a template
> should be instantiated fulfill the requirements of the template.
> I don't think this is all that useful at the moment.

Unfortunately it all is necessary. Consider this:

    class Q : public Foo<int>::type { };

You need to have full template support in order to get the base class of Q.

However, if you restrict yourself just to parsing, then you need only one thing: you need to know if an identifier is a template (and this knowledge is necessary only when parsing expressions, to decide whether '<' and '>' are angle brackets or relational operators).

> Existing difficulties in resolving ambiguities during template parsing
> aside, I believe everything we need is already in place. The parser
> can parse both template declarations as well as template specializations.
> We only need to look up the first when we run into the second. All we
> need is an environment...
>
>> - it requires the syntax phase to perform near-complete type, scope, and
>> expression analysis, including function overloading.
>
> Right, it would, if we tried to validate the code.
>
>> So my belief is that we need to rejuvenate the design.
>> But my question is how.
>
> Functionality-wise I believe the big issue is with ambiguities the current
> parser can't resolve due to the two-phase parsing. We should try to merge
> these two phases into one, as we discussed in another thread.

Agreed.

> Another issue is that of the code being quite monolithic. Now that I've
> had some first cut at a refactoring I'm more convinced than ever that
> it is possible to clear up the code substantially, and thus make it simpler
> to use and enhance (the old topic of a new AST API on top of the PTree
> stuff comes to mind !).
>
> As far as synopsis is concerned, I'm looking into exposing these APIs
> to python and then being able to script on top of that.

That would also facilitate unit testing, right?

BR
Grzegorz

> That would remove
> the need for a complete, portable C++ application frontend, i.e. I believe
> it's far easier to write an 'occ' application in python, where running
> subprocesses and loading plugins is far more simple and portable than
> doing the same in C/C++. Just imagine, we could completely drop all this
> libtool / lddl business...!
>
> Regards,
> Stefan |
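To make the 'Foo<int>::type' point concrete, here is a small self-contained sketch. Only the declaration of Q comes from the message above; the primary template, the explicit specialization, and the base classes 'Generic' and 'Special' are hypothetical additions for illustration. With them in place, the base class of Q can only be determined by selecting the right specialization of Foo, which is the template support the message says is required.

```cpp
#include <cassert>
#include <type_traits>

struct Generic {}; // hypothetical base named by the primary template
struct Special {}; // hypothetical base named by the specialization

// Primary template: Foo<T>::type is Generic unless a specialization says otherwise.
template <typename T>
struct Foo
{
  typedef Generic type;
};

// Explicit specialization: Foo<int>::type is Special.
template <>
struct Foo<int>
{
  typedef Special type;
};

// The declaration from the message.
class Q : public Foo<int>::type
{
};

// Run-time check of what the compiler resolved the base class to.
void check_base_of_q()
{
  assert((std::is_base_of<Special, Q>::value));
  assert(!(std::is_base_of<Generic, Q>::value));
}
```

A tool that only records "Q derives from something spelled Foo<int>::type" can still parse this; a tool that has to report the actual base class must perform the specialization lookup.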
From: Stefan S. <se...@sy...> - 2004-09-02 00:30:17
|
Grzegorz Jakacki wrote:
>> As far as synopsis is concerned, I'm looking into exposing these APIs
>> to python and then being able to script on top of that.
>
> That would also facilitate unit testing, right?

yes indeed.

Regards,

Stefan |
From: Grzegorz J. <ja...@ac...> - 2004-09-01 14:11:12
|
Gilles J. Seguin wrote:
[...]
> Templates make a C++ parser at least 'x' times harder.
> Can you explain how we can handle template specialization with current
> parser.

There is almost no support in HEAD (there is some code in the 'exp_templates' branch, but it is dated and perhaps not worth dragging in).

> What I understand,
> - specializing templates requires that the tokens for a template
> declaration be saved so that the template can be reprocessed when the
> actual template arguments are specified.

More than that. Types in a template definition that do not depend on template arguments must be bound at the point of definition; the remaining types are bound at the point of instantiation. This means that either you have to bind the types after you read the definition, or you need to store an environment "snapshot" from the point of definition, to be able to bind appropriately at the point of instantiation.

> - it requires the syntax phase to perform near-complete type, scope, and
> expression analysis, including function overloading.

Not really. First of all, if you just want to *parse*, then the parser does not have to perform all this analysis; for the parser it is enough to know whether a given identifier denotes a template or not. Second, even if you want to perform all this analysis during parsing, you don't have to (and IMO you should not) put the implementation into the parser.

The design I am suggesting is that Parser stores a pointer to AbstractStaticAnalyzer. AbstractStaticAnalyzer is an interface that Parser uses to inform the rest of the system that it has parsed certain constructs. The implementation of AbstractStaticAnalyzer may, based on the information from Parser, maintain data structures such as representations of scopes, classes or templates. However, another implementation of AbstractStaticAnalyzer may just do nothing, and it can be used for testing or in applications that are not concerned with full template processing (e.g. a pretty-printer).

> Present design predate such change to the language.
> In my vocabulary, present design is "C-Front style".

That's right.

> So my belief is that we need to rejuvenate the design.
> But my question is how.

Let me know if the description above makes sense to you.

BR
Grzegorz |
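The Parser / AbstractStaticAnalyzer split proposed above might look roughly like the sketch below. The two class names come from the message; everything else is an assumption made for illustration: the member functions 'template_declared' and 'is_template', the 'NullAnalyzer' and 'RecordingAnalyzer' implementations, and the toy Parser methods, which stand in for real parsing. Scopes are ignored entirely.

```cpp
#include <cassert>
#include <set>
#include <string>

// Interface named in the message; the member functions are hypothetical.
class AbstractStaticAnalyzer
{
public:
  virtual ~AbstractStaticAnalyzer() = default;
  // Called by the parser after it has parsed a template declaration.
  virtual void template_declared(const std::string &name) = 0;
  // Queried by the parser to tell a template-id from a relational expression.
  virtual bool is_template(const std::string &name) const = 0;
};

// Does nothing: enough for tests or for a pretty-printer.
class NullAnalyzer : public AbstractStaticAnalyzer
{
public:
  void template_declared(const std::string &) override {}
  bool is_template(const std::string &) const override { return false; }
};

// Keeps a flat (scope-less) record of the templates seen so far.
class RecordingAnalyzer : public AbstractStaticAnalyzer
{
public:
  void template_declared(const std::string &name) override { templates_.insert(name); }
  bool is_template(const std::string &name) const override
  {
    return templates_.count(name) != 0;
  }
private:
  std::set<std::string> templates_;
};

// Toy parser: only the wiring to the analyzer is of interest here.
class Parser
{
public:
  explicit Parser(AbstractStaticAnalyzer *analyzer) : analyzer_(analyzer)
  {
    assert(analyzer_); // the parser does not own the analyzer, but needs one
  }
  // Stands in for parsing 'template <...> class name ...;'.
  void parse_template_declaration(const std::string &name)
  {
    analyzer_->template_declared(name);
  }
  // Stands in for the decision taken on 'name <' inside an expression.
  bool starts_template_id(const std::string &name) const
  {
    return analyzer_->is_template(name);
  }
private:
  AbstractStaticAnalyzer *analyzer_;
};
```

With a RecordingAnalyzer, the toy parser reports 'Foo <' as the start of a template-id once 'Foo' has been declared; with a NullAnalyzer it never does, so the analysis cost is paid only by applications that plug in a real implementation.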
From: Grzegorz J. <ja...@ac...> - 2004-09-01 13:57:07
|
Stefan Seefeld wrote:
> hi there,
>
> as I have been getting into the opencxx internals over the last weeks,
> I've started to refactor / redesign the code. Here are some thoughts
> about this effort.
>
> I renamed some classes to make their goal clearer, and I changed the
> method names to be more consistent with the rest of synopsis' naming
> conventions. Anyways, here is the meat:
[...]
> That's about where I am right now.

Cool! Would it make sense to take this code as a base for the OpenC++ Core Lib? Ideally the Core Lib should be reused with no changes between Synopsis and the rest of OpenC++.

BR
Grzegorz |
From: Stefan S. <se...@sy...> - 2004-09-02 00:29:03
|
Grzegorz Jakacki wrote:
> Would it make sense to take this code as a base for OpenC++ Core Lib?

I certainly hope so. That's why I'd like everybody who's interested in OpenC++ development to review what I've been doing !

> Ideally Core Lib should be reused with no changes between Synopsis and
> the rest of OpenC++.

obviously, yes.

Regards,

Stefan |
From: Grzegorz J. <ja...@ac...> - 2004-09-02 13:52:35
|
Stefan Seefeld wrote:
> Finally, I cleaned up the Encoding class such that PTree::Nodes now hold
> PTree::Encoding instances instead of 'char *'.

Does PTree now need to know about Encoding?

BR
Grzegorz |
From: Stefan S. <se...@sy...> - 2004-09-02 22:59:20
|
Grzegorz Jakacki wrote:
>
> Stefan Seefeld wrote:
>
>> Finally, I cleaned up the Encoding class such that PTree::Nodes now
>> hold PTree::Encoding instances instead of 'char *'.
>
> Does PTree now need to know about Encoding?

yes. In my view Encoding is nothing but a type-safe version of the 'unsigned char *' that already was there before. The stuff that the PTree shouldn't care about should eventually be factored out, possibly into TypeInfo. One of the issues we should address next...

Regards,

Stefan |
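What "a type-safe version of the 'unsigned char *'" could mean in practice is sketched below. This is an assumption-laden illustration, not the actual PTree::Encoding: the storage choice (std::vector), the member names, and the 'Node' accessors are all hypothetical, and nothing here reflects the real encoding format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: owns its bytes instead of exposing 'unsigned char *'.
class Encoding
{
public:
  Encoding() = default;
  // Explicit, so an arbitrary string is never silently treated as an encoding.
  explicit Encoding(const std::string &raw) : bytes_(raw.begin(), raw.end()) {}

  bool empty() const { return bytes_.empty(); }
  std::size_t size() const { return bytes_.size(); }
  // Bounds-checked access; throws std::out_of_range on a bad index.
  unsigned char at(std::size_t i) const { return bytes_.at(i); }

  friend bool operator==(const Encoding &a, const Encoding &b)
  {
    return a.bytes_ == b.bytes_;
  }

private:
  std::vector<unsigned char> bytes_;
};

// A node holds an Encoding by value rather than a raw pointer.
class Node
{
public:
  void set_encoded_name(const Encoding &e) { name_ = e; }
  const Encoding &encoded_name() const { return name_; }
private:
  Encoding name_;
};

void encoding_example()
{
  Node n;
  assert(n.encoded_name().empty());
  n.set_encoded_name(Encoding("Foo"));
  assert(n.encoded_name().size() == 3);
  assert(n.encoded_name() == Encoding("Foo"));
  // n.set_encoded_name("Foo"); // would not compile: the constructor is explicit
}
```

In this arrangement the node depends only on Encoding as a value type; anything that interprets the bytes could live elsewhere, in line with the plan to factor that part out.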