From: Stefan S. <se...@sy...> - 2004-09-18 04:20:15
Grzegorz Jakacki wrote:

>>yeah, I get your point, but I don't agree about the usefulness of
>>a split there. I don't think parser is the right place to hook up
>>user code. As I said, Visitors to introspect the generated ptree
>>seems to me a much more elegant way.
>
>
> In general I agree. However, how should parser know if given identifier is
> a template (it has to know to resolve 'a < 1 > ( 0 )' ) ? I don't want to
> put whole symbol table in parser, to prevent e.g. parser depending on
> Class (which would introduce cyclic dependency between, say, Class and
> Ptree).

The symbol table I have in mind doesn't depend on 'Class' at all. It's
something the parser adds declarations (types, variables, consts) to
as it encounters declarators. Then, while it parses code, it can look
up the symbol table to find out whether a symbol is known, and if so
whether it refers to a type.

As you mentioned, special precautions have to be taken because method
implementations given inside the class body could reference symbols
that are not yet defined. I believe this issue can be dealt with by
doing a first pass over the class body that ignores all function
implementations and only collects member declarations, and then a
second pass (still *inside* the same class body parse) that parses the
function implementations.

That's about it!

This symbol table could be something as simple as a hierarchical map
from encodings to ptrees (declarations). This map can either be thrown
away once the parser has finished, or it can be linked into the ptree
(i.e. into those ptree elements that correspond to scopes, such as
'class spec', 'brace', 'for statement', etc.) to simplify an AST API
that can be defined on top.

>>>>, or you
>>>>have to regenerate it in a second pass (aren't we trying to get rid of
>>>>the second pass ?).
>
>
> What do you mean by "second pass" exactly? What is the first pass then? My

Right now the first pass is whatever happens inside Parser::rProgram().
The second pass is ClassWalker::Translate(), which constructs the symbol
lookup tables ('Environments').

> impression was that the first pass is parsing, second pass is anything
> that goes after that. If so, then I kind of don't understand if you are
> aiming and minimizing or maximizing second pass.

You are right. The problem is precisely that the second pass is a mix
of things that (IMO) don't belong together. As I said above, I
acknowledge the parsing stage can't be fully linear / sequential, but
it can't simply be a matter of passing twice over the code (once over
the code and once over a preliminary ptree).

I suggest we drop the terms 'first pass' and 'second pass' and instead
talk about 'parsing', which results in a *correct* and fully typed
ptree, and any number of ptree transformations that belong entirely in
user (client) code.

Again, I do believe that the parser can be considered a ptree factory,
and that it can work without any need for extensibility. It creates a
parse tree, which may contain some special markup elements
corresponding to user-defined tokens such as 'metaclass'.
Extensibility is then achieved by letting users provide Visitors that
run over this parse tree and hook up these markup elements with user
code (such as metaclasses).

>>>It does not matter when they are generated. My point is that dependence of
>>>parser on one particular encoding used by one particular client is
>>>improper.
>>
>>As I said, I don't see Encoding as 'one particular' representation. It's
>>OpenC++'s internal representation. clients can certainly hook up their
>>own with it.
>
>
> The whole point of my effort is to make different parts of OpenC++
> reusable in separation. Encoding is a class with operations specific to
> its semantics, which is representing a type. This is not a generic data,
> this is data specific to one of the parser clients.

I don't agree. 'Type' is a concept that is part of the parse tree
layer; there's nothing client specific about it.
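
As an illustration of the Visitor-based extensibility argued for above,
here is a minimal sketch. It assumes a drastically simplified parse
tree: the names Node, Atom, UserKeyword, List, Visitor and
KeywordCollector are hypothetical and are not the actual OpenC++ Ptree
classes. The point is only that the tree types know nothing about the
client, while the client reacts to markup elements such as 'metaclass'.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified parse-tree types (not the OpenC++ Ptree classes).
struct Atom;
struct UserKeyword;
struct List;

// Clients derive from this; the parser itself never depends on it.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Atom&) {}
    virtual void visit(const UserKeyword&) {}
    virtual void visit(const List& list);  // default: recurse into children
};

struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor& v) const = 0;
};

// Ordinary token in the tree.
struct Atom : Node {
    std::string text;
    explicit Atom(std::string t) : text(std::move(t)) {}
    void accept(Visitor& v) const override { v.visit(*this); }
};

// Markup element emitted for a user-defined token such as 'metaclass'.
struct UserKeyword : Node {
    std::string keyword;
    explicit UserKeyword(std::string k) : keyword(std::move(k)) {}
    void accept(Visitor& v) const override { v.visit(*this); }
};

// Interior node owning its children.
struct List : Node {
    std::vector<std::unique_ptr<Node>> children;
    void accept(Visitor& v) const override { v.visit(*this); }
};

inline void Visitor::visit(const List& list) {
    for (const auto& child : list.children) {
        assert(child != nullptr);  // this sketch does not allow empty slots
        child->accept(*this);
    }
}

// Example client: records every user keyword found anywhere in the tree.
struct KeywordCollector : Visitor {
    using Visitor::visit;  // keep the Atom and List overloads visible
    std::vector<std::string> found;
    void visit(const UserKeyword& k) override { found.push_back(k.keyword); }
};
```

A client would build or receive the tree, call accept() on the root
with its own Visitor, and attach whatever user code it likes to the
collected markup elements; none of that requires changing the parser.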
> I am thinking of the
> parser reusable also outside of the context of OpenC++ and Synopsis.

Fine. Can you come up with an example where the Encoding as you see it
would be client specific? Maybe we are just talking about different
things here.

> What is "ptree factory"? I view a parser as a black-box that gets:
>
> * a generator of token (Lex interface),
> * a facility to report errors (ErrorLog interface),
>
> and outputs:
>
> * a Ptree (as a return value of a member function),
> * errors (as a sequence of calls to member functions of ErrorLog object).

Fine with me.

> This is how it works today. In such form it is usable for external
> clients. I don't understand what you mean by "not 'public' for clients".
> Could you explain?

All Parser methods except Parser::rProgram() are private. All a user
does is call Parser::rProgram() to get back a parse tree. That's just
what you said above (let's ignore the error handling for now).

My impression was that you wanted to somehow 'open up' this interface
to make the Parser class extensible, for example by changing the
behavior of individual 'ParseSomething()' methods (even if this just
means calling into user code at appropriate places). That's what I
objected to, because

* it breaks encapsulation (i.e. destabilizes the API)
* it's very complex (i.e. easy to use wrongly)
* it's not necessary

Again, maybe there is a fundamental misunderstanding, so maybe a
concrete example (use case etc.) would help.

Regards,
		Stefan
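
To make the symbol table idea from earlier in this message concrete,
here is a minimal sketch of a hierarchical map with parent-scope
fallback. It is an assumption-laden illustration, not OpenC++ code: the
names Kind, Declaration and Scope are hypothetical, plain std::string
stands in for an encoded name, and a small struct stands in for the
ptree of a declaration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical classification of what a name was declared as.
enum class Kind { Type, Template, Variable, Constant };

// Hypothetical stand-in for the ptree of a declaration.
struct Declaration {
    Kind kind;
};

// One scope of a hierarchical symbol table; lookups fall back to the parent.
class Scope {
public:
    explicit Scope(const Scope* parent = nullptr) : parent_(parent) {}

    // Called by the parser whenever it encounters a declarator.
    void declare(const std::string& encoded_name, Declaration decl) {
        assert(!encoded_name.empty());
        symbols_[encoded_name] = decl;
    }

    // Returns the innermost visible declaration, or nullptr if unknown.
    const Declaration* lookup(const std::string& encoded_name) const {
        for (const Scope* s = this; s != nullptr; s = s->parent_) {
            auto it = s->symbols_.find(encoded_name);
            if (it != s->symbols_.end()) return &it->second;
        }
        return nullptr;
    }

    // True if the name is known and refers to a type.
    bool names_type(const std::string& encoded_name) const {
        const Declaration* d = lookup(encoded_name);
        return d != nullptr && d->kind == Kind::Type;
    }

    // True if the name is known and refers to a template, which is what
    // the parser needs to know to resolve input like 'a < 1 > ( 0 )'.
    bool names_template(const std::string& encoded_name) const {
        const Declaration* d = lookup(encoded_name);
        return d != nullptr && d->kind == Kind::Template;
    }

private:
    const Scope* parent_;
    std::map<std::string, Declaration> symbols_;
};
```

In this sketch the parser would open a new Scope for each class body,
brace block or for statement, passing the enclosing scope as parent.
The two passes over a class body then amount to calling declare() for
every member first and only afterwards parsing the function bodies
against the completed scope.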