Re: [Cheetahtemplate-discuss] Fwd: stuff
From: Ian B. <ia...@co...> - 2001-09-17 00:34:24
Chuck Esterbrook <Chu...@ya...> wrote:

> >As for CheetahXP (I guess I asked for it ;-) ) I
> >disagree with the parsing approach used for several reasons:
> >
> >1) the parser attempts to do ALL the parsing itself,
> >whereas Cheetah piggy-backs onto Python's parser
> >using the 'tokenize' module. I believe piggy-backing is
> >vastly superior as it allows the full range of Python
> >expressions to be used in cheetah placeholder tags and
> >directives. Furthermore, future changes to Python syntax
> >should require no changes to the Cheetah piggy-back parser.

I had problems dealing with the interface of tokenize. The generator
interface would probably work fine (should such a thing come to exist).
Basically, since the entire file isn't necessarily tokenizable, I have
to use a less aggressive lexer. If tokenize can be less aggressive
(i.e., only find tokens when asked, not a line at a time), then I could
easily plug it in. Maybe there's already a way of using it that I
didn't consider.

> >2) It makes it harder to implement new directives. This is
> >very simple with the current implementation. Remember how
> >fast I was able to implement #raw, #stop, #slurp ... and
> >more recently #while, #try, #except, #finally, and #raise.
> >The last set took approximately half an hour of coding and
> >testing.

It's not difficult to add directives to this parser either. #break,
for instance, consists of this code:

    class Parser:
        def initDirectives(self):
            self.directiveEaters['break'] = self.eatBreak

        def eatBreak(self):
            self.addBreak()

        def addBreak(self):
            pass

    class PythonGenerator(Parser):
        def addBreak(self):
            self.addBody('break\n')

Of course, #break is trivial, but #while would only be slightly more
difficult.

> >CheetahXP requires changes to several places in
> >VelocityParser.py and PythonGenerator.py to implement a new
> >directive.

VelocityParser is a relic, and should be integrated into Parser (or
maybe vice versa).
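To illustrate the claim that #while would only be slightly more
difficult than #break, here is a minimal, self-contained sketch in the
same directiveEaters style as the #break example above. Everything
beyond the directiveEaters/addBody/eatBreak names quoted in this
message is hypothetical, and a plain string stands in for the real
lexer's rest-of-line handling:

```python
# Hypothetical sketch, not the actual CheetahXP source: a #while
# directive registered in the same directiveEaters table as #break.

class Parser:
    def __init__(self):
        self.body = []               # accumulated generated code
        self.directiveEaters = {}
        self.initDirectives()

    def initDirectives(self):
        self.directiveEaters['break'] = self.eatBreak
        self.directiveEaters['while'] = self.eatWhile

    def eatBreak(self, rest):
        self.addBreak()

    def eatWhile(self, rest):
        # rest is the raw text after '#while', e.g. ' count < 10';
        # the real parser would consume tokens instead of a string.
        self.addWhile(rest.strip())

    # Generators override these; the base parser emits nothing.
    def addBreak(self):
        pass

    def addWhile(self, condition):
        pass


class PythonGenerator(Parser):
    def addBody(self, code):
        self.body.append(code)

    def addBreak(self):
        self.addBody('break\n')

    def addWhile(self, condition):
        self.addBody('while %s:\n' % condition)


gen = PythonGenerator()
gen.directiveEaters['while'](' count < 10')
gen.directiveEaters['break']('')
print(''.join(gen.body))
```

The point is only that a new directive is one table entry plus one
eat/add method pair, with the generator subclass deciding what code to
emit.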
However, I think it's reasonable to keep a conceptual separation
between parsing and generation, and that's why PythonGenerator and
Parser are separate.

> >3) It doesn't softcode the start and end tokens for
> >placeholders and directives.

No, but I don't really see any reason to softcode those. What use case
would call for this?

> >4) It doesn't have a preprocessing stage. This is very
> >useful for providing higher-level interfaces to low-level
> >#directives in Cheetah. Case in point: #raw

I'm not sure I understand. I don't do any translation from one
directive to another in the implementation, and I can't foresee any
reason to do so. A preprocessing stage is unnecessary for the way #raw
is implemented, for instance -- since there's only one pass, when you
encounter #raw you find #end raw, call self.addConstant(foundText),
and continue. Each piece of text is looked at once and only once.

> >5) It has no way of implementing #macros

#define is like #macro, only better.

> >Ian wrote:
> > >I was trying to dig around in the parser, and there were
> > >like 5 stages, and I just couldn't deal with it.
> >...
> > > The result is a one-stage parser/compiler -- directly
> > > from lexer to Python code. It just seems much more
> > > sensible.
> >
> >There's really only three stages to the 'compiler': 1)
> >pre-process, 2) code-generation and 3) code-wrapping. The
> >"parsing" is spread throughout 1 and 2. Parser.py lines
> >214-280 do the parsing of $placeholders.
> >
> >A one-stage parser/compiler is definitely easier to
> >understand at first glance and is almost certainly faster.
> >However, it is also
> >a) significantly less flexible

Not at all. Perhaps it's more flexible, since there's no place in the
code that needs to understand the global structure of the document.
Each place in the code only looks at the next token, no further.
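The one-pass #raw handling described above can be sketched roughly
like this. The class and method names (other than addConstant, which
is quoted in this message) are hypothetical, and a plain string scan
stands in for the real lexer:

```python
# Hypothetical sketch of one-pass #raw handling: on seeing #raw, scan
# ahead for '#end raw', emit everything between as a literal constant,
# and resume parsing after it. No preprocessing pass is involved.

class RawEater:
    def __init__(self, source):
        self.source = source
        self.pos = 0          # current scan position in the template
        self.constants = []   # literal chunks collected so far

    def addConstant(self, text):
        # The real generator would append code that writes this text
        # verbatim to the template's output.
        self.constants.append(text)

    def eatRaw(self):
        # Assumes self.pos sits just after the '#raw' directive.
        end = self.source.index('#end raw', self.pos)
        self.addConstant(self.source[self.pos:end])
        self.pos = end + len('#end raw')


eater = RawEater('#raw $notAPlaceholder #end raw')
eater.pos = len('#raw')
eater.eatRaw()
print(eater.constants)
```

Each character between the two directives is visited exactly once,
which is the "looked at once and only once" property claimed above.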
> >b) incompatible with Python-parser piggy-backing (see
> >above)

Only to the degree that I'm not willing to break out expressions before
I lex them. The Python tokenizer is dumb in this respect. I just
didn't realize it until I tried to use it :)

> >c) less likely to be understandable as you start adding all
> >directives and functionality of Cheetah.

I see no reason that this is true. CheetahXP uses the traditional
top-down recursive-descent structure for a parser. This is a design
that has been around since time immemorial (or at least sometime in
the 60's). It's a good design. It's a fast design. It's a
conventional design. It's sufficiently flexible for any sane language.

> >There are considerable advantages to sticking with the
> >multi-stage approach that outweigh the initial learning
> >curve.

It's not just a learning curve, it's about conventionality. Why
reinvent the wheel? They made it round for a reason.

> >Other comments/questions about CheetahXP:
> >----------------
> >* I can't get it to work

Ummm... maybe it's still a bit too raw. CheetahXP/ has to be in
PYTHONPATH, and you have to run CheetahXP/compile in your directory
with .tmpl files. Then (if you made "sample.tmpl") you import sample,
and sample.sample is the generated Python class.

> >* How would it run from an interactive session? What's
> >the equivalent of "print Template(TD, {'var':1234})"?

Well, if you put it in sample.tmpl and run compile, then do:

    import sample
    print str(sample.sample({'var':1234}))

The Template class in CheetahXP is abstract. Each template is a class.
There should be some functions to make dealing with this easier, of
course -- it doesn't have to be any more difficult than any other
object model.

> >* I suspect the CheetahXP parser is faster, but
> >this doesn't concern me as Cheetah's parser is fast enough
> >and this is a one-off cost.

It might be, but I don't really care much either.

> >* Why 'VelocityParser.py' instead of 'CheetahParser.py'?
It's the parser I wrote for Cheetah expressions before Cheetah had that
name (or any name?).

Ian