Re: [Spyce-users] Re: RE: spyce parser error

Hi Igor,

>I do not really see a problem. Let's say you are going through python
>code from top to bottom. When you find special begin block tag you
>start indenting each line of code with n tabs. When you find a starting
>triple quote you suspend indenting until closing triple quote is
>found.

Not so simple... It actually simplifies your life to convert a
multi-line string into a regular string. It's harder to do indentation
with multi-line strings, because you don't want to insert spaces into
the string data. In order to convert a multi-line string into a regular
string, you'll need to parse the text. I agree that it's doable, but the
Python tokenizer module does it for me already in two lines!
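
To make the point concrete, here is a rough sketch (not the actual Spyce
code) of how the standard tokenize module can turn multi-line strings
into ordinary one-line literals, so that every line of the result can be
re-indented safely. The function name flatten_multiline_strings is made
up for this example, and f-strings are simply left untouched.

```python
import ast
import io
import tokenize


def flatten_multiline_strings(source):
    """Rewrite string literals that span several lines as one-line literals."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        tok_type, tok_text = tok.type, tok.string
        # A STRING token that starts and ends on different rows is a
        # multi-line (triple-quoted) string.
        if tok_type == tokenize.STRING and tok.start[0] != tok.end[0]:
            try:
                # Evaluate the literal and re-emit it via repr(), which
                # escapes embedded newlines as \n on a single line.
                tok_text = repr(ast.literal_eval(tok_text))
            except (ValueError, SyntaxError):
                pass  # e.g. an f-string: leave it as it was (sketch only)
        result.append((tok_type, tok_text))
    # With 2-tuples, untokenize() regenerates valid code; spacing may
    # differ from the original, but the meaning is preserved.
    return tokenize.untokenize(result)
```

After this pass, no string data crosses a line boundary, so prefixing
each line with extra indentation cannot change any string's contents.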

>>your chosen indentation delimiter could appear somewhere in the code
>Generally it is up to a programmer what can appear in his code and what
>can not :)

Yes, but now you're placing restrictions on the programmer. He can't
just copy/paste existing code. All I'm saying is that it seems
arbitrary.

>> I don't like the choice of delimiter: @{ ...
>> Also, the braces are rather universal, accepted, and intuitive.
>
>Braces look natural if you are coming from C (Perl, PHP) background. If
>you have programmed mostly in Pascal, Visual Basic or Python braces do
>not look as appealing. What kind of delimiter to use to separate blocks
>is not so important. You can even make it a configurable option.
>Personally, I would prefer to type #begin and #end each time.

I actually do come from a Pascal background, and I'm a fan of the
begin-end style. I guess that we can support begin and end in addition
to the {-based indentation. I don't expect that to be too
much of a problem. I would go with a simple 'begin', not a '#begin'.

>>If you don't support braces, you're going to break existing code.
>Not a problem once again, you can allow to use braces as well.

Well, no... If you want to support braces then you need to parse the
code. Braces are common in Python, so it's unacceptable to process them
without context. We would need the Python tokenizer for this or at least
some complicated regular expression-based approach which might take just
as long. That would invalidate the gains that you are trying to attain.
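
As an illustration of what "context" means here, the sketch below uses
the standard tokenize module to list only the braces that are real
operator tokens, skipping any that sit inside strings or comments. The
helper name brace_tokens is invented for this example. Note that this
is only the first step: dictionary braces are operator tokens too, so
telling them apart from block braces needs further logic not shown here.

```python
import io
import tokenize


def brace_tokens(source):
    """Return (row, col, brace) for each brace that is an operator token."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Braces inside string literals or comments arrive as part of
        # STRING or COMMENT tokens, so they never match this test.
        if tok.type == tokenize.OP and tok.string in ("{", "}"):
            found.append((tok.start[0], tok.start[1], tok.string))
    return found
```

For a line such as d = {"k": "}"}  # { a plain text search sees two
opening and two closing braces, while the tokenizer reports only the
outer pair.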

>>You shouldn't design the language syntax based on an implementation
>>limitation in the regular expression library.
>It is not about regular expressions, it is about NOT passing each page
>through python lexer and tokenizer. So that instead of dealing with all
>the complexity of python syntax, you would be able to deal only with a
>couple of block tags and triple quotes. If (what is still questionable
>at this point) it can make Spyce 10 times faster, I think it will be
>well worth the efforts.

First, we will still need the tokenizer, as mentioned above, for the
braces, unless you want to break existing code. Second, it will NOT make
Spyce 10 times faster. It will, at most, make Spyce compilation faster.
It will not affect the processing time. Moreover, compilation does not
need to occur on each request. You can cache files either in memory or
on disk. Furthermore, if compilation speed is your problem then I have a
better solution for you. Let's add an optional Python module, written
in C, that (if available) generates the Python code using a proper and
much faster C-based implementation of the parser. That's the way to go,
if compilation speed is what you are ultimately looking for. The
Python-based parser will give us portability, and the C-based parser
will give us speed, if it is available.
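
Roughly, the two ideas could fit together as in the sketch below. This
is an illustration under assumptions, not Spyce's actual code: the
module name _spyce_cparser, the function generate_python and the
function get_compiled are all hypothetical, and the pure-Python
fallback is a trivial placeholder standing in for the real parser.

```python
import os

try:
    # Hypothetical C accelerator; the module name is an assumption.
    from _spyce_cparser import generate_python
except ImportError:
    def generate_python(page_text):
        """Placeholder for the portable pure-Python parser."""
        return "output = " + repr(page_text)

# In-memory cache: path -> (modification time, compiled code object)
_cache = {}


def get_compiled(path):
    """Return compiled code for a page, recompiling only when the file changes."""
    mtime = os.path.getmtime(path)
    cached = _cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # cache hit: no parsing or compilation at all
    with open(path) as f:
        code = compile(generate_python(f.read()), path, "exec")
    _cache[path] = (mtime, code)
    return code
```

With a cache like this, the parser runs once per change to a page
rather than once per request, whichever implementation is in use.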

All the best,
Rimon.