Re: [Phpwiki-talk] transform.php

>Had anybody already thought about coding transform.php completely
>in object oriented style ?

Here's a bunch of random thoughts I've been having regarding the
transform code:

Jeffs_hacks-branch has an OO transform.php.  Nobody really liked
it completely (not even myself).

The "new" transform code in 1.3.x is not really very different from
the "old" code.  This is both good and bad.

I think that the transform code could be improved.  Turning it into
a bona fide parser, as Thomas suggests, would really help in several areas:

1. It would help guarantee correctness of generated HTML.

2. It would make extensions to the markup such as Reini's <code></code>
   blocks cleaner to implement.  (I'll refrain from commenting on whether
   <code> blocks are a good idea.)

3. I've started thinking about how to colorize (or otherwise mark) the
   transformed text to highlight diffs.  It's not easy to fit a colorizing
   scheme into the current transform code --- all the ways I've thought of
   to do this would surely not receive the Arno seal of code simplicity.

   If done correctly, a two-stage transformation --- parsing, then output ---
   would allow this to be done in a much cleaner way.

I think that any such new parser should probably operate in two steps.  First
it should parse the "in-line" markup elements (bold, italic, links) --- then
the "block-level" elements (paragraphs, lists, tables, <code> blocks).
Perhaps this can all be handled in one parse step; but the distinction between
the two types of mark-up needs to be made pretty clear for the sake of
correct HTML generation, among other things.
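
A rough sketch of the parse-then-output split, in Python purely for
illustration (this is not the PhpWiki code, and the names parse_blocks and
render_blocks are made up here).  It only handles plain paragraphs and '*'
list items, and skips inline markup entirely; the point is just that once
the output stage works from parsed nodes, every tag it opens gets closed.

```python
import html

def parse_blocks(lines):
    """Stage 1: classify each line as a ('li', text) or ('p', text) node."""
    blocks = []
    for line in lines:
        if line.startswith("*"):
            blocks.append(("li", line[1:].strip()))
        elif line.strip():
            blocks.append(("p", line.strip()))
    return blocks

def render_blocks(blocks):
    """Stage 2: emit HTML, wrapping each run of list items in <ul>...</ul>."""
    out = []
    in_list = False
    for kind, text in blocks:
        if kind == "li" and not in_list:
            out.append("<ul>")
            in_list = True
        elif kind != "li" and in_list:
            out.append("</ul>")
            in_list = False
        out.append(f"<{kind}>{html.escape(text)}</{kind}>")
    if in_list:
        out.append("</ul>")  # close a list that runs to the end of the page
    return "\n".join(out)
```

Because the renderer never sees raw markup, a different second stage (TeX,
say) could consume the same node list.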

(I'm thinking perhaps that the inline markup should be handled by adding
the ability to mark regions of text as having specific "flavors".  Flavors
would include bold, italic, link (I think), as well as things like 'deleted',
'added', 'modified-add', 'modified-del'.)
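
One way the "flavors" idea might look, again as a Python sketch under my own
assumptions: the Run class, the FLAVOR_TAGS table, and the choice of HTML
tags are all hypothetical, and only four of the flavors are mapped.  Each run
of text carries a set of flavors, and the renderer wraps each run
separately, so the tags always nest properly even when bold and 'added'
regions overlap.

```python
import html
from dataclasses import dataclass

# Hypothetical flavor -> (open tag, close tag) table; not PhpWiki's.
FLAVOR_TAGS = {
    "bold": ("<b>", "</b>"),
    "italic": ("<i>", "</i>"),
    "added": ("<ins>", "</ins>"),
    "deleted": ("<del>", "</del>"),
}

@dataclass(frozen=True)
class Run:
    """A stretch of text plus the set of flavors that apply to all of it."""
    text: str
    flavors: frozenset = frozenset()

def render(runs):
    """Wrap each run in its own tags, opened and closed in mirrored order."""
    out = []
    for run in runs:
        names = sorted(run.flavors)  # fixed order keeps the output stable
        opening = "".join(FLAVOR_TAGS[n][0] for n in names)
        closing = "".join(FLAVOR_TAGS[n][1] for n in reversed(names))
        out.append(opening + html.escape(run.text) + closing)
    return "".join(out)
```

A diff highlighter would then only need to add 'added' or 'deleted' to the
flavor sets of the affected runs, without touching the output code.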

(This is all brain-storming, don't take it too seriously.)

----

I'm not sure that the ability to generate other mark-up (TeX or whatever)
is of great importance, but turning transform into a proper parser would
certainly make that easier as well.

----

To change topics slightly, a personal peeve I have with the current markup
is this thing about having to put entire paragraphs (& list items, etc.)
on one line.  Having been raised on troff and then TeX, those really-long
lines just drive me batty.

Looking at the textarea in the edit page on my browser, it's impossible to
tell whether there's a real \n or not between lines.  I often find myself
manually deleting all spaces from the ends of lines to make sure there is
no \n in there.

To confuse matters more, for plain paragraphs the requirement that the
paragraph be on one line is silently waived, but it is currently still
enforced for list items (& tables).

I think it would be a good idea to make it so that all lines which do
not begin with some sort of block type mark-up (e.g. a '*', '#', ';', '|',
or space) are interpreted as continuations of the preceding line.
(The only reason I see for not making this change is that it will break
existing pages.) 

I.e.:

A sentence.
Another.
* Item.
More item.

Should be interpreted as

<p>A sentence. Another.</p>
<ul>
<li>Item. More item.
</ul>

instead of

<p>A sentence. Another.</p>
<ul><li>Item.</ul>
<p>More item.</p>

(I don't see any reason why italicization and boldization shouldn't be able
to span these continuation lines as well.)  E.g.:

;:''Here are some followup comments.
I really don't like this at all. --Jeff''

should work.
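
The continuation rule could be done as a pre-pass before any other
transformation.  A minimal sketch in Python (for illustration only; the name
join_continuations is made up, and treating a blank line as the end of a
block is my assumption, not something stated above):

```python
# Characters that start a new block, per the proposal above.
BLOCK_START = ("*", "#", ";", "|", " ")

def join_continuations(text):
    """Merge each line that lacks block markup into the line before it."""
    logical = []
    for line in text.split("\n"):
        if not line.strip():
            logical.append("")  # assumption: a blank line ends the block
        elif logical and logical[-1] and not line.startswith(BLOCK_START):
            # Continuation: glue onto the previous logical line.
            logical[-1] = logical[-1].rstrip() + " " + line.strip()
        else:
            logical.append(line)
    return logical
```

Fed the four-line example above, this yields the two logical lines
"A sentence. Another." and "* Item. More item.", and the existing
line-at-a-time code could then run on the result unchanged.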

-------
Comments?

Jeff