From: Jeff Dairiki <dairiki@da...> - 2000-07-18 23:48:46
In message <14708.53062.991116.230868@...>, Arno Hollosi writes:
>The one place I can think of right now is the use of preg_match_all()
>in wiki_transform. Also, eregs don't have non-greedy matches. Can't
>remember which one, but I recall that there is at least one match
>which needs non-greediness.
Of course, "need" is always relative. :-)
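[Editorial illustration, not part of the original message. A minimal sketch of the greedy versus non-greedy point, written in Python rather than PHP. The sample line and variable names are made up for illustration. It also shows why "need" is relative: a negated character class can often stand in for a non-greedy quantifier, at the cost of forbidding the delimiter character inside the match.]

```python
import re

# Hypothetical sample line with two italic spans in wiki markup.
line = "''first'' and ''second''"

# Greedy: .* runs to the last '' on the line and swallows both spans.
greedy = re.findall(r"''(.*)''", line)

# Non-greedy: .*? stops at the nearest closing ''.
lazy = re.findall(r"''(.*?)''", line)

# Workaround for engines without non-greedy quantifiers (POSIX EREs):
# exclude the delimiter character from the body. Note that this also
# rejects a lone apostrophe inside the span.
negated = re.findall(r"''([^']*)''", line)

print(greedy)   # ["first'' and ''second"]
print(lazy)     # ['first', 'second']
print(negated)  # ['first', 'second']
```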
> > Perhaps we can live with [invalid HTML]?
>I can, because the above case will not appear very often, will it?
Not except as a result of typos and brainos.
If the wiki markup is esoteric or just wrong, I don't mind if it comes
out looking like garbage (in fact, it should). However, broken HTML makes
me nervous. Who knows what it will come out looking like on whatever
random browser I happen to be using? (I'll admit the world is unlikely to
>Btw, as your FIXME states: the recursive logic does not work as
>advertised: "__''word''__" renders ok, but "''__word__''" is not
>rendered - instead __ is inserted verbatim. Just looking at the code it
>becomes clear where the "fault" lies: you are always processing $line.
>Real recursion means processing the created tokens. (I guess you are
>aware of that already) Oddly enough replacing __ with ''' makes it
>work in both cases, but that is due to the regexp and not
>because of the recursion.
Actually, my original intent was to handle this via regexps.
My intent (not that it made it into the code) was that none of the
"''", "'''", or "__" quoted expressions are recognized unless they
contain no (untokenized) occurrence of either "''" or "__".
I.e., the regexp for the __Bold__ expressions should have been:
There! Haha. Make sense?
No really, you're right. It's broken.
> > I suppose we could eliminate the recursable logic, while keeping the
> > tokenization by applying each of the currently recursed transformations
> > twice.
>Apart from doing ''' before '' (otherwise '''word''' becomes '<i>word</i>')
>it does not immediately solve the problem. You need to transform the
>tokens and not $line as you do right now.
Of course. Okay, so never mind...
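[Editorial illustration, not part of the original message. A rough sketch, in Python rather than PHP, of one way the two ideas in this exchange could fit together: a quoted expression is only recognized when its body contains no untokenized "''" or "__", and ''' is tried before ''. Matched spans are replaced by tokens, so an outer span can match on a later pass once its inner span has been tokenized. This is not the actual wiki_transform code. The token format, the rule table, the tag choices, and the names transform() and expand() are all assumptions, and HTML escaping is left out.]

```python
import re

# Hypothetical token format: \x01<index>\x01 stands in for stored HTML.
TOKEN = "\x01{}\x01"
TOKEN_RE = re.compile("\x01(\\d+)\x01")

# Body of a quoted expression: one or more characters, none of which
# starts a raw '' or __ delimiter. Tokens are allowed inside the body.
BODY = r"(?:(?!''|__).)+"

# ''' is listed before '' so that '''word''' is not read as italic.
# The HTML tags are assumptions made for this sketch.
RULES = [
    (re.compile(r"'''(" + BODY + r")'''"), "<b>{}</b>"),
    (re.compile(r"''(" + BODY + r")''"), "<i>{}</i>"),
    (re.compile(r"__(" + BODY + r")__"), "<strong>{}</strong>"),
]


def expand(text, tokens):
    """Replace tokens with their stored HTML, recursing into nested tokens."""
    return TOKEN_RE.sub(lambda m: expand(tokens[int(m.group(1))], tokens), text)


def transform(line):
    """Tokenize innermost markup first, then repeat until nothing changes."""
    tokens = []

    def make_repl(template):
        def repl(match):
            # Store the HTML (which may itself contain tokens) and
            # leave a token in the line in its place.
            tokens.append(template.format(match.group(1)))
            return TOKEN.format(len(tokens) - 1)
        return repl

    changed = True
    while changed:
        changed = False
        for pattern, template in RULES:
            line, count = pattern.subn(make_repl(template), line)
            changed = changed or count > 0
    return expand(line, tokens)


print(transform("''__word__''"))  # <i><strong>word</strong></i>
print(transform("__''word''__"))  # <strong><i>word</i></strong>
print(transform("'''word'''"))    # <b>word</b>
```

Each substitution removes a pair of raw delimiters from the line, so the loop terminates. The nesting order in the markup no longer matters, because the outer rule simply waits for a later pass.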
>So my conclusion is: recursion adds complexity (while having its benefits).
>Let's start with HTML-in-place right now, and once some time has
>passed and the dust settled, we can do the recursion stuff - we will
>then have a better understanding of the issue.
>[Or you write a functioning and beautiful recursion right away ;o)]
Let me search for a nicer solution for a little while more. (A week or two.)
As I see it, there's no big rush for this, as the present
wiki_transform works just fine.
From: Jeff Dairiki <dairiki@da...> - 2000-07-21 18:45:51
I've cleaned up the new wiki_transform code somewhat. I'm still
tokenizing the __bold__ and ''italic''s, but I think I've found a
cleaner way around the "recursable" problem.
I've also added support for pagenames in $PATH_INFO
(configurable in wiki_config with the WIKI_PAGENAME_IN_PATHINFO
I've created a new branch ("jeffs_hacks-branch") in the CVS repository
which contains both of these hacks. (To get it, you need
to add the '-rjeffs_hacks-branch' option to the 'cvs checkout' command.)
Comments are hereby solicited.