From: Jeff D. <da...@da...> - 2000-07-18 21:25:40
In message <147...@da...>, Arno Hollosi writes:
>Some Windows PHP's don't have preg_* functions.
>You can do without them in most places, but there are some where you
>absolutely need them.

Not that I doubt you, but, out of curiosity: where?

>Instead of tokenizing $line, you directly substitute the HTML into $line.
>So, step 1 $line is changed to
>"<strong>Bold and ''bold italics''</strong>"
>Step 2 does nothing and step three executes without nesting (no tokens
>in $line):
>"<strong>Bold and <i>bold italics</i></strong>"
>
>Voila :o)

Okay, I get it now. The one drawback I see offhand is that it's
possible for (invalid?) wiki markup to generate invalid HTML.
E.g.: "''__'' ''__''" becomes "<i><b></i> <i></b></i>".
Perhaps we can live with that?

>Problem solved. Only use tokens where they are absolutely necessary.
>I don't see the need to tokenize emphasis markup or things like
>'%%%' and '^-{4,}'

Yes, you could tokenize the <br> and <hr> or not. Since the tokenizing
mechanism is already in place (and must remain so for the links, at
least), it really makes no difference in readability or complexity, and
a negligible difference in run time.

My thinking was that by tokenizing anything containing HTML markup, the
HTML is protected from being mangled by subsequent transforms. As long
as each transform individually produces complete (and correct) HTML
elements, the proper nesting of the final HTML output is guaranteed.

This helps to minimize the sensitivity to the ordering of the
transforms. I view this as somewhat important since it will make it
simpler to write (well-behaved) transforms in (as yet unimagined)
future extension modules.

I suppose we could eliminate the recursable logic, while keeping the
tokenization, by applying each of the currently recursed transformations
twice:

1. Transform "''"s
2. Transform "'''"s
3. Transform "__"s
4. Transform "''"s again
5. Transform "'''"s again

This, I think, handles everything that your method does (while
eliminating the possibility of invalid HTML output).

Jeff
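
P.S. For the archives, here is a rough sketch of the five-step, apply-twice scheme listed above. It is written in Python rather than PHP purely for illustration, and it is not the PhpWiki code. The names (render, PASSES, CONTENT), the NUL-delimited token format, and the rule that a transform only matches a span with no raw '' or __ left inside it are all assumptions made for this example. Each transform stores a complete HTML element in a token, so later transforms can wrap the token but can never cut through it.

```python
import re

# Assumed rule: an element's content is any run of characters that holds
# no raw '' or __ delimiter. Tokens from earlier steps are allowed inside.
CONTENT = r"((?:(?!''|__).)+?)"

# (pattern, tag) pairs; the patterns and tag choices are illustrative only.
ITALIC = (re.compile(r"(?<!')''(?!')" + CONTENT + r"''(?!')"), "i")
STRONG = (re.compile(r"(?<!')'''(?!')" + CONTENT + r"'''(?!')"), "strong")
BOLD = (re.compile(r"__" + CONTENT + r"__"), "b")

# The five steps from the list: '' then ''' then __, then '' and ''' again.
PASSES = [ITALIC, STRONG, BOLD, ITALIC, STRONG]


def render(line):
    """Transform one line of wiki markup (assumed to contain no NUL bytes)."""
    tokens = []  # tokens[n] holds the complete HTML element for token n

    def apply(pattern, tag, text):
        def tokenize(match):
            # Store the whole element; hand back an opaque placeholder.
            tokens.append(f"<{tag}>{match.group(1)}</{tag}>")
            return f"\x00{len(tokens) - 1}\x00"
        return pattern.sub(tokenize, text)

    for pattern, tag in PASSES:
        line = apply(pattern, tag, line)

    # Expand placeholders until none remain (tokens may contain tokens).
    while "\x00" in line:
        line = re.sub(r"\x00(\d+)\x00", lambda m: tokens[int(m.group(1))], line)
    return line


print(render("''it __b__ x''"))  # <i>it <b>b</b> x</i>
print(render("''__'' ''__''"))   # <i><b><i> </i></b></i>
```

The first example only works because of step 4: on the first pass the '' transform refuses to match while a raw __ is still inside, and it succeeds once the __ span has become a token. The second example is the problem markup from earlier in this message. The result is probably not what the writer of the markup intended, but it is properly nested. The sketch does not escape HTML in the input and only handles the nesting depth that two passes allow.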