From: Jeff D. <da...@ma...> - 2001-03-05 17:59:35
>Had anybody already thought about coding transform.php completely
>in object oriented style ?

Here's a bunch of random thoughts I've been having regarding the
transform code:

Jeffs_hacks-branch has an OO transform.php. Nobody really liked it
completely (not even myself).

The "new" transform code in 1.3.x is not really very different from
the "old" code. This is both good and bad.

I think that the transform code could be improved. Turning it into a
bona fide parser, as Thomas suggests, would really help in several
areas:

1. It would help guarantee correctness of generated HTML.

2. It would make extensions to the markup such as Reini's
   <code></code> blocks cleaner to implement. (I'll refrain from
   commenting on whether <code> blocks are a good idea.)

3. I've started thinking about how to colorize (or otherwise mark)
   the transformed text to highlight diffs. It's not easy to fit a
   colorizing scheme into the current transform code --- all the ways
   I've thought of to do this would surely not receive the Arno seal
   of code simplicity. If done correctly, a two-stage transformation
   --- parsing, then output --- would allow this to be done in a much
   cleaner way.

I think that any such new parser should probably operate in two steps.
First it should parse the "in-line" markup elements (bold, italic,
links) --- then the "block-level" elements (paragraphs, lists, tables,
<code> blocks). Perhaps this can all be handled in one parse step; but
the distinction between the two types of mark-up needs to be made
pretty clear for the sake of correct HTML generation, among other
things.

(I'm thinking perhaps that the inline markup should be handled by
adding the ability to mark regions of text as having specific
"flavors". Flavors would include bold, italic, link (I think), as well
as things like 'deleted', 'added', 'modified-add', 'modified-del'.)

(This is all brain-storming, don't take it too seriously.)
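The "flavors" idea above can be illustrated with a small sketch. This is not PhpWiki code: it is written in Python for brevity, and the names `Span`, `parse_inline`, `render_inline`, the `''`/`__` toggle markers, and the CSS class names are all assumptions made for the example. It only shows how separating parsing from output keeps the generated tags well nested and lets a diff flavor such as 'added' ride along with bold or italic.

```python
import re
from dataclasses import dataclass
from html import escape


@dataclass
class Span:
    """A run of text plus the set of flavors that apply to it."""
    text: str
    flavors: frozenset


# Hypothetical toggle markers; a real wiki markup has more cases than this.
MARKERS = {"''": "italic", "__": "bold"}
TOKEN_RE = re.compile(r"(''|__)")

# Hypothetical flavor-to-HTML mapping used by the output stage.
TAGS = {
    "bold": ("<b>", "</b>"),
    "italic": ("<i>", "</i>"),
    "added": ('<span class="added">', "</span>"),
    "deleted": ('<span class="deleted">', "</span>"),
}


def parse_inline(text, base_flavors=frozenset()):
    """Stage 1: split text into spans, toggling a flavor at each marker."""
    spans = []
    active = set(base_flavors)
    for token in TOKEN_RE.split(text):
        if token in MARKERS:
            active ^= {MARKERS[token]}  # toggle the flavor on or off
        elif token:
            spans.append(Span(token, frozenset(active)))
    return spans


def render_inline(spans):
    """Stage 2: emit HTML; tags open and close inside each span."""
    out = []
    for span in spans:
        names = sorted(span.flavors)
        opening = "".join(TAGS[n][0] for n in names)
        closing = "".join(TAGS[n][1] for n in reversed(names))
        out.append(opening + escape(span.text) + closing)
    return "".join(out)


plain = render_inline(parse_inline("plain ''it'' __bo__"))
added = render_inline(parse_inline("new __text__", frozenset({"added"})))
```

Because every span carries its full flavor set, an unclosed marker in the source cannot leak an unclosed tag into the output; a block-level stage would then wrap the rendered spans in paragraph or list tags.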
----

I'm not sure that the ability to generate other mark-up (TeX or
whatever) is of great importance, but turning transform into a proper
parser would certainly make that easier as well.

----

To change topics slightly, a personal peeve I have with the current
markup is this thing about having to put entire paragraphs (& list
items, etc...) on one line. Having been raised on troff and then TeX,
those really-long lines just drive me batty. Looking at the textarea
in the edit page on my browser, it's impossible to tell whether
there's a real \n or not between lines. I often find myself manually
deleting all spaces from the ends of lines to make sure there is no \n
in there.

To confuse matters more, for plain paragraphs the requirement that the
paragraph be on one line is silently waived --- currently, it is still
enforced for list items (& tables).

I think it would be a good idea to make it so that all lines which do
not begin with some sort of block type mark-up (e.g. a '*', '#', ';',
'|', or space) are interpreted as continuations of the preceding line.
(The only reason I see for not making this change is that it will
break existing pages.)

I.e.

    A sentence.
    Another.
    * Item.
    More item.

should be interpreted as

    <p>A sentence. Another.</p>
    <ul>
    <li>Item. More item.
    </ul>

instead of

    <p>A sentence. Another.</p>
    <ul><li>Item.</ul>
    <p>More item.</p>

(I don't see any reason why italics and bold shouldn't be able to span
these continuation lines as well.) E.g.:

    ;:''Here are some followup comments.
    I really don't like this at all.
    --Jeff''

should work.

-------

Comments?

Jeff
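For reference, the continuation-line rule proposed above can be sketched in a few lines. This is an illustration in Python, not PhpWiki code: `join_continuations` and `BLOCK_START` are hypothetical names, the marker list is just the one given as examples in the post, and treating a blank line as ending the current block is an added assumption.

```python
# Characters that start a new block, taken from the examples in the post.
BLOCK_START = ("*", "#", ";", "|", " ")


def join_continuations(text):
    """Merge lines that lack block markup into the preceding logical line."""
    logical = []
    for raw in text.split("\n"):
        line = raw.rstrip()  # trailing spaces never matter
        if not line:
            logical.append("")  # assumption: a blank line ends the block
        elif line.startswith(BLOCK_START) or not logical or logical[-1] == "":
            logical.append(line)  # starts a new block
        else:
            logical[-1] += " " + line  # continuation of the previous line
    return logical


example = join_continuations("A sentence.\nAnother.\n* Item.\nMore item.")
```

Running this as a pre-pass would hand the rest of the transform code one logical line per paragraph or list item, so inline markup such as `''...''` spanning several physical lines needs no special handling later on.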