From: Arno H. <aho...@in...> - 2000-07-18 18:56:09
|
Jeff, I had a look at your new wiki_transform. Overall impressive work. In some places it seems a little bit awkward. Actually, I had problems to understand how it works at first. I'm not sure I like the split into groups of the transfrom objects. The distinction final/normal/recursive seems necessary, but I'm sure it can be solved in a different way. See below (we can do away with recursive tokenization and the distinction final/normal can be dealt with one easy if-clause in render_line() instead of having groups and two different loops) Random thoughts: - the class interface (functions and variables) looks ok. Some functions will have to be added in order to make it useable for extracting links when using it from wiki_savepage. (e.g. some way to access the array in WikiTokenizer()) - the regexp's are too complex in some places (which makes the overall rendering slower than necessary): Take for example: /__ ( (?: [^_] (?:[^_]|_[^_])* )? ) __/x which renders __strong__ emphasis. Apparantly this regexp ensures two things: no "_" after "__" and no "__" inside the emphasis. How about: /__(.*?)__/ instead? ".*?" is non-greedy and thus "__" cannot appear inside the emphasis. Also, why forbid "_" after "__"? In your case "___text__" is rendered as "_<strong>text</strong>" in my case it's rendered as "<strong>_text</strong>". What's the difference? - Also, I don't think that all those "(?" extended regex syntax is really necessary. It may be in some places, where it's important to have a proper \0 match-value. But in all other places it adds to complexity without any benefits (and makes the regexp slower, no?) - Ok, I don't like the groups. But groupTransforms() is plain ugly. I understand that this stems from your goal to combine as many transforms into a single $group as possible. I don't understand the benefit of this approach - the only difference is that the inner loops of render_line() are executed more often than the outer for-loop. So what? - Maybe you are trying too hard with the concept of tokenization of a line. E.g. is it really necessary to tokenize emphasises like "__" and "'''"/"''"? Why not generate the HTML directly (<strong><b><i>)? All you have to do is make sure, that later transforms don't mess with the inserted HTML tags. By ordering the transforms (as you plan to do anyway) this can be achieved easily. This would also solve your problem of recursive transforms. Take the easy route first. If we ever come accross a markup that requires recursive stuff, then we can add recursive transforms. Right now I don't see the need for them. So my suggestions: - get rid of groups - implement priority sort order instead - get rid of recursive markup - right now it's only needed for emphasis. Insert the HTML tags instead. - final transfroms can be dealt with one if-clause like if($t->final) break; - make your regexps simpler. And if one regexp becomes too complex split it into two transfroms. Again, very promising start. Good work. /Arno |