From: Steve W. <sw...@wc...> - 2000-07-28 03:50:12
|
Jeff, I'm adding things to PhpWikiBrainstorm and I'm wondering.. is it easy to store the diffs instead of the whole pages? And would it be hard to iterate through the diffs and reconstitute version 5 of a page that's up to version 7? 10? 30? I know we want to store all versions of a page for safety reasons, and I'm wrangling with what the user interface should look like.

sw

...............................ooo0000ooo.................................
Hear FM quality freeform radio through the Internet: http://wcsb.org/
home page: www.wcsb.org/~swain
|
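Reconstituting a later version from stored diffs is essentially a replay loop over the edit scripts. A minimal sketch in modern PHP, assuming a hypothetical line-based edit-script format (this is not a format PhpWiki actually used):

```php
<?php
// Hypothetical edit-script format: each saved version stores an array of ops
// against the previous version's lines:
//   ['copy', $n]          -- copy the next $n lines unchanged
//   ['skip', $n]          -- drop the next $n lines (text deleted in this version)
//   ['add',  [$lines...]] -- insert new lines
function apply_edit_script(array $oldLines, array $script): array {
    $new = [];
    $pos = 0;
    foreach ($script as [$op, $arg]) {
        if ($op === 'copy') {
            $new = array_merge($new, array_slice($oldLines, $pos, $arg));
            $pos += $arg;
        } elseif ($op === 'skip') {
            $pos += $arg;
        } elseif ($op === 'add') {
            $new = array_merge($new, $arg);
        }
    }
    return $new;
}

// Reconstitute version $target by replaying the scripts for versions 2..$target
// on top of the full text kept for version 1.
function reconstitute(array $v1Lines, array $scripts, int $target): array {
    $lines = $v1Lines;
    for ($v = 2; $v <= $target; $v++) {
        $lines = apply_edit_script($lines, $scripts[$v]);
    }
    return $lines;
}
```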
|
From: Steve W. <sw...@wc...> - 2000-07-27 03:46:40
|
On Wed, 19 Jul 2000, Jeff Dairiki wrote:

> On a related note: what about trying to phase out the page references
> altogether? As far as I can see (except for the case of inlined
> images) [1] with the appropriate entry in the page references is
> identical in function to [1|http://www.foo.bar/wazoo.html].

Good point. Again, references were something we inherited from classic Wiki. Arno made a good case to drop them when we started out on 1.1, but I was stubborn because some people use and like them (that is, people who emailed me, not my imaginary friends ;-)

I am intrigued, however, by a change Ari made in the NBTSC PhpWiki where references are gone; they are instead "footnotes" in the classic sense: http://www.nbtsc.org/wiki/MarkupTest I haven't read the code for this yet. Whether it has value or not, I don't know, but it looks cool at least :-)

Another reason I had was that references are the way to embed images... so in the end the compromise we made was to jam the references in one column and get a D- in database design. I guess I was also enamored with backwards compatibility.

> If you're going to invent new syntax, how about something that doesn't
> create any more special characters. Maybe something like
>
> [ INLINE | alt text | http://foo.com/img.png ]
>
> Yeah, it's ugly. There's probably something better.
>
> A couple of things to think about:
>
> To be lynx friendly, it would be nice to be able to specify ALT text
> for an inlined image.
>
> One of the reasons I personally don't use the wiki inlined images very
> much is that there's currently no way to specify the image size.
> With my browser, that means I don't get to see any of the page until
> the image loads. I hate that.
>
> One could create syntax for specifying image size:
>
> [ INLINE 20x60 | alt text | http://foo.com/img.png ]
>
> Or to get fancy, one might be able to have PhpWiki fetch the image
> once (in a while) and cache the size.

Ugh. I should start a WikiPage on ReinventingHtml. How about no embedded images? :-)

sw

...............................ooo0000ooo.................................
Hear FM quality freeform radio through the Internet: http://wcsb.org/
home page: www.wcsb.org/~swain
|
|
From: Steve W. <sw...@wc...> - 2000-07-27 03:35:37
|
Just catching up on the week's posts... On Tue, 18 Jul 2000, Jeff Dairiki wrote:

> 1. Transform "''"s
> 2. Transform "'''"s
> 3. Transform "__"s
> 4. Transform "''"s again
> 5. Transform "'''"s again
>
> This, I think, handles everything that your method does (while eliminating
> the possibility of invalid HTML output.)

Hmm. Don't forget ''''' :-)

sw

...............................ooo0000ooo.................................
Hear FM quality freeform radio through the Internet: http://wcsb.org/
home page: www.wcsb.org/~swain
|
|
From: Steve W. <sw...@wc...> - 2000-07-26 04:10:09
|
Michael wrote me with problems using the admin/ tools... he was just getting a server error. It looks like some weirdness with his server configuration, and he ultimately did away with wiki_auth.php3 and installed his own .htaccess file.

sw

> > >2. If I try to start /admin/index.php3 I get an "error 500: internal
> > >server error".
> >
> > No solution.
>
> Hmm. It will try to do basic HTTP authentication, prompting you for a
> login/password. I wonder if your version of Apache has auth enabled?
>
> You could try commenting out the block that tries to authenticate you
> and see if you can access the forms... that would be a big help if we
> knew to look for this.

Now it works.

1. My ISP has only .php3, not .php, included for running PHP. I have altered the file names etc.
2. If I insert a blank line before the first line in admin/index.php3 I don't get the error 500 (strange???)
3. I don't get a prompt for the password, so I think my ISP doesn't have auth enabled.
4. I have wiki_auth.php3 commented out and a .htaccess included in the admin directory - no sloppy systems administration ;)
5. The directories in wiki_config.php3 don't work for admin/php3. I have included a $AdminArchiveDataBase with a "../" before the path from $ArchiveDataBase.

Thank you for the hints. It was a big help!

best regards
Michael d. Eschner

...............................ooo0000ooo.................................
Hear FM quality freeform radio through the Internet: http://wcsb.org/
home page: www.wcsb.org/~swain
|
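For reference, the .htaccess route Michael describes is plain HTTP Basic authentication handled by Apache rather than by wiki_auth.php3. A minimal, hypothetical example (paths, realm name, and user are placeholders, not files shipped with PhpWiki; the server must allow AuthConfig overrides for this to take effect):

```apache
# admin/.htaccess -- illustrative only, not part of PhpWiki.
# The password file is created once with: htpasswd -c /home/someuser/.htpasswd admin
AuthType Basic
AuthName "PhpWiki admin tools"
AuthUserFile /home/someuser/.htpasswd
Require valid-user
```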
|
From: Jeff D. <da...@da...> - 2000-07-21 18:45:51
|
I've cleaned up the new wiki_transform code somewhat. I'm still
tokenizing the __bold__ and ''italic''s, but I think I've found a
cleaner way around the "recursable" problem.
I've also added support for pagenames in $PATH_INFO
(configurable in wiki_config with the WIKI_PAGENAME_IN_PATHINFO
constant.)
I've created a new branch ("jeffs_hacks-branch") in the CVS repository
which contains both of these hacks. (To get it, you need
to add the '-rjeffs_hacks-branch' option to the 'cvs checkout' command.)
Comments are hereby solicited.
Jeff
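For readers unfamiliar with the feature: with pagenames in PATH_INFO, a URL like index.php/FrontPage carries the page name after the script name instead of in a query string. A rough modern-PHP illustration of the idea (the constant name is taken from Jeff's mail above, but the real wiki_config handling and fallbacks differ):

```php
<?php
// Illustrative sketch only, not PhpWiki's actual code.
define('WIKI_PAGENAME_IN_PATHINFO', true);

function requested_pagename(): string {
    if (WIKI_PAGENAME_IN_PATHINFO && !empty($_SERVER['PATH_INFO'])) {
        // "/index.php/SomePage" => PATH_INFO is "/SomePage"
        return urldecode(ltrim($_SERVER['PATH_INFO'], '/'));
    }
    // Fall back to a conventional query parameter.
    return isset($_GET['pagename']) ? $_GET['pagename'] : 'FrontPage';
}
```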
|
|
From: Steve W. <sw...@wc...> - 2000-07-20 01:10:36
|
I'll be off the net most of Thursday-Monday, while I'm in Cleveland for a wedding. Just so you know if I'm not at all responsive :-)

sw

...............................ooo0000ooo.................................
Hear FM quality freeform radio through the Internet: http://wcsb.org/
home page: www.wcsb.org/~swain
|
|
From: Jeff D. <da...@da...> - 2000-07-19 22:19:45
|
On a related note: what about trying to phase out the page references altogether? As far as I can see (except for the case of inlined images) [1] with the appropriate entry in the page references is identical in function to [1|http://www.foo.bar/wazoo.html].

Now... In message <Pin...@bo...>, Steve Wainstead writes:

>But unnamed, bracketed image links do:
>
>[http://phpwiki.sourceforge.net/phpwiki/png.png]
>
>Does anyone see any obvious flaws to this?

Nothing terrible. But it sort of breaks my intuitive feeling that [http://foo/bar.png] should be equivalent to http://foo/bar.png in the same way that [WikiWord] is equivalent to WikiWord.

> Would you rather introduce a
> new construct, which might be more intuitive, like:
>
>{http://phpwiki.sourceforge.net/phpwiki/png.png}

If you're going to invent new syntax, how about something that doesn't create any more special characters. Maybe something like

[ INLINE | alt text | http://foo.com/img.png ]

Yeah, it's ugly. There's probably something better.

A couple of things to think about:

To be lynx friendly, it would be nice to be able to specify ALT text for an inlined image.

One of the reasons I personally don't use the wiki inlined images very much is that there's currently no way to specify the image size. With my browser, that means I don't get to see any of the page until the image loads. I hate that.

One could create syntax for specifying image size:

[ INLINE 20x60 | alt text | http://foo.com/img.png ]

Or to get fancy, one might be able to have PhpWiki fetch the image once (in a while) and cache the size.

In other news: the CVS problem appears to be fixed. You can now check out (or otherwise use) named revisions. I've mostly finished hacking in PATH_INFO support (switchable in wiki_config.) Soon (probably tomorrow) I'll check it into a branch ("jeffs_pathinfo_hacks-branch") in the CVS so y'all can inspect it.

Jeff
|
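The [ INLINE ... ] syntax above was only a proposal in this thread. As an illustration of what such a transform could look like, here is a hedged sketch; the regexp, function name, and output markup are mine, not PhpWiki's eventual implementation:

```php
<?php
// Sketch: turn "[ INLINE 20x60 | alt text | http://foo.com/img.png ]" (size
// optional) into an <img> tag with width/height and ALT text.
function transform_inline_images(string $line): string {
    return preg_replace_callback(
        '/\[\s*INLINE(?:\s+(\d+)x(\d+))?\s*\|\s*([^|\]]*?)\s*\|\s*(\S+?)\s*\]/',
        function ($m) {
            $size = $m[1] !== '' ? " width=\"$m[1]\" height=\"$m[2]\"" : '';
            $alt  = htmlspecialchars($m[3]);
            $src  = htmlspecialchars($m[4]);
            return "<img src=\"$src\" alt=\"$alt\"$size />";
        },
        $line
    );
}

// transform_inline_images('[ INLINE 20x60 | a logo | http://foo.com/img.png ]')
// => <img src="http://foo.com/img.png" alt="a logo" width="20" height="60" />
```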
|
From: Steve W. <sw...@wc...> - 2000-07-19 21:12:22
|
I helped a PhpWiki user/operator with the hairy process of embedding an image file (which requires the use of "references"). This reminded me of a small change I was thinking of some time ago and I want everyone's feedback before I put it in place:

Raw URLs do not get embedded:
http://phpwiki.sourceforge.net/phpwiki/png.png

Named URLs do not get embedded:
[Check out this great art|http://phpwiki.sourceforge.net/phpwiki/png.png]

But unnamed, bracketed image links do:
[http://phpwiki.sourceforge.net/phpwiki/png.png]

Does anyone see any obvious flaws to this? Would you rather introduce a new construct, which might be more intuitive, like:

{http://phpwiki.sourceforge.net/phpwiki/png.png}

? The latter approach means that the use of [] is consistent.

sw

...............................ooo0000ooo.................................
Hear FM quality freeform radio through the Internet: http://wcsb.org/
home page: www.wcsb.org/~swain
|
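As a sketch of the rule Steve proposes (an unnamed, bracketed link to an image URL becomes an inline image; raw and named URLs are left for the normal link transform), hedged as illustrative rather than the code that went into PhpWiki:

```php
<?php
// Sketch only: embed "[http://.../picture.png]" as an image, ignore everything else.
function transform_bracketed_images(string $line): string {
    return preg_replace(
        '/\[\s*(https?:\/\/\S+\.(?:png|gif|jpe?g))\s*\]/i',
        '<img src="$1" alt="" />',
        $line
    );
}
```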
|
From: Jeff D. <da...@da...> - 2000-07-18 23:48:46
|
In message <147...@da...>, Arno Hollosi writes:

>The one place I can think of right now is the use of preg_match_all()
>in wiki_transform. Also, eregs don't have non-greedy matches. Can't
>remember which one, but I recall that there is at least one match
>which needs non-greediness.

Of course, "need" is always relative. :-)

> > Perhaps we can live with [invalid HTML]?
>
>I can, because the above case will not appear very often, will it?

Not except as a result of typos and brainos. If the wiki markup is esoteric or just wrong, I don't mind if it comes out looking like garbage (in fact, it should). However, broken HTML makes me nervous. Who knows what it will come out looking like on whatever random browser I happen to be using? (I'll admit the world is unlikely to end.)

>Btw, as your FIXME states: the recursive logic does not work as
>advertised: "__''word''__" renders ok, but "''__word__''" is not
>rendered - instead __ is inserted verbatim. Just looking at the code it
>becomes clear where the "fault" lies: you are always processing $line.
>Real recursion means processing the created tokens. (I guess you are
>aware of that already.) Oddly enough, replacing __ with ''' makes it
>work in both cases, but that is due to the regexp and not
>because of the recursion.

You're right. Actually, my original intent was to handle this via regexps. My intent (not that it made it into the code) was that none of the "''", "'''", or "__" quoted expressions are recognized unless they contain no (untokenized) occurrence of either "''" or "__". I.e. the regexp for the __Bold__ expressions should have been:

"__[^_'](?:[^_']+|_(?!_)|'(?!'))+(?<!_)__"

There! Haha. Make sense? No really, you're right. It's broken.

> > I suppose we could eliminate the recursable logic, while keeping the
> > tokenization by applying each of the currently recursed transformations
> > twice.
>
>Apart from doing ''' before '' (otherwise '''word''' becomes '<i>word</i>'),
>it does not immediately solve the problem. You need to transform the
>tokens and not $line as you do right now.

Of course. Okay, so never mind...

>So my conclusion is: recursion adds complexity (while having its benefits).
>Let's start with HTML-in-place right now, and once some time has
>passed and the dust settled, we can do the recursion stuff - we will
>then have a better understanding of the issue.
>
>[Or you write a functioning and beautiful recursion right away ;o)]

Let me search for a nicer solution for a little while more. (A week or two.) As I see it, there's no big rush for this, as the present wiki_transform works just fine.

Jeff
|
|
From: Arno H. <aho...@in...> - 2000-07-18 22:35:20
|
> >Line-by-line processing is inherited from 1.0, which is how most Wikis do
> >things.
>
> Do we want to get away from line-by-line processing?

I don't. Keep the line-by-line approach. As you said: errors don't spill over into the rest of the page. That makes the wiki more fun to experiment with.

/Arno
|
|
From: Steve W. <sw...@wc...> - 2000-07-18 22:28:25
|
On Tue, 18 Jul 2000, Jeff Dairiki wrote:

> Do we want to get away from line-by-line processing?
> It can be done.
> Now's the time to do it.
> I think it might be faster that way besides.
>
> However, I kind of like the line-by-line processing. It keeps goofs in one
> line from hosing the whole page.

True... in a browser it's easy to see where you goofed by using View Source; not quite as practical here though.

Would search be impacted? Probably not... we can still iterate through lines by exploding() the text... Would storage be impacted? Again I think not... these are separated for a reason... I don't know.

I think it's not a necessary change right now, and it creates even more work because certain markup has to be changed too. (Unless we're only talking about <b>, <i> and friends; then it's a minor point. To do all of them (<hr>, <pre>, etc.) is too major a change, especially for 1.2.)

just thinking out loud again because I don't want to work on work,
sw

................................ooo0000ooo.................................
Hear FM quality freeform radio through the Internet: http://wcsb.org/
home page: www.wcsb.org/~swain
|
|
From: J C L. <cl...@ka...> - 2000-07-18 22:24:46
|
On Tue, 18 Jul 2000 15:10:28 -0700 Jeff Dairiki <da...@da...> wrote:

> However, I kind of like the line-by-line processing. It keeps
> goofs in one line from hosing the whole page.

It is significantly easier to do proper table support with line-by-line processing. Further, whole-file processing allows several easy optimisations.

--
J C Lawrence    Home: cl...@ka...
---------(*)    Other: co...@ka...
http://www.kanga.nu/~claw/    Keys etc: finger cl...@ka...
--=| A man is as sane as he is dangerous to his environment |=--
|
|
From: Arno H. <aho...@in...> - 2000-07-18 22:21:03
|
> >Some Windows PHP's don't have preg_* functions.
> >You can do without them in most places, but there are some where you
> >absolutely need them.
>
> Not that I doubt you, but, out of curiosity: where?

The one place I can think of right now is the use of preg_match_all() in wiki_transform. Also, eregs don't have non-greedy matches. Can't remember which one, but I recall that there is at least one match which needs non-greediness.

> The one drawback I see offhand is that it's possible for (invalid ?) wiki
> markup to generate invalid HTML.
>
> Eg.: "''__'' ''__''" becomes "<i><b></i> <i></b></i>".

This is indeed invalid HTML. But the other way around (with tokens) the inner '' will have no effect at all (effectively: <i><i></i><i>) if __ is processed before '', or it becomes "<i>__</i> <i>__</i>" if __ is processed after ''. So the actual behaviour is not immediately apparent from the markup but depends on the implementation. Not much difference.

> Perhaps we can live with [invalid HTML]?

I can, because the above case will not appear very often, will it?

> My thinking was that by tokenizing anything containing HTML markup,
> the HTML is protected from being mangled by subsequent transforms.
> As long as each transform individually produces complete (and correct)
> HTML entities, the proper nesting of the final HTML output is guaranteed.

A valid point.

> This helps to minimize the sensitivity on the ordering of
> the transforms. I view this as somewhat important since it will
> make the writing of (well-behaved) transforms in (as yet unimagined)
> future extension modules simpler.

Ordering will always play a role. Though I have to agree that hiding HTML reduces one conflict point in the future for those "yet unimagined" extension modules.

Btw, as your FIXME states: the recursive logic does not work as advertised: "__''word''__" renders ok, but "''__word__''" is not rendered - instead __ is inserted verbatim. Just looking at the code it becomes clear where the "fault" lies: you are always processing $line. Real recursion means processing the created tokens. (I guess you are aware of that already.) Oddly enough, replacing __ with ''' makes it work in both cases, but that is due to the regexp and not because of the recursion.

> I suppose we could eliminate the recursable logic, while keeping the
> tokenization by applying each of the currently recursed transformations
> twice.
>
> 1. Transform "''"s
> 2. Transform "'''"s
> 3. Transform "__"s
> 4. Transform "''"s again
> 5. Transform "'''"s again

Apart from doing ''' before '' (otherwise '''word''' becomes '<i>word</i>'), it does not immediately solve the problem. You need to transform the tokens and not $line as you do right now.

So my conclusion is: recursion adds complexity (while having its benefits). Let's start with HTML-in-place right now, and once some time has passed and the dust settled, we can do the recursion stuff - we will then have a better understanding of the issue.

[Or you write a functioning and beautiful recursion right away ;o)]

/Arno
|
|
From: Jeff D. <da...@da...> - 2000-07-18 22:11:20
|
In message <Pin...@bo...>, Steve Wainstead writes:

>The minor drawback is that it's line-by-line processing, and if you want
>to have successive lines in italics in preformatted text every line must
>start and end with:
>
> ''here is my preformatted text in italics''
>
>Line-by-line processing is inherited from 1.0, which is how most Wikis do
>things.

Do we want to get away from line-by-line processing? It can be done. Now's the time to do it. I think it might be faster that way besides.

However, I kind of like the line-by-line processing. It keeps goofs in one line from hosing the whole page.
|
|
From: Steve W. <sw...@wc...> - 2000-07-18 22:05:48
|
On Tue, 18 Jul 2000, Jeff Dairiki wrote:

> >You can do without them in most places, but there are some where you
> >absolutely need them.
>
> Not that I doubt you, but, out of curiosity: where?

Oh, bugger... where was that? Arno's right though, there are places where preg_* are the only solution.

> The one drawback I see offhand is that it's possible for (invalid ?) wiki
> markup to generate invalid HTML.
>
> Eg.: "''__'' ''__''" becomes "<i><b></i> <i></b></i>".
>
> Perhaps we can live with that?

At some point you have to decide the user is sane and has some intelligence... we can concoct pathological situations all day and develop workarounds, but I don't think that would make for a fun project. :-)

> Yes you could tokenize the <br> and <hr> or not --- since the tokenizing
> mechanism is already in place (and must remain so for the links, at least)
> it really makes no difference in readability or complexity, and negligible
> difference in run time.

Probably true...

> My thinking was that by tokenizing anything containing HTML markup,
> the HTML is protected from being mangled by subsequent transforms.
> As long as each transform individually produces complete (and correct)
> HTML entities, the proper nesting of the final HTML output is guaranteed.
>
> This helps to minimize the sensitivity on the ordering of
> the transforms. I view this as somewhat important since it will
> make the writing of (well-behaved) transforms in (as yet unimagined)
> future extension modules simpler.

I agree; in a way this is a variation on the argument for storing all links in a separate table and storing the pages in a semi-state. What will the long term benefits be?

In this case you can eliminate line-by-line processing entirely, but that would also require changes to the markup language (for plain text, you'd have to have some substitute for the <pre> tag instead of indenting with spaces like we do now; lists would be a nightmare; and we'd reinvent HTML, something I've repeatedly told users I have no intention of doing.) (Implementing XHTML might be worthwhile though. Mind you, I'm not suggesting this for 1.2 or even 1.4 (2.0?) but just speculating.)

> I suppose we could eliminate the recursable logic, while keeping the
> tokenization by applying each of the currently recursed transformations
> twice.
>
> 1. Transform "''"s
> 2. Transform "'''"s
> 3. Transform "__"s
> 4. Transform "''"s again
> 5. Transform "'''"s again
>
> This, I think, handles everything that your method does (while eliminating
> the possibility of invalid HTML output.)

Not having read the code yet, I'm not sure what the fuss is about... I did solve the whole issue of order-of-transformations in wiki_transform.php3 ages ago.

Also, being performance-minded is a good thing, but don't let it corner you into writing 10x the amount of code, or seriously complex code, just to gain small benefits. Wikis do not scale. Wikis cannot scale. They can grow a lot wider, but there is a low limit on how many people can edit a given topic before lost updates create confusion and frustration. Do not write bubble sorts; do not write loops that call external programs; but don't be afraid to use Perl regular expressions or make deep copies of objects, because we have the room to do it.

sw

...............................ooo0000ooo.................................
Hear FM quality freeform radio through the Internet: http://wcsb.org/
home page: www.wcsb.org/~swain
|
|
From: Steve W. <sw...@wc...> - 2000-07-18 21:39:08
|
On Tue, 18 Jul 2000, Arno Hollosi wrote:

> Sure, the new architecture is then a mixture of tokens and
> HTML-in-place - compared to your tokens-only approach.
> But it's much simpler - less complexity. And I don't think it's
> too ugly from a structural point of view either.

The minor drawback is that it's line-by-line processing, and if you want to have successive lines in italics in preformatted text, every line must start and end with:

''here is my preformatted text in italics''

Line-by-line processing is inherited from 1.0, which is how most Wikis do things.

just a minor point,
sw

...............................ooo0000ooo.................................
Hear FM quality freeform radio through the Internet: http://wcsb.org/
home page: www.wcsb.org/~swain
|
|
From: Jeff D. <da...@da...> - 2000-07-18 21:25:40
|
In message <147...@da...>, Arno Hollosi writes:
>Some Windows PHP's don't have preg_* functions.
>You can do without them in most places, but there are some where you
>absolutely need them.
Not that I doubt you, but, out of curiosity: where?
>Instead of tokenizing $line, you directly substitute the HTML into $line.
>So, step 1 $line is changed to
>"<strong>Bold and ''bold italics''</strong>"
>Step 2 does nothing and step three executes without nesting (no tokens
>in $line):
>"<strong>Bold and <i>bold italics</i></strong>"
>
>Voila :o)
Okay, I get it now.
The one drawback I see offhand is that it's possible for (invalid ?) wiki
markup
to generate invalid HTML.
Eg.: "''__'' ''__''" becomes "<i><b></i> <i></b></i>".
Perhaps we can live with that?
>Problem solved. Only use tokens where they are absolutely necessary.
>I don't see the need to tokenize emphasis markup or things like
>'%%%' and '^-{4,}'
Yes you could tokenize the <br> and <hr> or not --- since the tokenizing
mechanism is already in place (and must remain so for the links, at least)
it really makes no difference in readability or complexity, and negligible
difference in run time.
My thinking was that by tokenizing anything containing HTML markup,
the HTML is protected from being mangled by subsequent transforms.
As long as each transform individually produces complete (and correct)
HTML entities, the proper nesting of the final HTML output is guaranteed.
This helps to minimize the sensitivity on the ordering of
the transforms. I view this as somewhat important since it will
make the writing of (well-behaved) transforms in (as yet unimagined)
future extension modules simpler.
I suppose we could eliminate the recursable logic, while keeping the
tokenization by applying each of the currently recursed transformations
twice.
1. Transform "''"s
2. Transform "'''"s
3. Transform "__"s
4. Transform "''"s again
5. Transform "'''"s again
This, I think, handles everything that your method does (while eliminating
the possibility of invalid HTML output.)
Jeff
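A compact way to picture the double-pass idea with direct in-place substitution is sketched below. Note that the ''' pattern is ordered ahead of '' here, per Arno's observation elsewhere in the thread, so the order is not exactly the five steps listed above; the tag choices and patterns are illustrative, not PhpWiki's actual code.

```php
<?php
// Illustrative only: direct in-place emphasis substitution, applied twice.
function transform_emphasis(string $line): string {
    $rules = [
        ["/'''(.+?)'''/", '<strong>$1</strong>'],  // ''' must run before ''
        ["/''(.+?)''/",   '<em>$1</em>'],
        ["/__(.+?)__/",   '<strong>$1</strong>'],
    ];
    // The second pass mirrors steps 4-5 above; with direct substitution the
    // common nestings already resolve on the first pass, so the repeat is
    // cheap insurance rather than a necessity.
    for ($pass = 0; $pass < 2; $pass++) {
        foreach ($rules as [$pattern, $replacement]) {
            $line = preg_replace($pattern, $replacement, $line);
        }
    }
    return $line;
}

// "__''bold italics''__"  =>  "<strong><em>bold italics</em></strong>"
// "''__word__''"          =>  "<em><strong>word</strong></em>"
```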
|
|
From: Arno H. <aho...@in...> - 2000-07-18 20:46:00
|
> (Speaking of which: it would probably be possible to avoid the use
> of the Perl regexps altogether, in favor of PHP's ereg_'s. Is this
> worth considering? How many PHP's are out there without PCRE support?)
Some Windows PHP's don't have preg_* functions.
You can do without them in most places, but there are some where you
absolutely need them. So if there's no way around it, you can use them
throughout.
> As a footnote though: I'm pretty sure that in most cases one transform
> with a complex regexp is faster than two transforms with simple regexps.
Point taken.
> The groups stuff is there to deal with the recursable stuff --- you haven't
> yet convinced me that the recursable stuff is unnecessary.
Ok, trying to convince you :o)
We need tokenization at least for links and stuff. That's for sure.
But do we need it for emphasis markup and the like?
Right now, recursive transforms are only used for '',''',__
Please correct me if I'm wrong.
Suppose the following line "__Bold and ''bold italics''__"
Transforms are registered in this order
1. __
2. '''
3. ''
Instead of tokenizing $line, you directly substitute the HTML into $line.
So, step 1 $line is changed to
"<strong>Bold and ''bold italics''</strong>"
Step 2 does nothing and step three executes without nesting (no tokens
in $line):
"<strong>Bold and <i>bold italics</i></strong>"
Voila :o)
If there's something like "Look at __WikiLink__" it becomes:
"Look at __$token$__"
"Look at <strong>$token$</strong>"
"Look at <strong><a href="...">WikiLink</a></strong>"
Problem solved. Only use tokens where they are absolutely necessary.
I don't see the need to tokenize emphasis markup or things like
'%%%' and '^-{4,}'
By ensuring that transforms are executed in the right order, the
freshly inserted HTML tags won't interfere with later transformations.
E.g. it's important to do links and the '&<>' transform before
doing the rest.
Did I convince you?
Sure, the new architecture is then a mixture of tokens and
HTML-in-place - compared to your tokens-only approach.
But it's much simpler - less complexity. And I don't think it's
too ugly from a structural point of view either.
/Arno
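The "Look at __WikiLink__" walk-through above is easy to mock up. Below is a minimal sketch of the token idea: the link transform emits its HTML but hides it behind an opaque placeholder so the later emphasis regexps cannot reach inside it, and the placeholders are swapped back at the end. Class, key, and URL formats are made up for the example and do not match PhpWiki's WikiTokenizer.

```php
<?php
class LineTokenizer {
    private $tokens = [];

    public function hide(string $html): string {
        $key = "\x01" . count($this->tokens) . "\x01"; // unlikely to occur in wiki text
        $this->tokens[$key] = $html;
        return $key;
    }

    public function restore(string $line): string {
        return strtr($line, $this->tokens);
    }
}

$tok  = new LineTokenizer();
$line = "Look at __WikiLink__";

// 1. Link transform: emit the <a> tag, but hide it behind a token.
//    (Deliberately simplified WikiWord pattern.)
$line = preg_replace_callback('/([A-Z][a-z]+){2,}/', function ($m) use ($tok) {
    return $tok->hide('<a href="index.php?pagename=' . $m[0] . '">' . $m[0] . '</a>');
}, $line);

// 2. Emphasis transforms now operate on "Look at __<token>__" and cannot
//    damage the hidden HTML.
$line = preg_replace('/__(.+?)__/', '<strong>$1</strong>', $line);

// 3. Put the hidden HTML back.
echo $tok->restore($line), "\n";
// => Look at <strong><a href="index.php?pagename=WikiLink">WikiLink</a></strong>
```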
|
|
From: Jeff D. <da...@da...> - 2000-07-18 20:15:36
|
I'm trying to start a new branch in the CVS in which to hack in PATH_INFO support. Every time I try to execute a CVS command referencing a tagged version, I get the error message:

cvs [server aborted]: cannot write /cvsroot/phpwiki/CVSROOT/val-tags: Permission denied

For example, all of the following commands fail with the same message:

cvs diff -rrelease-1_1_7
cvs co -rrelease-1_1_7
cvs rtag -rjeffs_pathinfo_hacks-root -b jeffs_pathinfo_hacks-branch phpwiki

(The last command is the one I really want to do.) Note that I had no problem creating the tag jeffs_pathinfo_hacks-root.

Any ideas?

Jeff
|
|
From: Jeff D. <da...@da...> - 2000-07-18 20:10:43
|
In message <147...@da...>, Arno Hollosi writes:

>I had a look at your new wiki_transform.
>Overall impressive work.

Thanks!

>- the class interface (functions and variables) looks ok.
>  Some functions will have to be added in order to make it useable for
>  extracting links when using it from wiki_savepage.
>  (e.g. some way to access the array in WikiTokenizer())

It's already there (mostly). $page_renderer->wikilinks[] gets populated with all the WikiLinks found in the page. All that's needed is a bit more elegant API to get at the list.

>- the regexps are too complex in some places (which makes the
>  overall rendering slower than necessary):
>  Take for example: /__ ( (?: [^_] (?:[^_]|_[^_])* )? ) __/x
>  which renders __strong__ emphasis. Apparently this regexp ensures
>  two things: no "_" after "__" and no "__" inside the emphasis.
>  How about: /__(.*?)__/ instead? ".*?" is non-greedy and thus
>  "__" cannot appear inside the emphasis. Also, why forbid "_" after
>  "__"? In your case "___text__" is rendered as "_<strong>text</strong>",
>  in my case it's rendered as "<strong>_text</strong>". What's the
>  difference?

Okay, okay. So I'm paranoid. Yes, the regexps should be cleaned up. My guess is that (at least in most cases) the speed differences are negligible --- I readily admit the regexps could be more readable.

(Speaking of which: it would probably be possible to avoid the use of the Perl regexps altogether, in favor of PHP's ereg_'s. Is this worth considering? How many PHP's are out there without PCRE support?)

>- Also, I don't think that all that "(?" extended regex syntax
>  is really necessary. It may be in some places, where it's important
>  to have a proper \0 match-value. But in all other places it adds
>  to complexity without any benefits (and makes the regexp slower, no?)

Okay, okay already! :-)

>- Ok, I don't like the groups. But groupTransforms() is plain ugly.
>  I understand that this stems from your goal to combine as many
>  transforms into a single $group as possible. I don't understand
>  the benefit of this approach - the only difference is that the
>  inner loops of render_line() are executed more often than the
>  outer for-loop. So what?

The point was to do as much of the looping logic as possible (the grouping) only once, rather than once per line. It does make a speed difference. It is butt-ugly. I don't like it either.

>- Maybe you are trying too hard with the concept of tokenization of a
>  line. E.g. is it really necessary to tokenize emphases like "__"
>  and "'''"/"''"? Why not generate the HTML directly (<strong><b><i>)?
>  All you have to do is make sure that later transforms don't mess
>  with the inserted HTML tags. By ordering the transforms (as you plan
>  to do anyway) this can be achieved easily. This would also solve
>  your problem of recursive transforms. Take the easy route first.
>  If we ever come across a markup that requires recursive stuff,
>  then we can add recursive transforms. Right now I don't see the
>  need for them.

The tokenization is not really necessary in all cases, but it is needed (I think) for the various links (or else the regexps get horrendous). If you accept that the tokenization code is needed, then it makes little difference (in complexity or time) whether <b>'s and <i>'s are tokenized or not. Tokenizing (I think) is safer --- less chance of some not-quite-completely-well-conceived custom WikiTransform mucking things up.

As for recursiveness: I don't really see how direct substitution of the HTML gets around the root of the problem. How do you deal with __''Bold-italic'' and bold__ (or ''__Bold-italic__ and italic'')? (Or should we just punt on that?)

>So my suggestions:
>
>- get rid of groups - implement priority sort order instead

Yes, we need some sort of priority sorting anyhow, so that the WikiTransforms don't have to be registered() in a specific order. The groups stuff is there to deal with the recursable stuff --- you haven't yet convinced me that the recursable stuff is unnecessary.

>- get rid of recursive markup - right now it's only needed for
>  emphasis. Insert the HTML tags instead.

Again, I don't yet see how this helps?

>- final transforms can be dealt with by one if-clause like
>  if($t->final) break;

Yes, that's the way I did it before I added the code to deal with the recursive stuff. (But then ''__Bold-italic__'' was broken.)

>- make your regexps simpler. And if one regexp becomes too
>  complex, split it into two transforms.

Okay, already! As a footnote though: I'm pretty sure that in most cases one transform with a complex regexp is faster than two transforms with simple regexps.

Okay, so I guess my main counter-response is either:

a) Convince me that the recursable stuff really is not needed, or
b) Suggest a cleaner way to deal with the recursable stuff.

Jeff
|
|
From: Arno H. <aho...@in...> - 2000-07-18 18:56:09
|
Jeff,

I had a look at your new wiki_transform. Overall impressive work. In some places it seems a little bit awkward; actually, I had trouble understanding how it works at first.

I'm not sure I like the split of the transform objects into groups. The distinction final/normal/recursive seems necessary, but I'm sure it can be solved in a different way. See below (we can do away with recursive tokenization, and the distinction final/normal can be dealt with by one easy if-clause in render_line() instead of having groups and two different loops).

Random thoughts:

- the class interface (functions and variables) looks ok.
  Some functions will have to be added in order to make it useable for
  extracting links when using it from wiki_savepage.
  (e.g. some way to access the array in WikiTokenizer())

- the regexps are too complex in some places (which makes the
  overall rendering slower than necessary):
  Take for example: /__ ( (?: [^_] (?:[^_]|_[^_])* )? ) __/x
  which renders __strong__ emphasis. Apparently this regexp ensures
  two things: no "_" after "__" and no "__" inside the emphasis.
  How about: /__(.*?)__/ instead? ".*?" is non-greedy and thus
  "__" cannot appear inside the emphasis. Also, why forbid "_" after
  "__"? In your case "___text__" is rendered as "_<strong>text</strong>",
  in my case it's rendered as "<strong>_text</strong>". What's the
  difference?

- Also, I don't think that all that "(?" extended regex syntax
  is really necessary. It may be in some places, where it's important
  to have a proper \0 match-value. But in all other places it adds
  to complexity without any benefits (and makes the regexp slower, no?)

- Ok, I don't like the groups. But groupTransforms() is plain ugly.
  I understand that this stems from your goal to combine as many
  transforms into a single $group as possible. I don't understand
  the benefit of this approach - the only difference is that the
  inner loops of render_line() are executed more often than the
  outer for-loop. So what?

- Maybe you are trying too hard with the concept of tokenization of a
  line. E.g. is it really necessary to tokenize emphases like "__"
  and "'''"/"''"? Why not generate the HTML directly (<strong><b><i>)?
  All you have to do is make sure that later transforms don't mess
  with the inserted HTML tags. By ordering the transforms (as you plan
  to do anyway) this can be achieved easily. This would also solve
  your problem of recursive transforms. Take the easy route first.
  If we ever come across a markup that requires recursive stuff,
  then we can add recursive transforms. Right now I don't see the
  need for them.

So my suggestions:

- get rid of groups - implement a priority sort order instead
- get rid of recursive markup - right now it's only needed for
  emphasis. Insert the HTML tags instead.
- final transforms can be dealt with by one if-clause like
  if($t->final) break;
- make your regexps simpler. And if one regexp becomes too
  complex, split it into two transforms.

Again, a very promising start. Good work.

/Arno
|
|
From: Jeff D. <da...@da...> - 2000-07-18 05:35:30
|
>The pages are stored as MIME e-mail messages, with the meta-data stored
>as parameters in the Content-Type: header.
>
>I also added the ability to make a zip including the archived versions of
>the pages. In this case you still get one file per page, formatted
>as a multipart MIME message: one part for each version of the page.

Okay, so now how to use these zip files? Here's how:

The CVS version now has a new config constant WIKI_PGSRC (in wiki_config), which controls the source for the initial page contents when index.php3 is first invoked on an empty database (i.e. no FrontPage). If WIKI_PGSRC is set to the name of a zip file, that zip file is used for the initial page contents. If WIKI_PGSRC is set to './pgsrc' then the old behavior results.

Note that the unzipping code only supports the 'store' (non-compressed) and 'deflate' compression methods --- furthermore, the 'deflate' method is only supported if PHP was compiled with zlib support.

Also, I'm somewhat unconvinced that the unzip code will work on deflated data from all zip programs. According to the zip spec, the file CRC and compressed file size can be stored either ahead of the file data or after the file data. My code only works if they are stored ahead of the file data. (I think this is fixable, but it is a bit of a pain --- one must determine the compressed data size from the compressed data stream itself.) I don't see much point in fixing it unless this is a problem for some major zipper (e.g. PKZIP.) (The unzipper should work on all uncompressed zip files.)

So far I've only tested this code with zip files from wiki_zip and from Info-Zip's zip 2.3. If y'all could test it on anything else you've got, that would be great.

Jeff
|
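The "sizes stored after the file data" case Jeff describes corresponds to bit 3 of the general-purpose flag in each zip local file header: when it is set, the CRC and sizes are zero in the header and a data descriptor follows the compressed data. A small, illustrative check in modern PHP (not code from wiki_zip):

```php
<?php
// Returns true if the first entry's CRC/sizes precede the file data,
// false if a data descriptor follows it, null if this isn't a zip file.
function zip_sizes_precede_data(string $zipfile): ?bool {
    $fp = fopen($zipfile, 'rb');
    if (!$fp) {
        return null;
    }
    $sig = fread($fp, 4);
    if ($sig !== "PK\x03\x04") {           // not a local file header
        fclose($fp);
        return null;
    }
    // version needed (2 bytes), general-purpose flags (2), method (2)
    $hdr = unpack('vversion/vflags/vmethod', fread($fp, 6));
    fclose($fp);
    return ($hdr['flags'] & 0x0008) === 0; // bit 3 set => sizes follow the data
}
```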
|
From: Jeff D. <da...@da...> - 2000-07-17 16:47:27
|
Here's a current snapshot of my thoughts on the new transform code.
This currently is in the form of a drop-in replacement for wiki_transform.
However, if I were to insert this into PhpWiki now, most of it would go
into wiki_stdlib. Some would go into new custom-feature module files.
Only a skeleton would remain in wiki_transform.
Here's some random thoughts, in order of increasing entropy:
Currently this only implements wiki_transform. However it should be
clear that class WikiRenderer can also be used as the basis for a modular
replacement to GeneratePage().
The main thing that I'm not completely happy with (and which is not yet
complete) is how the order of the WikiTransforms is specified.
(It is clear that some sort of 'order' or 'precedence' parameter is
required --- that's easy, I just haven't done it yet.) The hard
part is handling the following issues in an efficient, clean, clear way
(these issues are handled by this snapshot, but I'm not sure I'm happy
with the implementation):
o Some transforms are "final". When they are matched, they terminate the
page rendering.
o Some transforms (might) need to be applied repeatedly. Consider
constructs like "__''bold-italic''__".
Another issue is that putting the logic to handle these details into
(what is now) the inner loop (over transforms) is slow. I think I'll try
reversing the order of the loops (eg. make the loop over lines the inner loop,
and see if that helps).
Comments welcome.
Jeff
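One way to realize the "order/precedence parameter" described above is a registry sorted by a numeric priority, with the "final" flag short-circuiting further transforms. The class and method names below are made up for illustration and do not reflect the actual WikiRenderer:

```php
<?php
// Rough sketch: transforms register with a priority; render_line walks them in order.
class TransformRegistry {
    private $transforms = [];

    public function register(callable $fn, int $priority, bool $final = false): void {
        $this->transforms[] = ['fn' => $fn, 'priority' => $priority, 'final' => $final];
        usort($this->transforms, function ($a, $b) {
            return $a['priority'] <=> $b['priority'];
        });
    }

    public function renderLine(string $line): string {
        foreach ($this->transforms as $t) {
            $before = $line;
            $line = call_user_func($t['fn'], $line);
            if ($t['final'] && $line !== $before) {
                break; // a "final" transform matched: apply no further transforms
            }
        }
        return $line;
    }
}
```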
|
|
From: Jeff D. <da...@da...> - 2000-07-17 16:09:33
|
In message <147...@da...>, Arno Hollosi writes:
>
>I gave this some more thought. Here's what I've come up with.

Good summary, Arno.

>Let me state again that the wikilink table can be used
>with or without link tokenization. The benefits of this table are not
>bound to tokenization.

I agree completely. I think we should implement the link table soon. (In addition to the features we've been talking about, it will make the back-link search fast and correct.)

>Pros:
>
> > 3. Faster page rendering.
>
>Whether or not this is true: it's a moot point.

Just to add a data point: wiki_transform takes about a second on the current TestPage (on a PII/450). That is a fair amount of juice, and I can see that being an issue for some (though it isn't for me, really). I don't think the new transform code is going to be any faster.

>Jeff, I'd really like to see the class definitions of your
>transform code.

Okay! I'll send out my current working version in a separate email.

Jeff
|
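On the back-link point: the win from a separate link table is that backlinks become a single indexed query instead of a full-text scan of every page. A hedged sketch, with an assumed wikilinks(frompage, topage) table that is not PhpWiki's actual schema:

```php
<?php
// Illustrative only: list pages that link to $pagename via a link table.
function backlinks(PDO $db, string $pagename): array {
    $stmt = $db->prepare(
        'SELECT frompage FROM wikilinks WHERE topage = ? ORDER BY frompage'
    );
    $stmt->execute([$pagename]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```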
|
From: Steve W. <sw...@wc...> - 2000-07-17 14:16:04
|
Great summary, Arno. As long as we architect 1.2 with this possibility in mind, I'm happy.

sw

On Mon, 17 Jul 2000, Arno Hollosi wrote:

> I gave this some more thought. Here's what I've come up with.
> Let me state again that the wikilink table can be used
> with or without link tokenization. The benefits of this table are not
> bound to tokenization.
>
> Pros:
> * Eliminate name collisions when merging wikis -- long term benefit
> * Easy automatic link fixing when a page is renamed -- short term
>   benefit for a seldom (or not so seldom) used feature
> * pages (and their referencing links) can be deleted easily -- short
>   term benefit for a seldom (or not so seldom) used feature.
>
> Note that the last two points "Seldom vs. often used feature" depends
> on what kind of wiki you are running. In common wikis they would be
> used seldom I reckon.
>
> Page deletes *without* deleting references can be easily done without
> tokenization too.
>
> Con:
> * Complexity and if it becomes too complex bugs may cause "Bad Things".
>
> Other things mentioned:
>
> > Undefined pages can be listed separately (ListOfUndefinedPages)
>
> This can be done without tokenization as well.
> Or is there more to this and I've overlooked something essential?
>
> > > 3. Inversion of the pre-transform [is hairy]
> > >    (Eg. was the link entered as "WikiLink", "[WikiLink]",
> > >    or "[ WikiLink ]"?)
>
> This is a moot point. Say links are always displayed as [WikiLink]
> to editors afterwards. What's the drawback?
>
> > > 2. Bugs in transform code are more likely to cause
> > >    permanent changes/losses of page content.
>
> Only if the transform code becomes too complex.
>
> > > 3. Faster page rendering.
>
> Whether or not this is true: it's a moot point.
>
> To sum it up: some (small?) short term benefits plus a long term
> benefit weighed against added complexity.
>
> I vote for postponing this change until 1.4.
> Eventually it will be done, but 1.2 is too early for this.
>
> Let's concentrate on the high priority issues first:
> - making phpwiki more modular for easier extension and customization
> - refactoring the db interface (going oop?)
> - adding new navigation possibilities through use of our
>   new db schema.
>
> When this is done we can roll out 1.2.
> And then we can start the really crazy things.
>
> Jeff, I'd really like to see the class definitions of your
> transform code.
>
> /Arno
>
> P.S: I have to switch vacation with a colleague. This means that
> I'm on vacation from Thursday until end of July. Probably without
> email access, but unable to code on phpwiki for sure :(
>
> --
> secret plan:
> 1. world domination
> 2. get all cookies
> 3. eat all cookies
>
> _______________________________________________
> Phpwiki-talk mailing list
> Php...@li...
> http://lists.sourceforge.net/mailman/listinfo/phpwiki-talk

................................ooo0000ooo.................................
Hear FM quality freeform radio through the Internet: http://wcsb.org/
home page: www.wcsb.org/~swain
|