From: Rui C. <rui...@ac...> - 2004-12-06 11:29:31
Hello there.

I'm running a heavily customized version of PhpWiki at http://the.taoofmac.com, and have come up with an interesting problem.

It might be from my tweaks (I forked my code from PhpWiki sometime around 1.3.7, I think), but plugins in the node contents seem to be invoked twice, i.e.:

* If I use PageTrails as part of browse.tmpl, it is only invoked once.
* If I add <?plugin PageTrails ?> to my Sandbox, the plugin code (i.e., run()) gets invoked twice, even though the output is only shown once.

This is tolerable for simple stuff like PageTrails, but a _major_ nuisance for TitleSearch and whatnot (which is what I'm trying to optimize).

I'm blaming it on template expansion (which I haven't modified myself and might have a few lingering bugs not present in current CVS), but instead of upgrading (which would mean refactoring all my current custom plugins and subpage handling) what I would _really_ like are some pointers as to how to go about debugging this, namely:

* Under which conditions plugins are expanded
* Whether dump_template() is useful for debugging that (the initial results aren't satisfactory, but I might have missed something)

I'm currently trying to wrap my neurons around printExpansion(), which is what I see getting called more often. I would have used PHP's call stack debugger if I could, but have to resort to browser output and file logging, so it's tough going (some sort of call graph for page generation would also be nice, since it is nigh on impossible to just use error_log() with all the built-in error handling mechanisms).

Thanks,

Rui Carmo
http://the.taoofmac.com
From: Reini U. <ru...@x-...> - 2004-12-06 17:15:47
Rui Carmo wrote:
> I'm running a heavily customized version of PhpWiki at http://the.taoofmac.com, and have come up with an interesting problem.

This blog theme is exactly what I wanted to have as well, but I haven't had the time yet. Very good work!

> It might be from my tweaks (I forked my code from PhpWiki sometime around 1.3.7, I think), but plugins in the node contents seem to be invoked twice, i.e.:
>
> * If I use PageTrails as part of browse.tmpl, it is only invoked once.
> * If I add <?plugin PageTrails ?> to my Sandbox, the plugin code (i.e., run()) gets invoked twice, even though the output is only shown once.

If you have a plugin in a template, it is invoked from there as well, but no normal template should be processed twice.

> This is tolerable for simple stuff like PageTrails, but a _major_ nuisance for TitleSearch and whatnot (which is what I'm trying to optimize).
>
> I'm blaming it on template expansion (which I haven't modified myself and might have a few lingering bugs not present in current CVS), but instead of upgrading (which would mean refactoring all my current custom plugins and subpage handling) what I would _really_ like are some pointers as to how to go about debugging this, namely:

I'll do the work and merge some of your stuff into our version as another theme, OK? Our WikiBlog template smells awful. An example of how to add Google ads to a sidebar would also be useful for some themes. Do you have a tarball of your changes and the Kubrick theme somewhere to download?

I was also working today on more Radio UserLand features, like RSS2 and cloud support (page-change notification via XML-RPC). The Atom format isn't there yet, but you already have that. Trackback is missing, but it is almost the same.

> * Under which conditions plugins are expanded

On every invocation, only once :)

> * Whether dump_template() is useful for debugging that (the initial results aren't satisfactory, but I might have missed something)
>
> I'm currently trying to wrap my neurons around printExpansion(), which is what I see getting called more often. I would have used PHP's call stack debugger if I could, but have to resort to browser output and file logging, so it's tough going (some sort of call graph for page generation would also be nice, since it is nigh on impossible to just use error_log() with all the built-in error handling mechanisms).

Don't you have a GUI PHP debugger?

-- 
Reini Urban
http://xarch.tu-graz.ac.at/home/rurban/
From: Rui C. <rui...@ac...> - 2004-12-06 19:31:31
On Dec 6, 2004, at 5:15 PM, Reini Urban wrote:
> This blog theme is exactly what I wanted to have as well, but I haven't had the time yet. Very good work!

Thanks. It's a bit crufty around the edges (the edit template is still broken, for instance), but it does the job for visitors.

> I'll do the work and merge some of your stuff into our version as another theme, OK? Our WikiBlog template smells awful. An example of how to add Google ads to a sidebar would also be useful for some themes. Do you have a tarball of your changes and the Kubrick theme somewhere to download?

No, there is no change tarball available. And the reason for that is that I've basically butchered the code over the last two years to remove PhpWiki features I neither like nor need (like multiple user support, subpage handling, etc.).

So it's not really just a set of changes - I've edited just about everything from the database driver to the block parser, and admittedly not in a very elegant way. I also have a faint recollection of changing the database schema, although I will probably do it again soon to store post titles in a separate field upon page creation (saves me a lot of trouble when generating the Alphabetical Index, searching, etc.) and add a node creation date that stays put and doesn't get flushed away with older revisions (which would help my Atom feed).

Not to mention a number of hard-coded regexps in the middle of the code to block spam referrers and other niceties of the open Internet...

Actually, most of the stuff you see (the Kubrick theme, the ads, etc.) is really _trivial_ to do - the real work is in the plugins and CSS, and that isn't even halfway done yet (if you look at the CSS you'll still see two different markup styles, mine and Michael's, that I haven't yet merged). Adding the base theme took me all of two hours, most of it spent moving CSS and images around, and AdSense takes all of two minutes.

Plus, it's not just PhpWiki anymore. I don't use any of the WikiBlog stuff (as I recall, it was based on UnfoldSubPages); I rely on a hardcoded blog/ prefix. There are also a couple of other things inside (although I haven't re-themed them all yet), and releasing the sources generally would mean I'd be expected to provide support and fix bugs for other people, so I don't do it wholesale - I just throw a copy of my plugins to whoever asks and make it plain that they're on their own. :)

I could try to isolate some of the bits that are my code and send you a tarball, though. _After_ I fix the plugin calls.

> I was also working today on more Radio UserLand features, like RSS2 and cloud support (page-change notification via XML-RPC). The Atom format isn't there yet, but you already have that.

Oh yes, I hacked a copy of RssWriter and added a new formatter to RecentChanges. It does, however, rely on a lot of the internal caching stuff I also added (to check If-Modified-Since headers), and almost completely breaks your coding conventions :)

> Trackback is missing, but it is almost the same.

I don't do trackback or comments - it's a free JavaScript add-on from HaloScan.com (I don't want to bother with dealing with comment spam, and I only added comments at all as an experiment).

>> * Under which conditions plugins are expanded
>
> On every invocation, only once :)

And, returning to my original question, where does that happen in the code? The only thing I haven't done so far is to sanitize the templates themselves to see if I've created some sort of loop...

>> * Whether dump_template() is useful for debugging that (the initial results aren't satisfactory, but I might have missed something)
>>
>> I'm currently trying to wrap my neurons around printExpansion(), which is what I see getting called more often. I would have used PHP's call stack debugger if I could, but have to resort to browser output and file logging, so it's tough going (some sort of call graph for page generation would also be nice, since it is nigh on impossible to just use error_log() with all the built-in error handling mechanisms).
>
> Don't you have a GUI PHP debugger?

No. I have to access the machine via SSH and I don't know of any free ones that run on Macs (or decent ones that I can run on Linux), so I would really appreciate some pointers to where plugin invocation happens in the source code and/or how to disable _all_ the extra error handling so I can simply place a few calls here and there and see the output in the system log...

Thanks,

R.
From: Reini U. <ru...@x-...> - 2004-12-06 20:02:07
Rui Carmo wrote:
> I could try to isolate some of the bits that are my code and send you a tarball, though. _After_ I fix the plugin calls.

Ah, okay. I'll probably use just your blog theme then. And some JS buttons here and there.

>>> * Under which conditions plugins are expanded
>>
>> On every invocation, only once :)
>
> And, returning to my original question, where does that happen in the code? The only thing I haven't done so far is to sanitize the templates themselves to see if I've created some sort of loop...

>> Don't you have a GUI PHP debugger?
>
> No. I have to access the machine via SSH and I don't know of any free ones that run on Macs (or decent ones that I can run on Linux), so I would really appreciate some pointers to where plugin invocation happens in the source code and/or how to disable _all_ the extra error handling so I can simply place a few calls here and there and see the output in the system log...

Ah, I forgot to write that, sorry.

The main and only place is in CachedMarkup.php:
Cached_PluginInvocation::expand()
This is where you must put your debug print statement.
expand() is called recursively through the HTML parse tree.

The expander for the Cached_PluginInvocation class is a normal printExpansion() call.

-- 
Reini Urban
http://xarch.tu-graz.ac.at/home/rurban/
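For readers who, like Rui, have no debugger and find error_log() output swallowed by custom error handling, one possible approach is a small helper that appends to its own file together with a short caller chain. This is only a sketch: debug_trace(), its parameters and the default /tmp/plugin-trace.log path are hypothetical and not part of PhpWiki. It relies only on PHP's own debug_backtrace() and on error_log() with message type 3, which appends to the named file.

```php
<?php
// Hypothetical helper, NOT part of PhpWiki: appends one line per call to a
// private log file, together with the names of the calling functions.
function debug_trace($label, $logfile = '/tmp/plugin-trace.log', $depth = 5)
{
    $frames = array();
    // Frame 0 describes the call to debug_trace() itself, so skip it and
    // collect up to $depth callers above it.
    foreach (array_slice(debug_backtrace(), 1, $depth) as $frame) {
        $frames[] = isset($frame['class'])
            ? $frame['class'] . '::' . $frame['function']
            : $frame['function'];
    }
    $chain = $frames ? implode(' <- ', $frames) : '(top level)';
    $line  = sprintf("%s %s <- %s\n", date('Y-m-d H:i:s'), $label, $chain);
    // Message type 3 appends the message to $logfile; no newline is added
    // automatically, which is why $line already ends with one.
    error_log($line, 3, $logfile);
    return $line;
}
```

A call such as debug_trace('expand') placed at the top of the method Reini names would then write one line per invocation, so two lines for a single page view would confirm the double call and show where each one came from.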
From: Rui C. <rui...@ac...> - 2004-12-06 20:36:29
On Dec 6, 2004, at 8:02 PM, Reini Urban wrote:
> Ah, okay. I'll probably use just your blog theme then. And some JS buttons here and there.

I'll send you SeeAlso and some of the other stuff for inclusion, though. Lots of people seem to like that ;)

> Ah, I forgot to write that, sorry.
>
> The main and only place is in CachedMarkup.php:
> Cached_PluginInvocation::expand()
> This is where you must put your debug print statement.
> expand() is called recursively through the HTML parse tree.
>
> The expander for the Cached_PluginInvocation class is a normal printExpansion() call.

Many thanks. I'll look into it after I rsync a copy to my local machine to try to get better debugging.

R.
From: Carsten K. <car...@ya...> - 2004-12-07 01:33:34
On Dec 6, 2004, at 6:28 am, Rui Carmo wrote:
> Hello there.
>
> I'm running a heavily customized version of PhpWiki at http://the.taoofmac.com, and have come up with an interesting problem.
>
> It might be from my tweaks (I forked my code from PhpWiki sometime around 1.3.7, I think), but plugins in the node contents seem to be invoked twice, i.e.:
>
> * If I use PageTrails as part of browse.tmpl, it is only invoked once.
> * If I add <?plugin PageTrails ?> to my Sandbox, the plugin code (i.e., run()) gets invoked twice, even though the output is only shown once.

Hi Rui,

This sounds exactly like the "double transformation" bug which I (inadvertently introduced, and later) fixed around 1.3.7.

I think you may find the answer here, by comparing an old revision of PageType.php with the one before it:

http://cvs.sourceforge.net/viewcvs.py/phpwiki/phpwiki/lib/PageType.php?r1=1.9&r2=1.10

IIRC normally you wouldn't notice the double transformation, except on pages which contain plugins, so I'm quite sure this is it.

I haven't had time to work on PhpWiki in almost a year, but I do periodically monitor this list and thought I could help in this case. If this does not turn out to be the cause of the problem, I'm sorry; offhand I don't know what else to try, as I am quite out of date with the current code.

Carsten
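Why a double transformation goes unnoticed on plain pages but hurts on pages with plugins can be shown with a toy model. The sketch below is not the PageType.php fix Carsten refers to, and none of its names (CountingPlugin, transform_page(), TransformOnce) come from PhpWiki; they are hypothetical stand-ins, and the closure syntax assumes PHP 5.3 or later. It only illustrates that transforming the same text twice runs the plugin twice while the visible output stays the same, and that reusing the first result avoids the second run.

```php
<?php
// Hypothetical stand-in for a plugin; counts how often run() is called.
class CountingPlugin
{
    public $calls = 0;

    public function run()
    {
        $this->calls++;                      // stands in for an expensive search
        return '<ul><li>result</li></ul>';
    }
}

// Toy transformer: replaces every plugin marker with the plugin's output.
function transform_page($text, $plugin)
{
    return preg_replace_callback(
        '/<\?plugin\s+\w+\s*\?>/',
        function ($match) use ($plugin) {
            return $plugin->run();
        },
        $text
    );
}

// Transforms a given page text at most once and reuses the result afterwards.
class TransformOnce
{
    private $cache = array();

    public function get($text, $plugin)
    {
        $key = md5($text);
        if (!isset($this->cache[$key])) {
            $this->cache[$key] = transform_page($text, $plugin);
        }
        return $this->cache[$key];
    }
}
```

Calling transform_page() twice on a page with one plugin marker leaves the counter at 2, whereas two calls to TransformOnce::get() leave it at 1 and return identical markup both times.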
From: Rui C. <rui...@ac...> - 2004-12-07 11:40:22
Bingo. I spotted it late yesterday evening when trying to figure out what PageType was for. My fork already had some of those lines moved around (I added If-Modified-Since checking near those bits and return "not modified"), but that was basically it.

Which would definitely place my fork as 1.3.7-plus-bits-of-other-versions-on-steroids :)

If only it hadn't been running like this for two years (sigh). Well, at least my other optimizations work even faster now :)

R.
From: Rui C. <rui...@ac...> - 2004-12-08 12:31:17
Besides yesterday's fix, I spent quite some time figuring out why I still had another "hit" on the search pages. At first I thought it was something inside the search code, but when I started logging IP addresses it became obvious.

It turned out that AdSense will trigger an _immediate_ hit from the Google bot if I display ads on my *Search pages, which prompted me to include this little snippet right at the start of index.php (to waste as few resources as possible):

    // User-agent substrings of the crawlers to keep away from expensive URLs.
    define( "DUMB_BOTS", '/(JPluck|Mediapartners|ia_archiver|googlebot|msnbot|Crawl)/i' );

    // Some clients send no User-Agent header at all; treat that as empty.
    $agent = isset( $_SERVER['HTTP_USER_AGENT'] ) ? $_SERVER['HTTP_USER_AGENT'] : '';

    if( preg_match( DUMB_BOTS, $agent ) ) {
        // Refuse only searches, actions and old page versions.
        if( preg_match( '/\?(s|action|version)=/', $_SERVER['REQUEST_URI'] ) ) {
            header( "HTTP/1.1 404 File Not Found" );
            echo( "<H1>404 File Not Found</H1>" );
            exit;
        }
    }
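The same check can be written as a function without side effects, which makes it easier to try out against sample requests before putting it in front of a live site. This is a sketch under assumptions: the function name is_blocked_bot_request() is hypothetical, while both patterns are copied from the snippet in the message above.

```php
<?php
// Crawler pattern copied from the snippet in the message above.
define('DUMB_BOTS', '/(JPluck|Mediapartners|ia_archiver|googlebot|msnbot|Crawl)/i');

// Hypothetical helper: returns true when a listed crawler requests a URL
// whose query string starts with s=, action= or version=.
function is_blocked_bot_request($userAgent, $requestUri)
{
    if (!preg_match(DUMB_BOTS, (string) $userAgent)) {
        return false;                        // not a listed crawler
    }
    return (bool) preg_match('/\?(s|action|version)=/', (string) $requestUri);
}
```

One thing this makes visible: the second pattern only matches when s, action or version is the first parameter after the question mark, so a URL such as /index.php?pagename=X&action=edit would not be blocked.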
From: Reini U. <ru...@x-...> - 2004-12-08 14:17:53
Rui Carmo wrote:
> Besides yesterday's fix, I spent quite some time figuring out why I still had another "hit" on the search pages. At first I thought it was something inside the search code, but when I started logging IP addresses it became obvious.
>
> It turned out that AdSense will trigger an _immediate_ hit from the Google bot if I display ads on my *Search pages, which prompted me to include this little snippet right at the start of index.php (to waste as few resources as possible):
>
> define( "DUMB_BOTS", '/(JPluck|Mediapartners|ia_archiver|googlebot|msnbot|Crawl)/i' );
>
> if( preg_match( DUMB_BOTS, $_SERVER['HTTP_USER_AGENT'] ) ) {
>     if( preg_match( '/\?(s|action|version)=/', $_SERVER['REQUEST_URI'] ) ) {
>         header( "HTTP/1.1 404 File Not Found" );
>         echo( "<H1>404 File Not Found</H1>" );
>         exit;
>     }
> }

That's not a good idea!
Google's AdSense checks where the ads really appear on the page, and calculates the rank (= money!) from this info.
If you reject the checker, you will get no profit from AdSense at all.

Rejecting the main Googlebot is also not a good idea.
The Googlebot is a good thing.
You just have to be prepared for being "slashdotted" to death once in a while, with some referrer check or referrer throttling.

-- 
Reini Urban
http://xarch.tu-graz.ac.at/home/rurban/
From: Rui C. <rui...@ac...> - 2004-12-13 21:03:18
Well, actually it is.

Otherwise, the Google bot will go in and fill the database with empty pages wherever I leave an unlinked WikiWord, and, in this particular case, _will_repeat_the_search_. If it's a FullTextSearch, it's a significant performance hit for me, since I'm not (yet) caching the results.

Note that I only exclude the bots from _action_ pages and older content versions, not from the rest of the site. And I haven't noticed any significant drop in AdSense revenue (it isn't that much to begin with). Search hits are something like 0.005% of my total traffic.

Plus, referrer checking will do nothing for this - as far as I can see, the bot does not send a referrer at all in the search case. I have very significant pieces of code devoted to doing referrer checking and spam filtering, so I would have noticed.

Moving on, I have finally debugged the edit template for the Kubrick theme. I still have a bug someplace in the diff routines, but have cleaned up the templates a bit and will, like I promised, be sending you a zip with them - it will, however, require further cleanup, since my navigation bar is different from the standard one, etc.

I think you'll find it useful, and I look forward to seeing it in new versions of PhpWiki. I will also try to clean up some of my custom plugins and send them later - I've just returned from vacation, and work is already impinging on me.

Regards,

R.
http://the.taoofmac.com
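Since the cost of repeated FullTextSearch requests comes from the results not being cached, a small file cache is one way the repeat hit could be made cheap. The following is a sketch only, not code from Rui's fork or from PhpWiki: cached_search(), its parameters, the file naming and the 300-second lifetime are all hypothetical, and the search itself is passed in as a callback so that no PhpWiki API is assumed.

```php
<?php
// Hypothetical file cache for search results; nothing here is PhpWiki code.
// $searchFn is any callable that takes the query and returns the results.
function cached_search($query, $searchFn, $cacheDir, $ttl = 300)
{
    $file = $cacheDir . '/search-' . md5($query) . '.cache';

    // Reuse a stored result while it is younger than $ttl seconds.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        $hit = unserialize(file_get_contents($file));
        if ($hit !== false) {
            return $hit;
        }
    }

    // Miss or stale entry: run the real search and store the result.
    $results = call_user_func($searchFn, $query);
    file_put_contents($file, serialize($results), LOCK_EX);
    return $results;
}
```

With something along these lines, a crawler repeating a visitor's search within the lifetime would be served from the stored result instead of triggering a second database scan. A short lifetime keeps stale results bounded without any explicit invalidation when pages change.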