From: Cameron B. <ga...@in...> - 2003-03-03 22:18:58
Jeff Dairiki wrote:

>On Mon, 03 Mar 2003 08:56:18 +1000
>Cameron Brunner <ga...@in...> wrote:
>
>>>(For a fairly well tested lighter-weight backend try the dba backend.)
>>>
>>That requires recompiling php.
>>
>Okay, never mind then.
>
>>>The flat-file backend is a recent addition by Jochen Kalmbach
>>>(<Jo...@ka...>). That said, the only bug I know of in it
>>>is that it won't work on case-insensitive file systems (Windows).
>>>
>>Simple enough, but just sitting here I had an idea how to fix that:
>>instead of using the filename as-is, ord() each character of the
>>filename so it just becomes a big string of ints, which makes case
>>insensitivity a non-issue. I would suggest md5, but by just using
>>ord()/chr() you can reverse the process and get the real name back
>>from the file.
>>
>Yeah, I was thinking along similar lines. Currently the file names
>are rawurlencoded to convert non-alphanumerics to %xx codes.
>On those systems which have case-insensitive file systems, add
>one more step to add a carat (or something) in front
>of capital letters. (An underscore would be a better choice,
>except that underscores are not escaped by rawurlencode.)
>I think this is a good choice, both since it tends to preserve
>the legibility of file names and doesn't lengthen filenames
>unnecessarily. (As I suspect filename length may become
>an issue on some systems...)
>
>Untested code:
>
>function pagename_to_filename ($pagename) {
>    return preg_replace('/(?<! % | %.)([A-Z])/x', '^$1',
>                        rawurlencode($pagename));
>}
>
>(The look-behind assertion in the regexp prevents caratifying of
>capital letters in (%7E) rawurl escape sequences.)
>
>This should escape "~PageName" to "%7E^Page^Name"

That's one way, but the reason I suggested ord() was that it will be
quite fast. Considering that the filename length limit on Windows is
128 or 256 (I forget which), it's not much of an issue unless page
names are 64/128 characters long. As for readability, it wouldn't be
hard to make a little script that opened the dir and listed it like
this:

    realfilename    intstringhere

Then again, there's nothing to say you couldn't make a plugin function
to do the filename encoding and have it detect the OS: if it's unix,
don't encode; if it's win32, whatever you prefer. (There's a rough
sketch of the ord() idea further down.)

>(But since it looks like you're running on a unix server,
>is there some other problem which is keeping you from using
>the flat-file backend?)
>
Just retested it now (last time it wouldn't work at all, blank page all
the time) and I got the same as I'm getting in PgSQL: blank pages after
loading the virgin wiki.

>>>There is an unadvertised install time config value which (is supposed
>>>to) defeat the caching of the marked-up data altogether.
>>>
>>Ummm, I'm not saying don't cache, I'm saying don't compress it when
>>you write it to the db/filesystem, for performance reasons. As for
>>suggesting bzip: if people are worried about disk usage, bzip will be
>>better than gzip. There should also be a configurable compression
>>level IMO; 9 takes a lot of cpu, and most of the time I find 2 is fine
>>for on-the-fly stuff and uses a fair bit less cpu.
>>
>I understood what you meant. I think with gzip, performance
>considerations are a non-issue. The zipping is much faster than the
>rest of the PhpWiki code. Bzip would be fine --- however I suspect the
>space savings are not large, and I didn't do that to avoid code
>complications/complexity.
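Going back to the filename encoding: here is roughly what I have in
mind with ord()/chr(). Untested, and the function names are only
placeholders, nothing like this exists in PhpWiki yet:

    // Pad each ordinal to three digits so the mapping stays reversible
    // and the filename is pure digits -- case can no longer matter.
    function pagename_to_ordname ($pagename) {
        $out = '';
        for ($i = 0; $i < strlen($pagename); $i++)
            $out .= sprintf('%03d', ord($pagename[$i]));
        return $out;
    }

    // Reverse the encoding to get the real page name back from the file.
    function ordname_to_pagename ($ordname) {
        $out = '';
        for ($i = 0; $i < strlen($ordname); $i += 3)
            $out .= chr((int) substr($ordname, $i, 3));
        return $out;
    }

So "~PageName" would be stored as "126080097103101078097109101" and
decode straight back. The trade-off is that every character becomes
three digits, so names triple in length, which is where the Windows
filename limit mentioned above comes in.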
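On the configurable compression level: gzcompress() already takes an
optional level argument, so it could look roughly like this.
COMPRESSION_LEVEL is just a made-up config constant for illustration,
not an existing PhpWiki setting:

    // Let the admin pick the zlib level (0-9) in index.php, falling
    // back to a cheap level when nothing is configured.
    if (!defined('COMPRESSION_LEVEL'))
        define('COMPRESSION_LEVEL', 2);  // 2 is fast and still shrinks markup

    function compress_cached_markup ($data) {
        return gzcompress($data, COMPRESSION_LEVEL);
    }

    function uncompress_cached_markup ($zdata) {
        return gzuncompress($zdata);
    }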
As for performance, you can't just write off optimizing it knowing that
the wiki is slow; you need to do what you can with all the little
things first. If it's as simple as profiling each function and seeing
which is the slowest, then so be it. (http://apd.communityconnect.com/
for profiling abilities -- there's a sketch of hooking it in at the end
of this message.)

>>>>Also I am curious why the LOCK TABLE's in the code? Feel free.
>>>>
>>Simplest way is just to make lock() in pgsql do nothing; it's already
>>surrounded by begin/commit because of pear, it seems.
>>
>No, we do the begin/commit's (in lib/WikiDB/backend/pgsql.php).
>
>I'm no pgsql expert. Are those begin/commits sufficient to prevent
>concurrent PhpWikis from getting partially updated data? (When the
>update happens over several UPDATE operations within a single
>begin/commit, that is.)
>
Yes: no updated data can be read by the other clients until the commit
is sent. It will also let multiple people hit a table at once when
updating, rather than stopping the others from doing their thing before
they can write. (There's an illustration of this at the end of this
message too.)

>>when you call getpage, why not just have an extra flag on it for grab
>>the latest version or something? it would only need to be simple. maybe
>>make an extended getpage that you could feed the revision you wanted as
>>well as the page? should be simple enough
>>
>It's not that simple. And the two selects, I suspect, are among the
>least of the SQL efficiency concerns in the current code.
>
Enlighten me. I don't have much time, but for worthwhile projects and
various things I'll use daily, I'm willing to spend a bit of time on
them.

>>>Do you get HTTP headers back?
>>>
>Those headers don't show any obvious problems. (However there should
>be ETag and Last-Modified headers too if PhpWiki was running to
>normal completion on a page view --- of course we already knew it
>wasn't.)
>
I'll look into where it's dying more today; I've been quite busy.

>I still plan on looking at the original postgres bug later today.
>(I have to install postgres first --- among other things.)
>
If you would prefer a shell to work on, let me know and I can organize
it.

Cameron Brunner
inetsalestech.com
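PS: by "profiling abilities" above I mean something like this at the
top of index.php, assuming the APD extension from that URL is installed
(untested against PhpWiki itself):

    // Guard the call so the wiki still runs where APD isn't loaded.
    if (function_exists('apd_set_pprof_trace')) {
        // Writes a trace file into APD's dump directory; APD's bundled
        // pprofp script can then list which functions eat the most time.
        apd_set_pprof_trace();
    }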
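PPS: on the begin/commit question, this is the behaviour I'm
describing. With Postgres' default read-committed isolation, a second
connection selecting these rows sees either both updates or neither,
never a half-done mix. Table and column names here are only
illustrative, not PhpWiki's real schema:

    $dbh = pg_connect('dbname=wiki');

    pg_query($dbh, 'BEGIN');
    pg_query($dbh, "UPDATE page SET latestversion = 5
                    WHERE pagename = 'HomePage'");
    pg_query($dbh, "UPDATE version SET content = 'new text'
                    WHERE pagename = 'HomePage' AND version = 5");
    // Other connections keep seeing the old rows until this point;
    // the COMMIT makes both changes visible at once.
    pg_query($dbh, 'COMMIT');

Writers touching other rows aren't blocked while the transaction is
open either, which is the second half of the point above.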