From: Reini U. <ru...@x-...> - 2004-10-22 13:20:19
La...@us... wrote:
> Reini Urban [Fri, 22 Oct 2004 10:22:35 +0200] wrote:
>> wget -nd -r -l1 -nH http://my/wiki/
>
> This will not work either for me. (sorry to insist)
>
> (Anyway, the additional flags -nH -nd are essentially cosmetic, right?)

-nd and -nH don't create the host and directory subdirs.

> Maybe I did something wrong in the install (PhpWiki *or* Apache) that
> prevents wget from crawling through the wiki. Or wget is too old?
>
> wget --version
> GNU Wget 1.8.2

wget cannot be too old, it is dumb on purpose. If you can click through
your wiki, wget can "click" through it also.

$ wget --version
GNU Wget 1.9.1

There exist faster wget versions (using a hash instead of a list
internally).

> http://phpwiki.sourceforge.net/phpwiki/BackupStrategies does say
> things about backing up with wget, but uses the zip-dump interface.
>
> http://amphi-gouri.org/blog/2004/09/16/73-LeConvertisseurWikiDuPauvreConvertirUnSiteSimpleEnSyntaxeMoinmoinEnQuelquesLignes
> uses --no-parent => same result, only one page dumped.

Sure. One zip, which is your whole wiki. All pages zipped.

> I read something about protection from bots:
> http://phpwiki.sourceforge.net/phpwiki/HowToHandleRobots
>
> "Only action=browse and action=index is allowed for statically
> identified robots, but authorized action must be allowed, e.g. for my
> daily backups with Wget."

This is for a very old wiki version of mine. There's no action=index
anymore.

> Is that an issue? Allowing the read action? I have no clue
> what/where this is ...

No, currently we don't block wget robots. But maybe your global
/robots.txt disallows wget?

>> If you can live with the memory limitations after the upgrade.
>> Until 1.3.4 it did no output buffering. After that it needs more than
>> 8MB.
>>
>> http://phpwiki.sourceforge.net/phpwiki/PhpMemoryExhausted/Testresults

-- 
Reini Urban  http://xarch.tu-graz.ac.at/home/rurban/
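To make the recursive wget call discussed above concrete, here is a minimal
sketch. http://my/wiki/ is the placeholder URL from the thread, and the
flags beyond -r -l1 -nd -nH (namely -np, -k, -E) are optional additions,
not part of the original command:

  # -r -l1  : recurse, but follow links only one level deep
  # -nd -nH : don't recreate the host/directory tree locally (cosmetic)
  # -np     : never ascend above /wiki/
  # -k      : rewrite links in the saved pages so they work offline
  # -E      : save pages with an .html extension
  $ wget -r -l1 -nd -nH -np -k -E http://my/wiki/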
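The zip-dump interface mentioned on BackupStrategies can also be driven
from wget: a single request returns one archive containing all pages. The
exact query string below (action=zip) is an assumption based on PhpWiki
releases of that era and may differ in your install, so check your
PhpWikiAdministration page first:

  # fetch the whole wiki as a single zip archive
  $ wget -O wikidump.zip 'http://my/wiki/index.php?action=zip'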
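To rule out the /robots.txt question raised near the end, it is enough to
fetch the file and look for a rule matching Wget's user agent; the
Disallow block in the comments is only an example of what a blocking entry
would look like:

  # dump the site-wide robots file to stdout
  $ wget -q -O - http://my/robots.txt
  #
  # a blocking entry would look like:
  #   User-agent: Wget
  #   Disallow: /
  #
  # for a one-off backup, wget can be told to ignore robots.txt:
  $ wget -e robots=off -r -l1 -nd -nH http://my/wiki/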
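On the memory limitation quoted at the end: the 8MB figure matches PHP's
default memory_limit of that era. One way to raise it, assuming mod_php and
a vhost that permits .htaccess overrides (both assumptions, not stated in
the thread):

  # .htaccess in the wiki directory
  php_value memory_limit 16M

  # or globally in php.ini
  memory_limit = 16M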