From: Erik W. <Eri...@st...> - 2012-09-03 04:51:44
Hello!

I have been trying to run wikiprep over the weekend, but it crashes.

First I tried using splitwiki and then running wikiprep on four cores.

Compilation of splitwiki:

    splitwiki: splitwiki.o
            gcc splitwiki.o -O2 -lz -o splitwiki
    splitwiki.o: splitwiki.c
            gcc -c -Wall splitwiki.c

Then I ran wikiprep (I installed all the Debian packages listed in the readme):

    wikiprep -format composite -compress -nourl -parallel -f enwiki-latest-pages-articles.xml.0000.gz

    gzip: enwiki-latest-pages-articles.xml.0000.gz: unexpected end of file
    no element found at line 113211447, column 749, byte -1537634522 at /usr/share/perl5/Parse/MediaWikiDump/Revisions.pm line 233
    No such file or directory at /usr/local/bin/wikiprep line 358.
    ./enwiki-latest-pages-articles.title2id.db: No such file or directory at /usr/local/bin/wikiprep line 479.
    ./enwiki-latest-pages-articles.title2id.db: No such file or directory at /usr/local/bin/wikiprep line 479.
    ./enwiki-latest-pages-articles.title2id.db: No such file or directory at /usr/local/bin/wikiprep line 479.
    ./enwiki-latest-pages-articles.title2id.db: No such file or directory at /usr/local/bin/wikiprep line 479.

I thought I would check whether it also crashes on a single core before looking into the details:

    wikiprep -format composite -compress -nourl -f enwiki-latest-pages-articles.xml.bz2

    Use of qw(...) as parentheses is deprecated at /usr/local/share/perl/5.14.2/Wikiprep/Disambig.pm line 9.
    Use of qw(...) as parentheses is deprecated at /usr/local/bin/wikiprep line 134.
    Sep 03 09:48:12 [WARNING] title Ss (ID 354283) already encountered before (ID 198274)
    Sep 03 10:42:47 [WARNING] title T? (ID 13066537) already encountered before (ID 3406617)
    Sep 03 11:04:29 [WARNING] title ? (ID 19185171) already encountered before (ID 18984678)
    Sep 03 11:08:26 [WARNING] title ? (ID 20363161) already encountered before (ID 16504503)
    Sep 03 12:09:10 [NOTICE] total 12584750 pages (31631083782 bytes)
    Sep 03 12:09:15 [NOTICE] Loaded 6125016 titles
    Sep 03 12:09:15 [NOTICE] Loaded 5608108 redirects
    Sep 03 12:09:15 [NOTICE] Loaded 362394 templates
    Out of memory!

So what are the best steps for getting wikiprep running? I would rather not dig into Perl details, since I am unfamiliar with the language.

Best regards,
Erik