From: Tomaž Š. <tom...@ta...> - 2011-02-11 08:39:34
On 11. 02. 2011 06:15, Tony Plate wrote:
> The second and subsequent errors you show about
> "...pages-articles.title2id.db: No such file or directory" are from the
> transform stage. What was happening for me was that the pre-processing
> stage was crashing, and hence not creating its output, the .db files.
> Then when the transform stage tried to run, it couldn't find the .db
> files and printed messages just like what you show.

Thanks for pointing this out. I just committed a fix that implements better
diagnostics of forked worker processes and stops Wikiprep at the first signs
of trouble instead of trying to push on.

Best regards
Tomaž

>
> You can see if your original file is legal XML using xmllint on it, like
> this:
>
> xmllint --stream --noout dewiki-20101013-pages-articles.xml
>
> (using the appropriate filename)
>
> You should see no output, unless there are xml errors. If you see XML
> errors, then you'll probably need to fix them before proceeding.
>
> -- Tony Plate
>
> (thanks to Tomaz Solc who helped me track down my similar problems after
> I mailed this list a few weeks ago.)
>
>
> On 2/10/2011 1:30 PM, Jose Quesada wrote:
>> Hi,
>>
>> I preprocessed the .fr and .es wikis with the latest wikiprep. But
>> when I run the same thing on the .de one I get:
>>
>> perl ~/projIfollow/wikiprep/lib/wikipre
>>
>> no element found at line 22451642, column 0, byte 1471064466 at
>> /usr/lib64/perl5/site_perl/5.12.2/Parse/MediaWikiDump/Revisions.pm
>> line 233
>> ./dewiki-20101013-pages-articles.title2id.db: No such file or
>> directory at /home/quesada/projIfollow/wikiprep/lib/wikiprep line
>> 476.
>> No such file or directory at
>> /home/quesada/projIfollow/wikiprep/lib/wikiprep line 355.
>> ./dewiki-20101013-pages-articles.title2id.db: No such file or
>> directory at /home/quesada/projIfollow/wikiprep/lib/wikiprep line
>> 476.
>> ./dewiki-20101013-pages-articles.title2id.db: No such file or
>> directory at /home/quesada/projIfollow/wikiprep/lib/wikiprep line
>> 476.
>> ./dewiki-20101013-pages-articles.title2id.db: No such file or
>> directory at /home/quesada/projIfollow/wikiprep/lib/wikiprep line
>> 476.
>>
>> Any idea why this is?
>> Thanks!
>>
>> --
>> Best,
>> -Jose
>>
>> Jose Quesada, PhD.
>> Research scientist,
>> Max Planck Institute,
>> Center for Adaptive Behavior and Cognition,
>> Berlin
>> http://www.josequesada.name/
>> http://twitter.com/Quesada
>>
>>
>
> _______________________________________________
> Wikiprep-user mailing list
> Wik...@li...
> https://lists.sourceforge.net/lists/listinfo/wikiprep-user
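
The fail-fast behaviour described in the reply (check each worker process and
stop at the first failed stage, rather than letting a later stage trip over
missing output files) could look roughly like the sketch below. This is only
an illustration in Python, not Wikiprep's actual code, which is written in
Perl. The names run_stage, run_pipeline and StageFailed, and the stage names
"preprocess" and "transform", are hypothetical and chosen for this example.

```python
import subprocess
import sys


class StageFailed(RuntimeError):
    """Raised when at least one worker of a stage exits with a non-zero status."""


def run_stage(name, commands):
    """Start one worker process per command, wait for all of them,
    and raise StageFailed if any worker reported a failure."""
    workers = [subprocess.Popen(cmd) for cmd in commands]
    exit_codes = [worker.wait() for worker in workers]
    failed = [(cmd, code) for cmd, code in zip(commands, exit_codes) if code != 0]
    if failed:
        raise StageFailed(
            f"{name}: {len(failed)} of {len(commands)} workers failed: {failed}"
        )


def run_pipeline(stages):
    """Run stages in order and return the names of the completed stages.
    The first StageFailed propagates, so later stages never start."""
    completed = []
    for name, commands in stages:
        run_stage(name, commands)
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Stand-in workers (assumption): trivial Python child processes.
    ok = [sys.executable, "-c", "pass"]
    crash = [sys.executable, "-c", "import sys; sys.exit(3)"]
    try:
        run_pipeline([("preprocess", [ok, crash]), ("transform", [ok])])
    except StageFailed as err:
        print(f"stopping early: {err}", file=sys.stderr)
        sys.exit(1)
```

With a check like this, a crash in the first stage is reported once, at its
source, instead of surfacing later as a series of "No such file or directory"
messages from the stage that depends on its output.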