Strategy for making a small test file?

  • Barry Hoggard

    Barry Hoggard - 2012-07-15

    I have been unable to create a small subset of the Wikipedia English dump for testing some changes to my local version. I always get "Could not identify root category" even when I include most categories including the Fundamental Categories page.

    Has anyone else solved this?

    I'm also wondering whether anyone has succeeded in modifying the extraction process to use multiple files, which would make it easier to use S3 and Elastic MapReduce for managing updates between the large dumps coming from Wikipedia.

  • Duygu

    Duygu - 2012-09-23

    I am getting the same error: "Could not identify root category". Did you solve this?

  • Barry Hoggard

    Barry Hoggard - 2012-09-24

    I don't get the error with a full dataset. I haven't been able to produce a subset that works.
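
For reference, one way to cut a small subset from a dump is to stream the XML and copy only selected pages. The following is a minimal sketch, assuming the usual MediaWiki export layout (a `<mediawiki>` root containing `<siteinfo>` and `<page>` elements, each page with a `<title>` child). The function name `filter_dump` and the set of titles are hypothetical, and the sketch drops the XML namespace from the output, which downstream tools may or may not accept. It does not address the "Could not identify root category" error itself, whose cause this thread does not establish.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def filter_dump(src, dst, keep_titles):
    """Copy <siteinfo> and the pages named in keep_titles from src to dst.

    src: binary file-like object (or path) holding a MediaWiki XML export.
    dst: text file-like object that receives the subset.
    Returns the number of pages written.
    """
    kept = 0
    context = ET.iterparse(src, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    dst.write("<mediawiki>\n")  # assumption: namespace-free output is acceptable
    for event, elem in context:
        if event != "end":
            continue
        name = local(elem.tag)
        if name not in ("siteinfo", "page"):
            continue
        # Drop namespaces so the serialized output uses plain tag names.
        for node in elem.iter():
            node.tag = local(node.tag)
        if name == "siteinfo":
            dst.write(ET.tostring(elem, encoding="unicode"))
        elif elem.findtext("title") in keep_titles:
            dst.write(ET.tostring(elem, encoding="unicode"))
            kept += 1
        root.clear()  # release finished elements to keep memory use flat
    dst.write("</mediawiki>\n")
    return kept


if __name__ == "__main__":
    sample = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
<siteinfo><sitename>Wikipedia</sitename></siteinfo>
<page><title>Category:Fundamental categories</title><ns>14</ns></page>
<page><title>Apple</title><ns>0</ns></page>
<page><title>Banana</title><ns>0</ns></page>
</mediawiki>"""
    out = io.StringIO()
    count = filter_dump(io.BytesIO(sample), out,
                        {"Apple", "Category:Fundamental categories"})
    print(count, "pages kept")
```

The same loop could write each group of pages to a separate output file instead of a single one, which is one possible starting point for the multi-file question raised above.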

