Strategy for making a small test file?

Help
2012-07-15
2013-05-30
  • Barry Hoggard
    2012-07-15

    I have been unable to create a small subset of the Wikipedia English dump for testing some changes to my local version. I always get "Could not identify root category" even when I include most categories including the Fundamental Categories page.

    Has anyone else solved this?

    I'm also wondering whether anyone has succeeded in modifying the extraction process to use multiple files, which would make it easier to use S3 and Elastic MapReduce for managing updates between the large dumps coming from Wikipedia.

     
  • Duygu
    2012-09-23

    I am getting the same error "Could not identify root category". Did you solve this?
    Thanks

     
  • Barry Hoggard
    2012-09-24

    I don't get the error with a full dataset. I haven't been able to produce a subset that works.
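
For anyone attempting the same thing, here is a minimal sketch of one way to cut a subset out of a dump. It assumes the dump uses the standard MediaWiki XML export layout (a `<mediawiki>` root holding one `<siteinfo>` header and many `<page>` elements, each with a `<title>`). The function name `extract_subset` and the title set are hypothetical, and nothing in this thread confirms which pages the extraction needs in order to identify the root category, so treat the title list as something to experiment with.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_subset(source, wanted_titles):
    """Stream a MediaWiki XML dump and return XML text that keeps only the
    <siteinfo> header and the pages whose <title> is in wanted_titles.

    Hypothetical helper: `source` is a filename or a binary file object.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of <mediawiki>

    # Keep the dump's namespace as the default one so the output uses plain
    # tag names such as <page>. Note this changes ElementTree's global
    # prefix registry.
    if root.tag.startswith("{"):
        ET.register_namespace("", root.tag[1:].split("}", 1)[0])

    subset = ET.Element(root.tag, dict(root.attrib))
    for event, elem in context:
        if event != "end":
            continue
        name = local_name(elem.tag)
        if name == "siteinfo":
            subset.append(elem)
            root.clear()  # drop the parser's reference to free memory
        elif name == "page":
            title = None
            for child in elem:
                if local_name(child.tag) == "title":
                    title = child.text
                    break
            if title in wanted_titles:
                subset.append(elem)
            root.clear()  # discarded pages are released here
    return ET.tostring(subset, encoding="unicode")


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real dump file.
    sample = (
        '<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.6/">'
        "<siteinfo><sitename>Wikipedia</sitename></siteinfo>"
        "<page><title>A</title></page>"
        "<page><title>B</title></page>"
        "</mediawiki>"
    )
    print(extract_subset(io.BytesIO(sample.encode("utf-8")), {"A"}))
```

Because the parser streams the input and only the kept pages stay in memory, this approach should cope with a full dump as long as the requested subset itself is small.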