Share dump

  • luc

    luc - 2013-02-24

    Can somebody share a valid pair: a Wikipedia dump and its matching CSV summary?

    I would like to test the installation before building a new CSV summary (a long process).

    Thank you in advance
    best regards

    • lioo

      lioo - 2013-07-27

      Do you have 20110722.xml? If you do, please share it.
      Thank you

      • Alexander Mera

        Alexander Mera - 2015-05-29

        Hi Lioo,

        Could you please share the enwiki-20110722-pages-articles.xml.bz2 file?

        Thanks in advance!

  • Alaa Alahmadi

    Alaa Alahmadi - 2013-02-25


    I have it; it will take some time to upload.


    • lioo

      lioo - 2013-07-27

      Do you have 20110722.xml? Can you send it to me?
      Thank you

  • luc

    luc - 2013-02-27

    Can you put it on an FTP server or share it as one big zipped file?

    Best regards

  • andrewtankianlam

    Yup. I need it as well. Can you share it out?

  • luc

    luc - 2013-03-07

    Thank you so much for sharing!!! Is there no problem using a Wikipedia dump from 2008 with a CSV summary from 2011?

    Best regards

  • Alaa Alahmadi

    Alaa Alahmadi - 2013-03-08

    Sorry about this, I uploaded the old file for Wikipedia Miner 1. You should download the same Wikipedia dump that the CSV summary was built from. You can find it at the link below; it is named enwiki-20110722-pages-articles.xml.

    best regards

  • luc

    luc - 2013-03-08

    This is exactly the key point!!
    enwiki-20110722-pages-articles.xml is no longer available!!!

    So we need to either rebuild the CSV file or find a valid matching pair of CSV summary and dump.

    Could anyone provide one?

    Best regards

  • Sarah

    Sarah - 2013-04-09


    Did you get the enwiki-20110722-pages-articles.xml.bz2 file by any chance? If so, can you kindly share :)?

  • Guillaume

    Guillaume - 2013-04-15

    Hi,

    I also need a dump with its corresponding CSV summary.
    It seems that I'm not the only one, and it would be very kind if someone could share a valid pair.

    Best regards.

  • Guillaume

    Guillaume - 2013-04-25

    I finally managed to extract the CSV summaries of a recent Wikipedia dump (I don't remember the exact date of the dump…).

    If someone needs it, I can upload it to an FTP server or an online service of your choice (9GB for the dump and 5.8GB (uncompressed) for the summaries).

  • Kyle

    Kyle - 2013-05-05

    Hi Muonique, that would be awesome if you could share the summaries! I have sent you some info via SF message and I can host publicly after receiving it.

  • Guillaume

    Guillaume - 2013-05-07

    No problem sharing, but I didn't receive your message.

    • Amrita Lakshmi

      Amrita Lakshmi - 2013-07-16

      Hi Guillaume,
      Which wikipedia dump have you extracted the CSV summaries for? I don't mean the exact date but is it a 2013 dump?
      Also, what hardware resources did you need for extraction? I'm trying to get an idea of how long it would take to process any of the recent Wikipedia dumps and how big a Hadoop cluster I will need for this, what memory size for each node etc.

      Thanks in advance.

  • Guillaume

    Guillaume - 2013-08-06


    I extracted the latest dump available in April 2013.
    It took about 2 days on a single node (8-core Xeon processor) and a few hours on 30 nodes (4-core processors). Sending the data to each node and the reduce phase were the main bottlenecks on the grid.

    • Anonymous - 2013-11-05

      Hi Guillaume,

      It would be really great if you could share the files. I am working on a local test case that I have to present to people, and I am using the wiki dump and the CSV summary dump for it. However, I am not able to get the XML dump for either of the CSV dumps available here (

      It would be really good if we made an effort to get it uploaded to sourceforge.net, as it would solve the problem for a lot of people around here. Please reply soon, as it's quite urgent.


    • Rebecca

      Rebecca - 2014-08-15

      Could you please share the dump and csv file with me? Thank you so much!

  • rah kah

    rah kah - 2013-08-28

    Hi Guillaume,
    I really need a CSV summary for a recent dump. If you have it, can you please share it with me? Or the enwiki-20110722-pages-articles.xml file: I couldn't find it anywhere, and I really need this...

    Last edit: rah kah 2013-08-28
  • Han Xiao

    Han Xiao - 2014-07-17


    I am trying to make the system work, but the embarrassing parts are:

    1. extracting the summary won't work for me
    2. if the summary exists, the wikipedia dump does not.

    Any dumps as well as the corresponding summary available for sharing?

  • Rebecca

    Rebecca - 2014-08-15

    I could not find a corresponding CSV file for any version of Wikipedia.
    Could anybody share a dump and its CSV file with me? Thank you so much.

    you can contact me (email:

  • wikidude

    wikidude - 2014-12-03

    Hi fellow miners.
    I need a dump and its CSV file too.
    Can anyone help me with this?
    Thanks in advance!

