I'm currently doing my bachelor's degree and I want to use WikipediaMiner for French.
1. It would be great if someone happens to have any already summarized *.csv dumps for French (even if they are older). I can provide space to upload them.
2. Configuring extractWikipediaData.pl for French. I know from the Readme guide and other posts in this forum that I have to change the following settings. It would help if someone could confirm that the settings below are correct, or give me a hint on how to find the correct ones. :)
a) my @disambig_templates = ("disambig", "disambig-cleanup", "geodis", "hndis", "numberdis") ;
- Are they the ones listed here: http://fr.wikipedia.org/wiki/MediaWiki:Disambiguationspage ?
b) my @disambig_categories = ("disambiguation") ;
- It seems to me that this is the same for French.
c) #my $root_category = "Fundamental" ; # for enwiki
- Is it the one corresponding to http://fr.wikipedia.org/wiki/Cat%C3%A9gorie:Accueil, meaning my $root_category = "Accueil" ?
I have an extract of the French Wikipedia (2010 09 15). If you have an FTP site I should be able to upload it (with or without content.csv?).
Your settings seem correct to me, although I picked 'Espace encyclopédique' as the root category.
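For anyone following along, here is a rough sketch of how the language-dependent block in extractWikipediaData.pl might look for frwiki. Only the two root category candidates come from this thread. The template and category names ("homonymie") are my assumption about what the French Wikipedia uses and are not confirmed here, so please check them against the MediaWiki:Disambiguationspage link above before running the extraction.

```perl
# Sketch of the language-dependent settings for frwiki (not a verified configuration).

# ASSUMPTION: "homonymie" is the general French disambiguation template.
# The MediaWiki:Disambiguationspage page on frwiki lists further, more specific
# templates; add them here in lowercase, as in the enwiki defaults.
my @disambig_templates = ("homonymie") ;

# ASSUMPTION: the matching French category is "Homonymie".
# The original post guessed that "disambiguation" could be kept; verify on frwiki.
my @disambig_categories = ("homonymie") ;

# Root category: "Accueil" was proposed in the first post.
# A reply in this thread picked 'Espace encyclopédique' instead.
my $root_category = "Accueil" ;
```

If you switch to a root category with accented characters, make sure the script file is saved in the same encoding as the dump (UTF-8), otherwise the category name may not match.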
Thank you very much! I sent you a message with the details for the FTP account.
If anyone else needs the data I can make the FTP account public.
Is the French dump still up? It would be great if you could let me know the FTP account details.
I don't know how to find the right keywords for
I wonder how to find exactly these keys for the XML dump file.
Thanks for your help,
I want to use Wikipedia Miner for French for research purposes.
Are the French *.csv dumps still available?