[PerlWikiBot] SF.net SVN: perlwikibot:[53] trunk/no-interwiki/prepare_noiw_list.pl
From: <am...@us...> - 2008-07-31 10:09:36
Revision: 53
          http://perlwikibot.svn.sourceforge.net/perlwikibot/?rev=53&view=rev
Author:   amire80
Date:     2008-07-31 10:09:44 +0000 (Thu, 31 Jul 2008)

Log Message:
-----------
POD updates.

Modified Paths:
--------------
    trunk/no-interwiki/prepare_noiw_list.pl

Modified: trunk/no-interwiki/prepare_noiw_list.pl
===================================================================
--- trunk/no-interwiki/prepare_noiw_list.pl 2008-07-31 10:08:34 UTC (rev 52)
+++ trunk/no-interwiki/prepare_noiw_list.pl 2008-07-31 10:09:44 UTC (rev 53)
@@ -1217,13 +1217,15 @@
 
 =item * C<prepare_noiw_list.pl --rtl ./big-files/hewiki-20080420-pages-meta-current.xml>
 
+=item * C<prepare_noiw_list.pl --stop_after=20000 ./big-files/hewiki-20080420-pages-meta-current.xml>
+
 =back
 
 =head1 REQUIRED ARGUMENTS
 
 =over
 
-=item * MediaWiki dump file name is obligatory.
+=item * MediaWiki dump file name is required
 
 =back
 
@@ -1248,11 +1250,14 @@
 =item * --max_sections_per_page Maximum number of sections per output page.
 Default is 20.
 
+=item * --max_iw_places Number of places to print in the statistics of
+pages with the most interlanguage links.
+
 =back
 
 =head1 DESCRIPTION
 
-The main goal of this searching is to find pages which do not have
+The main goal of this program is to find pages which do not have
 interwiki (interlanguage) links to certain languages.
 
 This program scans a MediaWiki XML dump file. It searches every page for
@@ -1261,22 +1266,22 @@
 
 =over
 
-=item * If the page contains links to the defined languages and contains
-no "no interwiki" template, its processing stops.
+=item * If the page contains links to the defined languages and does not
+contain the "no interwiki" template, its processing stops.
 
 =item * If the page contains links to the defined languages and contains
 this template, it is logged, so the template can be removed.
 (It is planned that it will be removed automatically in the future.)
 
-=item * If the page contains no links to the defined languagesm but no
-template, it is automatically added to type "other".
+=item * If the page contains no links to the defined languages and does not
+comtain the template, it is automatically added to type "other".
 
 =item * If the page contains no links to the defined languages and a
 template with types, it is added to the defined types.
 
 =back
 
-Pages without links are added to nicely formatted lists
+Pages without links are added to nicely formatted lists
 according to their type.
 
 This program also collects some information on the way about problematic
@@ -1304,7 +1309,8 @@
 
 =head2 unable to handle any case setting besides 'first-letter'
 
-Something is weird with the dump.
+Something is weird with the dump. See the documentation of
+L<Parse::MediaWikiDump> and MediaWiki.
 
 =head2 A page has no pure title
 
@@ -1315,7 +1321,7 @@
 
 STRING is supposed to be a parameter in a template, but it does not look like
 one. It could be an error in the template, and also a bug in this program
-(the parser that this program employs is rather limited).
+(the parser that this program employs is rather stupid).
 
 =head2 Unicode character 0xNUMBER is illegal
 
@@ -1326,7 +1332,8 @@
 supposed to be in the page and should be fixed, but otherwise this issue
 is not supposed to affect the functionality of this program significantly.
 
-This was reported as a MediaWiki bug: L<https://bugzilla.wikimedia.org/show_bug.cgi?id=14600>
+This was reported as a MediaWiki bug:
+L<https://bugzilla.wikimedia.org/show_bug.cgi?id=14600>
 
 =head1 EXIT STATUS
 
@@ -1353,7 +1360,7 @@
 
 =head1 DEPENDENCIES
 
-This module depends on these CPAN modules:
+This module requires these CPAN modules:
 
 =over
 
@@ -1374,28 +1381,35 @@
 
 This module is used for transliterating filenames to ASCII.
 
+=item * C<Readonly>
+
+To make Perl::Critic happy :)
+
 =back
 
 =head1 HACKING
 
 =head2 Perl 5.10
 
-This program needs Perl 5.10. It has clean, new and useful syntax, which
+This program requires Perl 5.10. It has new clean and useful syntax, which
 makes the programs easier to hack, maintain and debug. It is useless to try
 and run it on an older version, unless you want to waste your time
 backporting. Please upgrade your Perl installation if you still have 5.8 or
-something older.
+(horrors!) something older.
 
-=head2 Perl Best Practices and Perl::Critic
+=head2 Perl Best Practices, Perl::Critic and perltidy
 
 Great effort has been put into making this source code pass as cleanly as
-possible the Perl::Critic tests in the 'brutal' mode. If you modify it, do
-yourself a favor, install Perl::Critic and regularly test it using this command:
+possible the Perl::Critic tests in the 'brutal' mode. It also uses perltidy
+for automatic code formatting. If you modify it, do yourself a favor, install
+Perl::Critic and regularly test it using this command:
 
-perlcritic -brutal prepare_noiw_list.pl
+./tidy.sh
 
-All places where P::C has been disabled using "# no critic" are explained.
+It checks the syntax, runs perltidy on the code and runs Perl::Critic.
+All the places where P::C has been disabled using "# no critic" are explained.
+
 The time invested in making the code P::C-friendly will be returned as time
 saved on debugging. Also consider reading the book "Perl Best Practices" by
 Damian Conway if you have not already.
@@ -1407,9 +1421,9 @@
 This program works best on GNU/Linux, where Perl and the filesystem are
 Unicode-friendly.
 
-This program was tested on Windows with ActivePerl 5.10 and Cygwin Perl 5.10.
-In both cases Unicode-related issues cause filenames and clipboard text
-to become jumbled.
+This program was also tested on Windows XP and Vista with ActivePerl 5.10
+and Cygwin Perl 5.10. In these Unicode-related issues caused filenames
+and clipboard text to become jumbled. You have been warned.
 
 =head1 BUGS AND LIMITATIONS
 
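
For readers of this commit: the four cases listed in the DESCRIPTION hunk can be read as a small decision function. The following is only a sketch of that logic as the POD words it, not the code in prepare_noiw_list.pl. The name classify_page, its arguments and the returned hash keys are all hypothetical, and the handling of a template that names no types is an assumption, since the POD does not cover that case.

```perl
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper, NOT taken from prepare_noiw_list.pl.
#   $has_links    - true if the page links to the defined languages
#   $has_template - true if the page carries the "no interwiki" template
#   $types        - array ref of types named in the template (may be empty)
sub classify_page {
    my ($has_links, $has_template, $types) = @_;
    $types //= [];

    if ($has_links) {
        # Links and template: log it so the template can be removed.
        # Links and no template: nothing to do, processing stops.
        return $has_template
            ? { action => 'log_for_template_removal' }
            : { action => 'skip' };
    }

    # No links and no template: goes to type "other".
    # ASSUMPTION: a template that names no types is also treated as "other";
    # the POD does not describe this case.
    if (!$has_template || !@{$types}) {
        return { action => 'list', types => ['other'] };
    }

    # No links and a template with types: added to each defined type.
    return { action => 'list', types => [ @{$types} ] };
}

my $result = classify_page(0, 1, ['biology', 'history']);
say join q{,}, @{ $result->{types} };    # prints: biology,history
```

The `//=` operator and `say` are among the Perl 5.10 features that the HACKING section refers to.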