The character encoding used in the XML file is somewhat confusing. First, I think the file as a whole is UTF-8 encoded (reading it as UTF-8 is what worked for me), but it would be better to state this explicitly in the XML declaration, e.g.:
<?xml version="1.0" encoding="UTF-8"?>
Then, within the file, several different encodings are used. For an example, see the entry on WikiProject Åland Islands. In the "name" and "page" attributes, the encoding seems to be plain UTF-8, using entities. The "id" attribute uses some other encoding, in which the first character shows up as non-displayable; this is consistent throughout the file, but it doesn't seem to be intended. Finally, the text content of the "Category" and "Statistics" tags, as well as the attributes of the "banner" tags, are URL-encoded, with spaces additionally replaced by underscores in the MediaWiki fashion.
In other entries, the underscore-for-space replacement is also used in the "page" attribute; see e.g. the entry with id="BASEBALL/LITTLE_LEAGUE_TASK_FORCE".
These are all minor issues, and I can work around them with a few lines of code when reading the file; still, they are confusing and make the file harder to re-use. I think using plain UTF-8 in all fields would make the file more consistent.
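For what it's worth, the kind of workaround I mean could look roughly like the Python sketch below. The function name normalize_field is made up for illustration, and I'm assuming that the percent-escapes encode UTF-8 bytes and that every underscore stands for a space; neither has been checked against every field in the file.

```python
import html
from urllib.parse import unquote


def normalize_field(value: str) -> str:
    """Turn a URL-encoded, underscore-separated field into plain text.

    Hypothetical helper: assumes the percent-escapes are UTF-8 bytes.
    """
    # Decode percent-escapes, e.g. "%C3%85" becomes "Å".
    decoded = unquote(value, encoding="utf-8")
    # Resolve leftover character entities such as "&#197;" (an XML parser
    # normally does this already, so this only matters for raw text).
    decoded = html.unescape(decoded)
    # MediaWiki writes the spaces in page titles as underscores.
    return decoded.replace("_", " ")


print(normalize_field("WikiProject_%C3%85land_Islands"))
```

Note that the underscore step would also rewrite any underscore that is really part of a name, so it should probably be applied only to the fields known to hold page titles.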