sc2 battle.net spider Code
Player spider for Starcraft 2 battle.net data
Brought to you by: david_muffley
Hello <SUBJECT NAME HERE>! If you are reading this, then I am^W^W you have obtained the files for this project!

Core functionality only requires that you have python (2.6 <= YOURVERSION < 3.0) installed. Graphs require pygame (pygame.org). Several speedups are possible (mostly in saving and loading) with psyco (tested with psyco 2.0). The extra stuff (such as data packing and up_sc2peel) requires personal libraries I haven't doled out yet. :\

(DETAIL: the 'with' statement is used in several areas, and so python2.5 plus "from __future__ import with_statement" may be enough to have it work, but I have not tested this.)

Starting data for each locale is included in the main package files: enough people to cover ~50% of the divisions per locale. Your first runthrough will take about twice as long as a normal update, which is quite a bit of time depending on which locale you choose (US has ~30,000 divisions as I write this!), so be prepared to wait.

Running sc2peel.py will allow you to update a locale. If you want to do anything else with the data, you must run sc2peel.py in some sort of IDE, or make another .py file to do things for you. rank() is a commonly used function, and the graphs ( graph_*() ) make some pretty (value of pretty depends on the user) pictures to look at.

ps there's no real documentation

==============================================================================

13 June 2012: I've added some simple logic so that perdiv no longer needs fine-tuning when new seasons start. Also, with the 'fails' setting, when a server is down (or just after that number of consecutive failures to get a division), the run will exit without saving. All that I still need to do myself is modify the time between updates and decide when to call make_new_season().

2 January 2012: I've greatly modified how this program handles getting a division when the player used to get it is no longer in it (the "This player has not been ranked as part of the listed season and ladder" case).
Rather than hanging the whole process while players are tried one by one until a suitable one is found or there aren't any left, it now kicks back to the main loop of getting web data and puts the next request for that division at the head of the queue. Rather than seeing "trying NUMBER...", you will see "tried+failed DIVISION PLAYER (will try ANOTHER PLAYER soon)".

7 August 2011: Season 3 merged several servers, and caused problems with how I store player data. Each player is given an index (number), and this is unique across (old) servers. LA, RU, and TW users are given a secondary identifier, a "2" where all the others have a "1". This is still in place, and it allows the same server-based index to belong to multiple people now on US (to include previous LA players!). Since my main players hash table and team identifiers (an unfortunately complex process I use to save memory) go off that first index, I was in trouble. So I now limit player indexes to (2**31)-1: if there's a sub-index of "2", 2**31 is added to the index, and this is handled anywhere that number needs to be broken back into the smaller index and a "2". Hope they never put a 3, as I'd like to stay in 4 bytes!

# April 2011: savefiles are 5-11% smaller

3 April 2011: Finally working with Season 2, and for new seasons as they come in the future (sort of).

24 March 2011: Everything with patch 1.3 is now updated. Name changes are finally done without hassle, and the Division parsing I mentioned yesterday WAS in fact my problem, and has been fixed.

23 March 2011: With patch 1.3, Blizzard has removed the "loss" data for all teams not in Master league. Currently losses for such teams are stored as 0, which could cause problems for some winrate analysis. Also, my parser isn't picking up any new divisions. While Blizzard mentioned that Divisions would be locked, I cannot be 100% sure that the changes to the Division page did not break my "get new divisions" code.
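
(Illustration for the 7 August 2011 entry: a minimal sketch of how a server index and its "1"/"2" sub-index could be folded into one 4-byte number and split back apart. The names pack_player, unpack_player, SUB_INDEX_OFFSET and MAX_INDEX are made up for this example and are not taken from sc2peel.py; only the 2**31 offset and the (2**31)-1 limit come from the notes above.)

```python
# Hypothetical helper names; only the 2**31 arithmetic is from the notes.
SUB_INDEX_OFFSET = 2 ** 31         # added when the sub-index is "2"
MAX_INDEX = SUB_INDEX_OFFSET - 1   # largest server index allowed


def pack_player(index, sub_index):
    """Fold (index, sub_index) into a single number below 2**32."""
    if not 0 <= index <= MAX_INDEX:
        raise ValueError("index out of range: %r" % (index,))
    if sub_index not in (1, 2):
        raise ValueError("sub-index must be 1 or 2: %r" % (sub_index,))
    if sub_index == 2:
        return index + SUB_INDEX_OFFSET
    return index


def unpack_player(packed):
    """Break a packed number back into (index, sub_index)."""
    if packed >= SUB_INDEX_OFFSET:
        return packed - SUB_INDEX_OFFSET, 2
    return packed, 1
```

With this layout, unpack_player(pack_player(i, s)) gives back (i, s) for any valid pair, and a sub-index of "3" would not fit without going past 4 bytes.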

10 March 2011: Loading and saving are a bit slower, due to a change inside of Team<>s. Teammates are no longer stored as a tuple of ints, but rather as a single long. The actual difference in memory is not very big (something like 33MB for `us` or approximately 1 byte per team), but python seems to run into some problems in garbage collecting tuples. Perhaps this is due to using them as keys in the teams container (all[3]), I don't know. Anyhow, Team<> objects now have an 'id' attribute (the long), and 'teammates' is a property giving the >>list<< of player indexes that was used before. The end result is that during the longer updates, memory usage does not balloon nearly as much as it used to.
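
(Illustration for the 10 March 2011 entry: a rough sketch of a Team object that keeps its teammates as one integer 'id' and rebuilds the list through a 'teammates' property. The bit layout here, 32 bits per player index plus a separate 'size' attribute, is an assumption made for the example; the real packing in sc2peel.py is not described in these notes and may well differ.)

```python
FIELD_BITS = 32                      # assumed width of one player index
FIELD_MASK = (1 << FIELD_BITS) - 1


class Team(object):
    """Sketch only: teammates live in a single integer, not a tuple."""
    __slots__ = ("id", "size")       # 'size' is a hypothetical extra field

    def __init__(self, teammates):
        packed = 0
        for index in teammates:
            if not 0 <= index <= FIELD_MASK:
                raise ValueError("index does not fit: %r" % (index,))
            # Shift earlier players up and append this one at the bottom.
            packed = (packed << FIELD_BITS) | index
        self.id = packed
        self.size = len(teammates)

    @property
    def teammates(self):
        """Rebuild the list of player indexes from the packed id."""
        packed = self.id
        indexes = []
        for _ in range(self.size):
            indexes.append(packed & FIELD_MASK)
            packed >>= FIELD_BITS
        indexes.reverse()            # fields come out last player first
        return indexes
```

For example, Team([7, 9]).teammates gives back [7, 9], while the object itself holds only the integer id (and, in this sketch, the team size).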