From: Brian <Bri...@co...> - 2009-01-23 21:02:06
This howto really applies to all parser functions, but it's very useful for Semantic MediaWiki, and it solved a difficult problem I was having with the utter slowness of pyWikipediaBot.

First, you'll want mwclient:

https://mwclient.svn.sourceforge.net/svnroot/mwclient/trunk/mwclient/

mwclient uses the MediaWiki API directly, with low overhead. pyWikipediaBot is slow because a) it uses BeautifulSoup and b) it uses it extensively, every single time you perform an action.

You'll need to patch mwclient. Paste this code at the very end of client.py, indented so that it becomes a method of the Site class (the calls below rely on site.parse):

    def parse(self, text, title = None):
        # Only send a title to the API if the caller supplied one
        kwargs = {}
        if title is not None: kwargs['title'] = title
        result = self.api('parse', text = text, **kwargs)
        return result['parse']

Now you have direct access to the MediaWiki parser. You start up mwclient like so:

    from sys import path
    path.append('/usr/local/mwclient') # or wherever you put it
    import client as mwclient
    site = mwclient.Site('www.mysite.com', path='/myWikiLocation/')
    site.login('username','password')

And now you can execute parser functions and retrieve their results directly:

    result = site.parse('{{#ask: [[author::~*Reilly*]] | ?title }}')

It's still not as fast as we'd like, but much faster than with pyWikipediaBot:

    from time import time
    start=time()
    site.parse('{{#ask: [[author::~*Reilly*]] | ?title }}')
    print "Parse took", time()-start, "seconds"

    Parse took 4.50420308113 seconds
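If you want to time more than one query, the measurement can be wrapped in a small helper. This is only a minimal sketch, assuming a site object that has the patched parse() method from this post. The names timed_parse and FakeSite are hypothetical, and FakeSite simply echoes its input so the example runs without a wiki. The dictionary it returns is made up for illustration and is not the real API response.

```python
from time import time


def timed_parse(site, wikitext, title=None):
    """Call site.parse() and return (result, elapsed_seconds).

    `site` is assumed to be any object with the patched
    parse(text, title) method described in this post.
    """
    start = time()
    result = site.parse(wikitext, title)
    return result, time() - start


class FakeSite(object):
    """Hypothetical stand-in for a patched mwclient Site.

    It never touches the network: it just echoes the wikitext back
    in a made-up dictionary so the helper can be tried offline.
    """

    def parse(self, text, title=None):
        return {'text': {'*': text}}


query = '{{#ask: [[author::~*Reilly*]] | ?title }}'
result, elapsed = timed_parse(FakeSite(), query)
print("Parse took %.2f seconds" % elapsed)
```

With a real wiki you would pass the logged-in site object in place of FakeSite().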