all great advice, thanks!
yeah, my jobqueue does get ridiculous, because I'm using externalData calls, and properties that automatically create pages, which have their own externalData calls, etc etc. All managed basically through the refreshLinks and runJobs maintenance scripts. So I've got the job rate set to 0, and I've got scheduled jobs running to take care of that. Right now I'm running refreshLinks and runJobs in multiple perpetual loops, hoovering up more external data. Then occasionally I stop those processes, to see how the performance is doing.
So I guess running those maintenance scripts is going to cause some stress that affects the performance of page loads, and general mysql access, no? Once my dataset is a little more stable, I'll reduce the maintenance script frequency, and be able to do some more profiling.

refreshLinks2.php : what's that about? I don't see that in my codebase; maybe it's a version thing?

I didn't realize MW had its own profiling framework. I'll have to dig into that for sure. I had been using xdebug and wincachegrind, to some effect.

Also, could you explain a bit more about "one of the tables in my database was created with the latin1 charset, while the rest were in binary, which made the use of indexes useless."

I've got tables with collations: binary (InnoDB), latin1_swedish_ci (MyISAM), and a couple in utf8_bin (MyISAM). How does this make the indexes not work?

Sorry if any of these questions have been covered elsewhere. Feel free to tell me to google it, or toss a link my way, if you want.

thanks again!

Don Undeen

From: Thomas Fellows <>
To: don undeen <>
Cc: smw list <>
Sent: Mon, March 1, 2010 4:48:26 PM
Subject: Re: [SMW-devel] optimizing SMW

Hey -

Something you can try: what got me out of a large SMW timeout issue turned out to be really just a large MW issue, and MW's profiling turned up the answer.

In my case, the Job Queue was what was killing my performance - templates that affected 100,000+ pages were trying to get run off the queue (refreshLinks2), and that just ended in a timeout for the user.  Turning the job rate to 0 solved it. (You have to set up a cron job to run the queue overnight.)
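
A minimal sketch of the overnight queue run described above, written as a small Python wrapper that a scheduler (cron, or Task Scheduler on Windows) could call. It assumes "job rate" here means `$wgJobRunRate = 0` in LocalSettings.php, that `php` is on the PATH, and that your MediaWiki version's `maintenance/runJobs.php` accepts `--maxjobs`; the function names, the wiki path, and the batch size are placeholders for illustration, not anything from this thread.

```python
import subprocess
from pathlib import Path


def build_run_jobs_command(wiki_root, max_jobs=500, php="php"):
    """Return the argv list for one bounded pass of runJobs.php.

    wiki_root is the MediaWiki install directory (hypothetical path).
    max_jobs caps the batch so one run cannot monopolise the database.
    """
    script = Path(wiki_root) / "maintenance" / "runJobs.php"
    return [php, str(script), "--maxjobs", str(max_jobs)]


def run_batch(wiki_root, max_jobs=500):
    """Run one batch of queued jobs and return the process exit code."""
    command = build_run_jobs_command(wiki_root, max_jobs)
    return subprocess.run(command, check=False).returncode


# Example call from a scheduled task (path is an assumption):
#   run_batch("/path/to/wiki", max_jobs=500)
```

Capping each run with a batch size, rather than looping forever, is one way to keep the queue draining while leaving the database some headroom for page loads.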

In your case, check out how long everything is taking using the profiler, will be easier to pinpoint the time hog this way, though I'm sure others might have better suggestions.  As far as MySQL query optimisation, there are lots of good articles on the "explain" syntax out there.  Another problem I encountered (though assuredly rare) was that one of the tables in my database was created with the latin1 charset, while the rest were in binary, which made the use of indexes useless.  The explain command turned that one up for me.
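
A hedged note on the charset point, plus a small sketch for spotting the odd table out. The likely mechanism (a general MySQL behaviour, not something confirmed for this particular wiki) is that when a join or comparison involves columns with different character sets, MySQL has to convert one side before comparing, and an index on the converted column usually cannot be used. The sketch below assumes you have already fetched (table, collation) pairs yourself, for example with `SELECT TABLE_NAME, TABLE_COLLATION FROM information_schema.TABLES WHERE TABLE_SCHEMA = 'your_db'`; the function names and the sample table `my_cache` are made up for illustration.

```python
from collections import defaultdict


def group_by_collation(rows):
    """Map each collation to the list of tables that use it."""
    groups = defaultdict(list)
    for table, collation in rows:
        groups[collation].append(table)
    return dict(groups)


def odd_tables(rows):
    """Return tables whose collation differs from the most common one."""
    groups = group_by_collation(rows)
    if not groups:
        return []
    # Treat the collation used by the most tables as the expected one.
    majority = max(groups, key=lambda name: len(groups[name]))
    return sorted(
        table
        for collation, tables in groups.items()
        if collation != majority
        for table in tables
    )


# Sample input shaped like the information_schema query result.
sample = [
    ("page", "binary"),
    ("pagelinks", "binary"),
    ("smw_ids", "binary"),
    ("my_cache", "latin1_swedish_ci"),  # hypothetical table name
]
print(odd_tables(sample))
```

Any table this flags is worth checking with EXPLAIN on the slow queries that touch it; a mixed-collation table is only a problem if it is actually joined or compared against the others.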

Hope it was at least a little bit helpful, and good luck


On Mon, Mar 1, 2010 at 4:26 PM, don undeen <> wrote:
hi all,
I've got a semantic mediawiki installation with about 100,000 pages and growing,  236k rows in smw_ids, and 646k rows in pagelinks

running on Windows Server 2008,
MediaWiki 1.13.5
PHP 5.3.1
MySQL 5.1.41
SMW 1.4.2
SMWHalo 1.4.5

I'm getting to the point where page loads are starting to be pretty slow sometimes, and occasionally timeout.

Granted, I'm using lots of external data calls, and those calls cause new pages to be created in the background, and those new pages beget more new pages, etc etc. So there's a spidering growth going on as well. I'm doing plenty of caching of my service calls, using memcached.

obviously it's a sort of complicated setup, and I'm noticing that even normal queries of the db (using phpmyadmin) are taking quite a while.

I don't have a lot of experience with db optimization; I'm wondering if there's anything that you guys do to your wiki to make it run better, any defaults I can change, indexes to create, etc (I did add an index on a temp table being created in code, and that helped in one area, so I know things like that can be done).

Also, are there any good tools you use for profiling either the php or the mysql?

I've used xdebug and wincachegrind for php profiling; has anyone tried MonYog for mysql profiling? And other tools you like that I can use in a windows env?
Or maybe just some general pointers on what to look for when trying to improve performance?

I know this is vague; maybe there's a good thread/link out there already for this topic? I haven't seen it.

thanks for all your help and hard work!

don undeen
Metropolitan Museum of Art

Semediawiki-devel mailing list