From: Sergey C. <ser...@gm...> - 2010-03-03 06:14:54
Don, I think you might want to look at enabling the parser cache (you probably don't have it enabled at this point). Pages will get a bit stale, but results will load much faster. Also, if you're not currently doing this, try using APC for opcode caching as well as for the object cache. This doesn't help with the huge footprint, but it helps with overall performance.

Thank you,

Sergey

--
Sergey Chernyshev
http://www.sergeychernyshev.com/

On Mon, Mar 1, 2010 at 6:17 PM, Thomas Fellows <tho...@gm...> wrote:

> Hey -- the different charsets, if I remember correctly, had to do with one table/index being in latin1 while the other was in binary, and there was a join statement that wasn't using an index (as found by EXPLAIN), even though it could find it. One index was in, say, latin1, and the other was in binary, and so it exploded when trying to convert 8 million+ rows from one to the other, as it couldn't do a direct compare. Really more of an issue with initial performance than scalability at your level, but try watching which queries are taking the longest in MySQL, run an EXPLAIN on the query, and see if it's using indexes; if it isn't, there's cause for concern.
>
> refreshLinks2 is just a job type (like htmlCacheUpdate or SMWUpdateJob) in MW for when you edit templates that affect large numbers of pages. They don't want the page to hang when you hit 'save page' by inserting hundreds of thousands of refreshLinks jobs into the job table. They also take a long time to run, depending on the template, the complexity of the change, etc., so if the job rate > 0 and a user accesses a page, it tries to run the job before returning anything to the user, and hence the page times out.
>
> select distinct(job_cmd) from job;
>
> You can see some other job types depending on what is in your queue at any given time. But since you have the job queue turned off (i.e. one isn't run every time a user accesses a page), I don't believe that is what is causing the slow page loads.
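(A minimal sketch of the caching setup Sergey describes, as it might look in LocalSettings.php. The `$wgMainCacheType`/`$wgParserCacheType` settings and `CACHE_*` constants are standard MediaWiki configuration; the expiry value and memcached address are just illustrations -- adjust to your own setup.)

```php
# LocalSettings.php -- caching sketch (values are illustrative)

# Use APC for the main object cache. Opcode caching happens
# automatically once the APC extension is loaded in php.ini.
$wgMainCacheType   = CACHE_ACCEL;

# Enable the parser cache so rendered pages are reused instead of
# re-parsed on every request; pages may be slightly stale.
$wgParserCacheType       = CACHE_ACCEL;
$wgEnableParserCache     = true;
$wgParserCacheExpireTime = 86400; # one day -- tune to taste

# If you'd rather keep the object cache in memcached:
# $wgMainCacheType    = CACHE_MEMCACHED;
# $wgMemCachedServers = array( '127.0.0.1:11211' );
```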
> Definitely look into the MW profiling capability, and also look into how to add profiling to functions etc., as some extensions might not have it by default. It's fairly simple to add it in and time how long everything is taking.
>
> -Tom
>
> On Mon, Mar 1, 2010 at 5:07 PM, don undeen <don...@ya...> wrote:
>
>> All great advice, thanks!
>> Yeah, my job queue does get ridiculous, because I'm using External Data calls, and properties that automatically create pages, which have their own External Data calls, etc. etc. All managed basically through the refreshLinks and runJobs maintenance scripts. So I've got the job rate set to 0, and I've got scheduled jobs running to take care of that. Right now I'm running refreshLinks and runJobs in multiple perpetual loops, hoovering up more external data. Then occasionally I stop those processes to see how the performance is doing.
>> So I guess running those maintenance scripts is going to cause some stress that affects the performance of page loads, and general MySQL access, no? Once my dataset is a little more stable, I'll reduce the maintenance-script frequency and be able to do some more profiling.
>>
>> Questions:
>> refreshLinks2.php: what's that about? I don't see that in my codebase; maybe it's a version thing?
>>
>> I didn't realize MW had its own profiling framework. I'll have to dig into that for sure. I had been using xdebug and WinCacheGrind, to some effect.
>>
>> Also, could you explain a bit more about "one of the tables in my database was created with the latin1 charset, while the rest were in binary, which made the use of indexes useless"?
>>
>> I've got tables with collations: binary (InnoDB), latin1_swedish_ci (MyISAM), and a couple in utf8_bin (MyISAM). How does this make the indexes not work?
>>
>> Sorry if any of these questions have been covered elsewhere. Feel free to tell me to google it, or toss a link my way, if you want.
>>
>> Thanks again!
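(To illustrate the charset problem being discussed: when a join compares columns whose tables use different character sets, MySQL may have to convert one side row by row and can then no longer use that side's index. A hedged sketch -- `page`/`smw_ids` are real MediaWiki/SMW tables, but the exact slow query and the target charset are placeholders for your own situation; back up before altering tables.)

```sql
-- See which charset/collation each table actually uses:
SHOW TABLE STATUS;

-- Run EXPLAIN on one of your slow joins. If the 'key' column comes
-- back NULL for a large table, no index is used on that side:
EXPLAIN SELECT p.page_title
FROM page p
JOIN smw_ids s ON s.smw_title = p.page_title;

-- If the joined tables disagree on charset, converting them to a
-- common one lets the index be used again (illustrative only):
ALTER TABLE some_table CONVERT TO CHARACTER SET binary;
```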
>>
>> Don Undeen
>>
>> ------------------------------
>> *From:* Thomas Fellows <tho...@gm...>
>> *To:* don undeen <don...@ya...>
>> *Cc:* smw list <sem...@li...>
>> *Sent:* Mon, March 1, 2010 4:48:26 PM
>> *Subject:* Re: [SMW-devel] optimizing SMW
>>
>> Hey -
>>
>> Something you can try out that helped me out of a large SMW-timeout issue was actually really just related to a large MW issue, and MW's profiling turned up the answer.
>>
>> http://www.mediawiki.org/wiki/How_to_debug#Profiling
>>
>> In my case, the job queue was what was killing my performance - templates that affected 100,000+ pages were trying to get run off the queue (refreshLinks2), and that just ended in a timeout for the user. Turning the job rate to 0 solved it. (You have to set up a cron job to run the queue overnight.)
>>
>> In your case, check out how long everything is taking using the profiler; it will be easier to pinpoint the time hog this way, though I'm sure others might have better suggestions. As far as MySQL query optimisation goes, there are lots of good articles on the EXPLAIN syntax out there. Another problem I encountered (though assuredly rare) was that one of the tables in my database was created with the latin1 charset, while the rest were in binary, which made the use of indexes useless. The EXPLAIN command turned that one up for me.
>>
>> Hope it was at least a little bit helpful, and good luck.
>>
>> -Tom
>>
>> On Mon, Mar 1, 2010 at 4:26 PM, don undeen <don...@ya...> wrote:
>>
>>> Hi all,
>>> I've got a Semantic MediaWiki installation with about 100,000 pages and growing, 236K rows in smw_ids, and 646K rows in pagelinks,
>>>
>>> running on Windows Server 2008:
>>> MediaWiki 1.13.5
>>> PHP 5.3.1
>>> MySQL 5.1.41
>>> SMW 1.4.2
>>> SMWHalo 1.4.5
>>>
>>> I'm getting to the point where page loads are starting to be pretty slow sometimes, and occasionally time out.
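(A sketch of the profiling and job-rate changes described above, for a 1.13-era MediaWiki. The StartProfiler.php mechanism and wfProfileIn/wfProfileOut are standard MW, but class and file names shifted between versions, so check the How_to_debug page for your release; the hook function here is purely hypothetical.)

```php
# StartProfiler.php (in the wiki root) -- turns on MW's built-in
# profiler. Exact profiler class/path varies by MW version.
require_once( dirname( __FILE__ ) . '/includes/Profiler.php' );
$wgProfiler = new Profiler;

# In any extension function you want to time, bracket the body with
# wfProfileIn/wfProfileOut; __METHOD__ names the profiling section.
function smwfMySlowHook( $parser ) {  # hypothetical example function
	wfProfileIn( __METHOD__ );
	# ... the actual work being timed ...
	wfProfileOut( __METHOD__ );
	return true;
}

# LocalSettings.php -- stop jobs from running on page views, and run
# them from a scheduled task instead (php maintenance/runJobs.php):
$wgJobRunRate = 0;
```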
>>>
>>> Granted, I'm using lots of External Data calls, and those calls cause new pages to be created in the background, and those new pages beget more new pages, etc. etc. So there's a spidering growth going on as well. I'm doing plenty of caching of my service calls, using memcached.
>>>
>>> Obviously it's a sort of complicated setup, and I'm noticing that even normal queries of the db (using phpMyAdmin) are taking quite a while.
>>>
>>> I don't have a lot of experience with db optimization; I'm wondering if there's anything that you guys do to your wiki to make it run better - any defaults I can change, indexes to create, etc. (I did add an index on a temp table being created in code, and that helped in one area, so I know things like that can be done.)
>>>
>>> Also, are there any good tools you use for profiling either the PHP or the MySQL?
>>>
>>> I've used xdebug and WinCacheGrind for PHP profiling; has anyone tried MONyog (http://www.webyog.com/en/) for MySQL profiling? Any other tools you like that I can use in a Windows env?
>>> Or maybe just some general pointers on what to look for when trying to improve performance?
>>>
>>> I know this is vague; maybe there's a good thread/link out there already for this topic? I haven't seen it.
>>>
>>> Thanks for all your help and hard work!
>>>
>>> don undeen
>>> Metropolitan Museum of Art
>>>
>>> ------------------------------------------------------------------------------
>>> Download Intel® Parallel Studio Eval
>>> Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta.
>>> http://p.sf.net/sfu/intel-sw-dev
>>> _______________________________________________
>>> Semediawiki-devel mailing list
>>> Sem...@li...
>>> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel