From: nischay n. <nis...@gm...> - 2012-03-29 13:03:00
|
Hi,

By logging them to a file, I looked at the SQL queries that run on the SMW tables when a page is edited. Every time a page is saved after editing we write many things to the database, which hurts performance on the user side. Markus suggested that before we run any query we should check, using a hash, whether we ran the same query last time, and perform the query only if it is new, to prevent too many writes. However, I think this still leads to slow performance, because we will still be parsing the page for SMW properties every time. Wouldn't it be better not to parse the page on edits, but instead mark the page as "edited recently" and later run a cron job on the server to parse it and update the SMW properties? This would lead to a single update even if the page has been edited more than five times in a minute, keeping the same load on the server but saving a lot of processing time on the user side.

Another concern of mine is that we perform lots of DELETE and INSERT operations on the database when editing a page, even if nothing has changed; in simple words, we delete and rewrite all properties on that page. I don't think this is necessary. I may be wrong here; I am still new to SMW, so my apologies for that.

--
With Regards
Nischay Nahata
B.tech 3rd year
Department of Information Technology
NITK, Surathkal
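For concreteness, the hash check Markus proposed could be sketched roughly like this. This is only an illustration of the idea, not actual SMW code; every function name below is a hypothetical placeholder.

```php
<?php
// Sketch of the hash-check idea: skip the DELETE/INSERT cycle when the
// page's semantic data has not changed since the last save.
// loadStoredHash(), rewritePropertyTables() and storeHash() are
// hypothetical placeholders, not real SMW API.

function updateSemanticData( $pageId, array $properties ) {
	// Serialize the parsed property set deterministically and hash it.
	ksort( $properties );
	$newHash = md5( serialize( $properties ) );

	// Compare against the hash stored at the previous save.
	$oldHash = loadStoredHash( $pageId );

	if ( $newHash === $oldHash ) {
		return; // nothing changed: no deletes, no inserts
	}

	rewritePropertyTables( $pageId, $properties );
	storeHash( $pageId, $newHash );
}
```

The page would still be parsed on every save, as Nischay notes; what this saves is the write traffic when the result of the parse is unchanged.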
From: Jeroen De D. <jer...@gm...> - 2012-03-29 13:57:52
|
Hey Nischay,

Great that you are looking into this. Someone really ought to look at the performance stuff, since there is a lot of low-hanging fruit there for SMW :)

> However, I think this still leads to slow performance because we will still
> be parsing the page for SMW properties every time.

The parsing is not that expensive, and it only happens on edits, which are far less common than views or re-renders of the page. And you can't really get rid of the parsing, since you need it to know if something changed. I'd just focus on this stuff:

> Another of my concern is that we perform lots of Delete and Insert Operations on
> the database when editing a page even if nothing has changed, In simple words
> we delete and rewrite all properties on that page.

So yes, this would definitely be something to look at. Although optimizing that would be nice, I guess the real performance issues people are having are the many SMW-related read queries that happen at page render, or even at page view. One thing that falls into the latter category is the Semantic Forms check of whether a page is in a certain namespace, used to decide if it should have a certain "edit with form" tab, which consists of several queries. Stuff like that could be cached using whatever cache is available to MediaWiki, so that all the needed info can be obtained with a single request to this cache. (Such caches can be things such as memcached, but if those are not available, it falls back to the db, still reducing the number of queries to one.)

You might already have done so, but I suggest you enable the debug toolbar:

$wgDebugToolbar = true;

This allows you to see which queries were run to show the current page to you, as well as their individual and total execution time :) And you might find other settings relevant to you here (a copy of some of the config I use): https://www.mediawiki.org/wiki/User:Jeroen_De_Dauw/LocalSettings.php

Cheers

--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
--
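The cached namespace check Jeroen describes could look roughly like this. A sketch only: the function names `sfPageNeedsFormTab` and `sfQueryNamespaceForForm` are hypothetical, while `wfGetCache()` and `wfMemcKey()` are the MediaWiki facilities discussed later in this thread.

```php
<?php
// Sketch: cache the "does this page need an 'edit with form' tab?" answer
// so repeated page views cost one cache lookup instead of several queries.
// sfPageNeedsFormTab() and sfQueryNamespaceForForm() are hypothetical.

function sfPageNeedsFormTab( Title $title ) {
	$cache = wfGetCache( CACHE_ANYTHING ); // memcached if configured, else the db
	$key = wfMemcKey( 'sf-form-tab', $title->getNamespace() );

	$hasForm = $cache->get( $key );
	if ( $hasForm === false ) {
		// Cache miss: run the (hypothetical) expensive check once.
		$hasForm = sfQueryNamespaceForForm( $title->getNamespace() ) ? 1 : 0;
		$cache->set( $key, $hasForm, 3600 ); // keep for an hour
	}

	return (bool)$hasForm;
}
```

Storing 1/0 rather than a bare boolean avoids confusing a cached "no" with the `false` that BagOStuff returns on a cache miss.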
From: nischay n. <nis...@gm...> - 2012-03-30 11:12:22
|
Hey,

> Great you are looking into this. Someone really ought to look at the
> performance stuff, since there is a lot of low hanging fruit there for SMW :)

I think the fruits are sweet :)

> Although optimizing that would be nice, I guess the real performance issues
> people are having are the many SMW related read queries that happen at page
> render, or even at page view. [...] Stuff like that could be cached using
> whatever cache is available to MediaWiki, so that all the needed info can be
> obtained using a single request to this cache.

I think we can implement caching in Semantic Forms and other places, as described here: http://www.mediawiki.org/wiki/Memcache#Using_memcached_in_your_code

We might also use APC for caching PHP code, as described in the first answer here: http://stackoverflow.com/questions/815041/memcached-vs-apc-which-one-should-i-choose

Both memcached and APC can run in parallel, so we might also consider using both.

> https://www.mediawiki.org/wiki/User:Jeroen_De_Dauw/LocalSettings.php

Thanks for this link.

--
With Regards
Nischay Nahata
B.tech 3rd year
Department of Information Technology
NITK, Surathkal
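Which backend the wiki actually uses is a LocalSettings.php matter; a sketch, assuming the relevant software (APC or a memcached daemon) is installed:

```php
<?php
// LocalSettings.php sketch: pick the main object cache backend.
// CACHE_ACCEL uses a PHP accelerator's data store (e.g. APC) when present.
$wgMainCacheType = CACHE_ACCEL;

// Or, with a memcached daemon running locally:
// $wgMainCacheType = CACHE_MEMCACHED;
// $wgMemCachedServers = array( '127.0.0.1:11211' );
```

Extension code that asks MediaWiki for "whatever cache is available" then transparently gets APC, memcached, or the db fallback, whichever the admin configured.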
From: Jeroen De D. <jer...@gm...> - 2012-03-30 14:36:40
|
Hey,

> Both Memcache and APC can run in parallel so we might also consider using both

Installing APC gives you a performance boost even if the app does not make any real effort to take advantage of the cache. I'm for a cache-type-agnostic approach, i.e. one where you just cache things wherever they should be cached according to the wiki's config and available caching tools. I recommend you have a look at wfGetCache in MediaWiki's GlobalFunctions.php, and the related functions that can be found near it. Unfortunately there do not appear to be any docs on this. I've been using wfGetCache( CACHE_ANYTHING ) a lot recently, which works with memcached when available, but is also very effective when it's not (when stuff is stored in the db).

The main questions for this project are probably not how to cache stuff (as MediaWiki already provides the facilities you need), but rather what to cache, when to cache it, and how to properly invalidate it (if needed).

An easy place to start would be SMW's special pages, such as Special:Properties, Special:Types and Special:SemanticStatistics (or something like that). There you can easily cache the HTML output; then when the page is hit again and is still in the cache, you just use the built-up HTML, sparing you the need to do any queries (except the single one obtaining the HTML) and the cost of rendering the page itself.

I wrote some generic utilities to cache special pages (for some other extension), which we might want to use here, although I think it'd be best if you did one page without them first, so you know how the stuff works, and can use it where those utilities are not helpful :)

Cheers

--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
--
From: Niklas L. <nik...@gm...> - 2012-03-30 15:25:06
|
On 30 March 2012 17:36, Jeroen De Dauw <jer...@gm...> wrote:

>> Both Memcache and APC can run in parallel so we might also consider using
>> both
>
> Installing APC gives you a performance boost, even if the app does not
> really make an effort to take advantage of the cache.

Agreed. Note, though, that installing APC is something that we as developers cannot do on users' behalf.

> The main questions for this project are probably not how to cache stuff (as
> MediaWiki already provides the facilities you need), but rather what to
> cache, when to cache it, and how to properly invalidate it (if needed).

Especially the invalidation is tricky.

> An easy place to start would be SMWs special pages such as
> Special:Properties, Special:Types and Special:SemanticStatistics (or
> something like that). There you can easily cache the HTML output, and then
> when the page is hit again, and is still in the cache, just use the built-up
> HTML, sparing you the need to do any queries (except the single one
> obtaining the HTML) and the cost of rendering the page itself.

Don't forget i18n. On particularly complex pages you might need to cache both the queried and processed data and the generated HTML output, for each language.

In particular, I'm most concerned about the buildup of queries on normal page views, which happen much more often than special page views; they add up and do a lot to decrease general snappiness.

-Niklas

--
Niklas Laxström
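Niklas's i18n point can be addressed by making the language part of the cache key, so each language gets its own copy of the generated HTML. A sketch; the key name and `buildLocalizedHtml()` are assumptions:

```php
<?php
// Sketch: include the user's language code in the cache key, giving one
// cache entry per language (e.g. ...:en, ...:de, ...:fi).
// buildLocalizedHtml() is a hypothetical localized page builder.

function getCachedSpecialPageHtml( IContextSource $context ) {
	$cache = wfGetCache( CACHE_ANYTHING );
	$langCode = $context->getLanguage()->getCode();

	$key = wfMemcKey( 'smw-special-html', $langCode );

	$html = $cache->get( $key );
	if ( $html === false ) {
		$html = buildLocalizedHtml( $context ); // hypothetical
		$cache->set( $key, $html, 3600 );
	}

	return $html;
}
```

On wikis with many interface languages this multiplies the cache footprint, which is why caching the language-neutral queried data separately from the rendered HTML, as Niklas suggests, can be worthwhile.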
From: Jeroen De D. <jer...@gm...> - 2012-03-30 16:55:01
|
Hey,

> In special I'm most concerned of the buildup of queries on normal page views,
> that happen much more often that special page views and they add up to lot to
> decrease general snappiness.

Yeah, definitely. I actually said this in one of my earlier mails, and did not mean to push SMW's special pages as a high priority. I simply think they are a good place to start (if you still need to learn this stuff), since caching them is relatively straightforward.

Cheers

--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
--
From: nischay n. <nis...@gm...> - 2012-03-31 01:09:23
|
On Fri, Mar 30, 2012 at 10:24 PM, Jeroen De Dauw <jer...@gm...> wrote:

> Yeah, definitely, I actually said this in one of my earlier mails, and did
> not mean to push SMWs special pages as high priority. I simply think they
> are a good place to start (if you still need to learn this stuff), since
> caching them is relatively straight forward.

Yes, it's better to start with something easy. I have already started working on caching special pages.

--
With Regards
Nischay Nahata
B.tech 3rd year
Department of Information Technology
NITK, Surathkal