On Dec 29, 2007 10:42 AM, Markus Krötzsch <mak@aifb.uni-karlsruhe.de> wrote:
On Monday, 17 December 2007, Sergey Chernyshev wrote:
> Thank you, Markus - it's a really good review! I wonder if there is any way
> to unify performance reporting for all SMW instances so we can compare the
> effects of large data sets, different system configs (e.g. disabled cache
> and so on) - I just looked at the profileinfo.php script, it might be an answer,
> actually.
> I wonder if a real Wikipedia data set (outdated, maybe) is going to be set
> up as a test-case for SMW to handle (with Semantic Templates, of course) -
> I was going to do that, but don't have resources for this. This might help
> to make the goal of "Semantic Wikipedia" more transparent.

In fact we have such a site, but it runs on rather unstable hardware (we
have a buggy RAID controller or driver :-(). It is our test server at
test.ontoworld.org , which also was used for other experiments and is not in
perfect shape right now (and querying was disabled in order to not impair
other experiments). We might set up another more recent Wikipedia copy
sometime in the future.

Great, I'll definitely take a look. What were your main discoveries with English WP and SMW? How many Semantic Templates did you make? I'd say that I'm mostly interested in volume and performance on relatively simple queries, since this is what most large projects would need, although complex queries are also interesting.

It would be quite interesting to see the latest WP with the latest SMW to understand the biggest pain points.

> Since we're talking about performance, there is another side of performance
> tuning - perceived performance, this mostly concerns JavaScript, CSS and
> so on - for example there is still a problem of SIMILE Timeline not being
> that fast to load (although performance of pages that don't have it has
> improved, now that the client-side code is loaded only on pages that need
> it). These kinds of issues can be tracked using Firebug with Yahoo's YSlow
> add-on.

True, and I hope Timeline is really the main performance problem there. I
wonder whether we could ship a more stripped down version of the scripts to
decrease load time. I guess we should ask the guys over at SAIL for that ...

Can you describe the modifications you made to the original Timeline code? I used it some time ago and took a closer look at how their code is bundled, so I might be able to help migrate it to the new version. BTW, it might make sense to keep SMW code separate from theirs to make such upgrades easier.

> I'll be happy to run the tests on a system with a significant amount of
> data if you need a testbed.

All profiling support is appreciated, but I am not sure how to operationalise
testing on our servers (SQL profiling would probably need server access,
which is not possible in this case). Insights on JavaScript performance are
also useful, but I guess that MySQL tuning could be most important for
approaching large sites. If you know about DB optimisation, you can also
have a look at our DB layout and at the SQL queries we generate.

Actually, I was talking about standardizing a test process that all of us can run on our servers to compare performance and settings, help each other derive best practices for performance, and maybe even find bottlenecks.