From: Chris B. <ch...@bi...> - 2006-04-12 15:51:10
Hi Markus,

> We consider using RAP as a quadstore for Semantic MediaWiki (see
> http://wiki.ontoworld.org).

Interesting.

> In the long run, we are interested in inferencing, but for now
> Wikipedia-size scalability is most important.

Sorry, to my knowledge there are no systematic comparisons of the
performance of RAP with other RDF toolkits.

We did some relatively unsystematic performance testing when we
implemented different features, but the results are outdated by now.

Sören Auer and Bart Pieterse (both cc'ed) have used RAP in bigger
projects, and I guess they are the best sources for practical experience
with the performance of RAP on bigger real-world datasets.

My general impression is that, as PHP itself is still slower than
languages like Java or C, RAP is also slow and its performance cannot be
compared with toolkits like Jena or Sesame. Sören might disagree with me
on this point.

> Are there recent evaluations concerning the performance of the
> different storage models? In particular, we are interested in
> scalability of the following functions:
>
> 1 SPARQL queries:
> 1.1 general performance

Around one second for a medium-complex query against a data set with
100 000 triples in memory, much slower if the data set is in a database.
Tobias Gauss can give you details.

A PHP alternative for SPARQL queries against data sets which are stored
in a database is Benjamin's appmosphere toolkit
http://www.appmosphere.com/pages/en-arc. He does smarter SPARQL to SQL
rewriting than RAP, so it should theoretically be faster.

> 1.2 performance of "join-intensive" queries (involving long chains of
>     triples)
> 1.3 performance of datatype queries (e.g. selecting/sorting results by
>     some xsd:int or xsd:decimal)
> 1.4 performance for partial result lists (e.g. getting only the first
>     20)
> 2 simple read access (e.g. getting all triples of a certain pattern or
>   RDF dataset)

OK with models up to 100 000 triples. Don't know about bigger models.
Sören?

> 3 write access
> 3.1 adding triples to an existing store
> 3.2 deleting selected triples from the store

Should be OK. I think Sören implemented some workarounds for bulk
updates.

> 4 impact of RDF dataset features/named graph functionality

About 5% slower than operations on classic RDF models.

> For inclusion in Wikipedia, dealing with about 10 million triples split
> into 1 million RDF datasets is probably necessary.

Too much for RAP, too much for appmosphere (Benjamin?), and I guess even
hard for Jena, Redland and Co. if the queries become more complicated.

> We are working on useful update and caching strategies to reduce access
> to the RDF store, but a rather high number of parallel requests still
> is to be expected (though normal reading of articles will not touch the
> store). It would also be possible to restrict to certain types of
> queries if this leads to improved performance.
>
> We currently use RAP as an RDF parser for importing ontologies into
> Semantic MediaWiki. For querying our RDF data, we consider reusing an
> existing triplestore such as Redland or RAP, but also using SQL queries
> directly. Java toolkits are not an option since Wikipedia requires the
> use of free software (and free Java implementations probably don't
> support current RDF stores).

If "current RDF stores" means Named Graph stores, then you could use a
combination of Jena and NG4J. Jena is BSD-licensed and supports SPARQL.
NG4J adds an API for manipulating Named Graph sets.

See: http://www.wiwiss.fu-berlin.de/suhl/bizer/ng4j/

> I can imagine that one can already find performance measures for RAP
> somewhere on the web -- sorry if I missed this.

Not that I know of. But all efforts in that direction are very welcome.

Cheers

Chris

> Best regards,
>
> Markus
>
> --
> Markus Krötzsch
> Institute AIFB, University of Karlsruhe, D-76128 Karlsruhe
> ma...@ai...
> phone +49 (0)721 608 7362
> www.aifb.uni-karlsruhe.de/WBS/
> fax +49 (0)721 693 717

--
Chris Bizer
Freie Universität Berlin
Phone: +49 30 838 54057
Mail: ch...@bi...
Web: www.bizer.de