From: Ivan M. <imi...@op...> - 2009-02-27 15:52:50
Hello Armin,

1. I'd strongly advise placing a FROM clause in the query whenever possible. Since you created your big graph yourself, you know its name. This Wednesday one customer accelerated his application by a factor of 500(!) just by adding a forgotten FROM. If you really need to run queries with an unbound graph, you probably ought to create additional indexes on the RDF_QUAD table.

2. I'd recommend keeping ResultSetMaxRows less than 2000000000.

3. The timeout value comes from the [SPARQL] section of virtuoso.ini, as the value of MaxQueryExecutionTime. If that is not set, the value of ExecutionTimeout is used. If that one is not set either, no timeout is applied at all. Nevertheless, an HTTP client may decide that it has waited too long and drop the connection without paying much attention to any settings on the other end :)

4. In order to save big amounts of data, consider using the string_to_file function, like

string_to_file ('myfile.ttl', (sparql define output:format "TTL" construct {...} from <...> where {...} ), -2);

Best Regards,
Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com

On Thu, 2009-02-26 at 15:36 +0100, Armin Nagel wrote:
> Hello users,
>
> I have loaded some dbpedia parts into virtuoso merged to just one graph.
> All works fine.
>
> Now I have the idea to extract via sparql-construct all data belong to
> resources of type person.
>
> This is my sparql construct:
>
> define output:format "TTL"
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> CONSTRUCT { ?s ?p ?lo. ?p rdfs:label ?lp.}
> WHERE { ?s a <http://dbpedia.org/ontology/Person>.
> { ?s ?p ?lo. FILTER isLiteral(?lo) }
> UNION { ?s ?p ?o. ?o rdfs:label ?lo. FILTER isIRI(?o) }
> ?p rdfs:label ?lp.};
>
> Explained:
> I collect all information to any resource of type person.
> The resource must be subject. If it is linked with another resource like
> a place, I collect the label of it too.
>
> I tried different ways. The construct works for type actor over
> httpendpoint, but for person not all data is in the result. I think
> virtuoso breaks if the query takes to much time.
>
> My virtuoso.ini:
> ;
> ; Server parameters
> ;
> [Parameters]
> ServerPort = 1111
> DisableUnixSocket = 1
> ;SSLServerPort = 2111
> ;SSLCertificate = cert.pem
> ;SSLPrivateKey = pk.pem
> ;X509ClientVerify = 0
> ;X509ClientVerifyDepth = 0
> ;X509ClientVerifyCAFile = ca.pem
> ServerThreads = 20
> CheckpointInterval = 60
> O_DIRECT = 0
> NumberOfBuffers = 400000
> MaxDirtyBuffers = 1200
> CaseMode = 2
> MaxStaticCursorRows = 5000
> CheckpointAuditTrail = 0
> AllowOSCalls = 0
> SchedulerInterval = 10
> DirsAllowed = ., /foo/vad, /bar/dbpedia,
> ThreadCleanupInterval = 0
> ThreadThreshold = 10
> ResourcesCleanupInterval = 0
> FreeTextBatchSize = 100000
> SingleCPU = 0
> VADInstallDir = /moo/vad/
> PrefixResultNames = 0
>
> [SPARQL]
> ;ExternalQuerySource = 1
> ;ExternalXsltSource = 1
> ResultSetMaxRows = 9223372036854775807
> DefaultGraph = http://neofonie.de/dbpedia_3_2
> ;ImmutableGraphs = http://localhost:8890/dataspace
> ;MaxQueryCostEstimationTime = 120 ; in seconds
> ;MaxQueryExecutionTime = 10 ; in seconds
> ;PingService = http://rpc.pingthesemanticweb.com/
> DefaultQuery = select * where { ?s ?p ?o . } Limit 100
>
> I tried over isql with following command:
> /isql 1111 foo bar test.sql > test_r.nt
>
> test.sql contains:
> sparl
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> CONSTRUCT { ?s ?p ?lo. ?p rdfs:label ?lp.}
> WHERE { ?s a <http://dbpedia.org/ontology/Person>.
> { ?s ?p ?lo. FILTER isLiteral(?lo) }
> UNION { ?s ?p ?o. ?o rdfs:label ?lo. FILTER isIRI(?o) }
> ?p rdfs:label ?lp.};
>
> It' breaks with warning
> Warning 01004: [Virtuoso Driver]CL077: Data truncated in column 1 of the
> result-se(callretRDF/XML-0, type 125)
>
> Is there any way to extract such huge data without breaking virtuoso?
>
> Kind regards
>
> Armin Nagel
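
A note on points 2 and 3 of the reply: both settings live in the [SPARQL] section of virtuoso.ini. In the quoted configuration, ResultSetMaxRows is far above the suggested ceiling and MaxQueryExecutionTime is commented out. A minimal sketch of an adjusted section might look like the following. The concrete numbers (1000000 rows, 600 seconds) are illustrative assumptions only, not values recommended anywhere in the thread.

```ini
[SPARQL]
; Assumed value for illustration; the reply only asks for less than 2000000000.
ResultSetMaxRows      = 1000000
DefaultGraph          = http://neofonie.de/dbpedia_3_2
; Assumed value for illustration, in seconds.
MaxQueryExecutionTime = 600
```

As the reply points out, an HTTP client may still give up and drop the connection before a server-side timeout is reached, so these settings bound the server's behaviour but not the client's.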