Bigdata query optimizer

  • feugen24

    feugen24 - 2014-03-18

    While testing various scenarios I loaded the db with ~7M triples and noticed that the default static optimizer fails in simple cases, meaning that it rearranges the query so that it joins far more data than needed.
    For example, given an inner query that binds a ?s variable to 3 values, if I want all triples for those subjects (~10 triples) and join with ?s ?p ?o, it takes a few minutes (and 2 GB of RAM), but with the optimizer set to "None" or "Runtime" it takes just a few milliseconds, as it should.
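    A minimal sketch of the shape of the query (the type IRI and LIMIT are hypothetical stand-ins for whatever binds ?s to 3 values):

    ```sparql
    SELECT ?s ?p ?o
    WHERE {
      # inner query binds ?s to 3 values
      { SELECT ?s WHERE { ?s a <http://example.org/SomeType> } LIMIT 3 }
      # outer pattern should only touch ~10 triples for those subjects
      ?s ?p ?o .
    }
    ```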

    Anyone evaluating the db will get the impression of poor performance. I also tried to find a config setting that changes the default optimizer, since for now I have to set it to "None" in every query.
    So, can I set a default?
    In general, should performance be better with the static optimizer, so I should try to use it?
    Should a bug be reported for situations like this? (I have the impression the query is so simple that it should work fine with any optimizer.)
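    For reference, setting the optimizer per query is done with Bigdata's query-hints vocabulary; a sketch of disabling the static join optimizer for a single query:

    ```sparql
    PREFIX hint: <http://www.bigdata.com/queryHints#>
    SELECT ?s ?p ?o
    WHERE {
      # disable the static join optimizer for this query only
      hint:Query hint:optimizer "None" .
      ?s ?p ?o .
    }
    ```

    Having to repeat this magic triple in every query is exactly the inconvenience described above, hence the request for a store-wide default.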


  • feugen24

    feugen24 - 2014-03-19

    No BIND usage.
    No data set is required because the query is quite generic: it just filters on a subject, so any data set will do, and as the db grows the query will only get slower.

    I have attached a zip with the EXPLAIN HTML output of two executions so you can see the query and the execution plan.
    Notice that the LIMIT keyword makes the query run a bit faster because the plan changes, but it should not be necessary.
    Changing the optimizer to "None" gives an instant response.

    I'm using quad mode.
