#5046 Slow query: pyforge.repo_commitrun query: { commit_ids: { $in: [...] } }

forge-oct-05
closed
Cory Johns
General
2
2014-08-20
2012-10-03
Cory Johns
No

The top three queries by time taken according to a recent parsing of the log are:

93252492ms: query pyforge.mailbox query: { queue: { $ne: {} }, type: "..." }
14015985ms: getmore pyforge.repo_commitrun query: { commit_ids: { $in: [...] } }
2529344ms: query pyforge.user query: { $query: { display_name: /.../i }, $orderby: { username: 1 } }

The second one comes from building the CommitRunDocs during a repository refresh.

Related

Tickets: #5049
Tickets: #5094

Discussion

  • Cory Johns
    2012-10-03

    The commit_ids field is indexed, and the index is being used, but when doing an $in against an array field, it appears that the way the index is used is roughly equivalent to the following mongo code:

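    var results = [];
    var seen = {};  // _id strings of runs already collected
    in_values.forEach(function(val) {
        var run = db.repo_commitrun.findOne({commit_ids: val});
        // `in` tests object keys, not array membership, so dedupe by _id
        if (run && !seen[String(run._id)]) {
            seen[String(run._id)] = true;
            results.push(run);
        }
    });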
    

    That is to say, the index is used effectively to locate the document for each value in the list of values passed to $in, but each of those documents is scanned when reducing the results to a unique set. See http://pastie.org/private/uro3wts6si44detlpe2mva for an example.
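
    The behaviour described above can be illustrated with a toy model that runs without a database. This is only a sketch of the description, not MongoDB's actual implementation: the sample documents and the names `buildMultikeyIndex` and `findIn` are hypothetical. It assumes one index entry per array element, one index probe per value passed to $in, and a de-duplication pass over every document that was hit.

```javascript
// Toy model only: NOT MongoDB's implementation. All names and data are hypothetical.
var docs = [
    { _id: 1, commit_ids: ['a', 'b', 'c'] },
    { _id: 2, commit_ids: ['d', 'e'] },
];

// A multikey index holds one entry per array element, each pointing at its document.
function buildMultikeyIndex(collection) {
    var index = Object.create(null);
    collection.forEach(function (doc) {
        doc.commit_ids.forEach(function (id) {
            (index[id] = index[id] || []).push(doc);
        });
    });
    return index;
}

// One index probe per $in value, then de-duplicate the hits by _id.
function findIn(index, inValues) {
    var seen = Object.create(null);
    var results = [];
    var probes = 0;
    var hits = 0;
    inValues.forEach(function (val) {
        probes += 1;
        (index[val] || []).forEach(function (doc) {
            hits += 1;  // every hit is examined, even if it is a repeat
            if (!seen[doc._id]) {
                seen[doc._id] = true;
                results.push(doc);
            }
        });
    });
    return { results: results, probes: probes, hits: hits };
}

var out = findIn(buildMultikeyIndex(docs), ['a', 'b', 'c', 'd']);
```

    In this model, four $in values that belong to only two runs cost four probes and four document hits to return two documents, so the work grows with the length of the $in list rather than with the size of the result.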

     
  • Cory Johns
    2012-10-04

    • status: in-progress --> code-review
    • qa: Dave Brondsema
     

