From: Michael Kay <mike@sa...>  2009-02-11 09:27:48

> I see now that in the first code fragment the correct
> subexpression must be:
>
>    $ind in index-of($vPrimesLT50M, $p1)[1],
>
> instead of:
>
>    $ind in index-of($vPrimesLT50M, $p1),

I don't think that change helps. I don't think this optimization can be done without knowing that there are no duplicates in $vPrimesLT50M, and I think the only situation where Saxon can be confident of knowing that property of a sequence statically is when the sequence is the result of calling distinct-values(), or of a few other expressions like (1 to N). For example, if $vPrimesLT50M is the sequence (4, 3, 4, 7), then (after adding the [1]) $ind would take the values 1, 2, 1, 4.

The optimization also depends on knowing that the position of an item in ($vPrimesLT50M[position() lt $vmaxP1Ind]) is the same as its position in $vPrimesLT50M, which is a rather specialized inference to be making.

Even if I knew how to do it, before putting an optimization into Saxon I have to ask the question "how many queries/stylesheets would benefit?". In this particular case, I suspect the answer would be one. I spent yesterday trimming about 3% off the cost of invoking built-in template rules, which is a much more useful change because it benefits everyone.

I guess there is one optimization I could consider here: if index-of(X, Y) is called inside a loop in which X is constant but Y is not, one could build an index of all the items in X and their positions. That's a little bit more general-purpose, but I would still question how many queries are likely to benefit.

> In this case the speed of the first solution is just 23
> times less than that of the second.

I'm a little surprised it should make that much difference. I would expect it still to be quadratic. Perhaps it's because $vmaxP1Ind is much smaller than the size of $vPrimesLT50M.

Michael Kay
http://www.saxonica.com/
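
[Editorial illustration, not part of the original message.] The index idea mentioned above (build a lookup of the items in X and their positions once, then answer each index-of(X, Y) call from it) can be sketched roughly as follows. This is only a language-neutral sketch in Python, not Saxon code and not XQuery; the names build_position_index and index_of are hypothetical, and it assumes the items are hashable and compared by plain equality, which ignores XPath details such as collations and type promotion.

```python
from collections import defaultdict


def build_position_index(xs):
    """Map each distinct item of xs to the list of its 1-based positions."""
    index = defaultdict(list)
    for pos, item in enumerate(xs, start=1):
        index[item].append(pos)
    return index


def index_of(index, y):
    """Rough analogue of index-of(X, y), answered from the prebuilt index."""
    return index.get(y, [])


# Hypothetical sample: the sequence with duplicates used in the message.
X = [4, 3, 4, 7]
INDEX = build_position_index(X)  # built once, outside the loop

# Equivalent of taking index-of(X, y)[1] for each y in X.
FIRST_POSITIONS = [index_of(INDEX, y)[0] for y in X]
```

Building the index costs one pass over X, after which each lookup is expected constant time instead of a linear scan, so a loop over N values of Y would no longer be quadratic. Whether that pays off depends on how often X really is loop-invariant, which is the question raised in the message.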