As the attached example demonstrates, a chained reduction of an array expression can only be processed when the intermediate results are stored in an array. This is not feasible for large arrays.
This has been verified with gcc-4.3 on openSUSE-11.0 (x86_64) with blitz-0.9 and the current version from the CVS.
bug demo
The problem seems to have something to do with (from the manual):
"Reductions have an important restriction: It is currently only possible to reduce over the last dimension of an array or array expression."
The reduction works if the expression is transposed so tensor::l is the last index of the last operand:
sum(a1(i, k) * a2(l,k) * a3(j, l), l)
However, transposing the operands shouldn't affect whether the expression can be reduced, so there's something strange going on here.
The problem is that compute_ordering() throws an assert if the ordering of the expression is indeterminate. In _bz_ArrayExprReduce::computeOrdering(), the ordering of the reduction is taken from the operands, except that missing values are just initialized in standard order, so just calling this for an expression with mixed order will throw the assert. It seems that mixed ordering of the operands here is not a fatal problem, since the expression will be evaluated using indices anyway. So perhaps if there's mixed order of the operands, we should just skip it and initialize that dim just as if it was missing?
Migrated to: https://github.com/blitzpp/blitz/issues/64