This is done by moving the costly calculation of the matrix exponential out of the for loops.
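A minimal sketch of what hoisting such a calculation out of the loops can look like. It assumes the matrix exponential is built from an eigendecomposition, which the names V and W_exp_diag suggest but this message does not state. The dimensions, the array A and the use of symmetric matrices with numpy.linalg.eigh are simplifications made for illustration, not the actual relax code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stack of symmetric 7x7 matrices, shape [NS][NM][7][7].
NS, NM = 2, 3
A = rng.standard_normal((NS, NM, 7, 7))
A = A + np.swapaxes(A, -1, -2)  # symmetrise so that eigh applies

# Loop version: one eigendecomposition per matrix, inside nested loops.
W_loop = np.empty((NS, NM, 7))
for si in range(NS):
    for mi in range(NM):
        W_loop[si, mi] = np.linalg.eigh(A[si, mi])[0]

# Hoisted version: a single call on the whole stack, outside any loop.
W_all, V_all = np.linalg.eigh(A)
```

Both versions give the same eigenvalues. The second replaces NS*NM Python-level calls with one call that operates on the last two axes.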
The trick was to find a way to take the dot product over higher-dimensional arrays.
This was done with numpy.einsum:
Example at:
http://wiki.nmr-relax.com/Numpy_linalg#Ellipsis_broadcasting_in_numpy.einsum
Example:
dot_V_W = einsum('...ij,...jk', V, W_exp_diag)
where V and W_exp_diag both have shape [NE][NS][NM][NO][ND][7][7].
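As a small check of the einsum call above, the following sketch compares it against an explicit loop with numpy.dot over every leading index. The dimension sizes and the random data are arbitrary stand-ins chosen for illustration. Only the einsum subscript string and the trailing [7][7] shape come from this message.

```python
import numpy as np

rng = np.random.default_rng(1)

# Small hypothetical sizes standing in for [NE][NS][NM][NO][ND].
shape = (1, 2, 1, 2, 3)
V = rng.standard_normal(shape + (7, 7))
W_exp_diag = rng.standard_normal(shape + (7, 7))

# Stacked dot product: contract over j for every leading index at once.
dot_V_W = np.einsum('...ij,...jk', V, W_exp_diag)

# Reference: explicit loop calling numpy.dot on each 7x7 pair.
expected = np.empty_like(dot_V_W)
for idx in np.ndindex(*shape):
    expected[idx] = np.dot(V[idx], W_exp_diag[idx])
```

The ellipsis in the subscripts matches all leading axes, so the same string works however many dimensions precede the final two.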
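The profiling script shows roughly a 2x speed-up (cumulative time from about 18.8 s to 8.8 s for single, and from about 18.3 s to 9.1 s for cluster).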
----BEFORE:
SINGLE
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 18.811 18.811 <string>:1(<module>)
1 0.002 0.002 18.811 18.811 pf_3d:407(single)
CLUSTER
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 18.315 18.315 <string>:1(<module>)
1 0.001 0.001 18.315 18.315 pf_3d:431(cluster)
-----AFTER:
SINGLE
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 8.818 8.818 <string>:1(<module>)
1 0.002 0.002 8.818 8.818 pf_3d:407(single)
CLUSTER
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 9.082 9.082 <string>:1(<module>)
1 0.001 0.001 9.082 9.082 pf_3d:431(cluster)
Task #7807 (https://gna.org/task/index.php?7807): Speed-up of dispersion models for Clustered analysis.