The refactoring in [#121] introduced a performance penalty through the transition CTensor -> State. For the ODE solvers, it has already turned out that there are substantial benefits in unwrapping the states and using in-place tensor operations whenever we add or multiply tensors.
The same optimizations have not yet been applied to the Chebychev propagators and potentially other places. The effect is a slowdown of up to 15% for small systems (<100 points or so); it is much less relevant for larger, 2D systems.
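To illustrate the kind of change meant here, a minimal sketch follows. It does not use the real classes: `Buffer` stands in for the tensor type, the `State` wrapper and its `raw()` accessor are hypothetical, and `add_scaled_in_place` is a name made up for this example. The point is only that the wrapped operators allocate a temporary per operation, while the unwrapped variant updates the existing storage.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using Complex = std::complex<double>;
using Buffer = std::vector<Complex>;  // stand-in for the tensor type (assumption)

// Hypothetical wrapper, loosely modelled on a state that owns a tensor.
class State {
public:
    explicit State(Buffer data) : data_(std::move(data)) {}

    // "Unwrapping": direct access to the underlying storage.
    Buffer& raw() { return data_; }
    const Buffer& raw() const { return data_; }

private:
    Buffer data_;
};

// Wrapped arithmetic: each operator returns a freshly allocated State.
State operator+(const State& a, const State& b) {
    assert(a.raw().size() == b.raw().size());
    Buffer out(a.raw());
    for (std::size_t i = 0; i < out.size(); ++i) out[i] += b.raw()[i];
    return State(std::move(out));
}

State operator*(Complex c, const State& a) {
    Buffer out(a.raw());
    for (auto& x : out) x *= c;
    return State(std::move(out));
}

// Unwrapped, in-place variant: y += c * x, without any temporaries.
void add_scaled_in_place(Buffer& y, Complex c, const Buffer& x) {
    assert(y.size() == x.size());
    for (std::size_t i = 0; i < y.size(); ++i) y[i] += c * x[i];
}
```

With these definitions, `y = y + c * x` creates two temporaries (one for the product, one for the sum), whereas `add_scaled_in_place(y.raw(), c, x.raw())` gives the same result and allocates nothing. For small grids the allocations are a noticeable fraction of the work, which would fit the observation that small systems suffer most.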
Possible places to apply the optimization:
Testing showed no effect for EquationSystem, so I kept the simpler current code.
Chebychev propagation got noticeably faster (ca. 10% for the 2G Harmonic oscillator demo).
Diff: