I had a good idea on how to optimize fec. (Batch
things into cache-coherent memory chunks. Transpose
the matrix of inputs/outputs.)
Unfortunately, in my implementation (visible on branch
opt_fec) I screwed up the math, so it didn't actually
work.
To close this issue, fix it. I'm assigning it to myself.
Alternately, get Enochian to explain what's wrong with
blocksharez, fix blocksharez, and benchmark it against
fec. Then you can close this issue.
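
For reference, here is a rough sketch of one way to read the batching idea from the top of this ticket. It is not the code on opt_fec. It assumes the encoder is a matrix multiply over GF(2^8), and the reduction polynomial 0x11d, the function names, and the chunk size are all illustrative choices. The naive version sweeps every whole input block once per output; the chunked version takes one small slice of all inputs at a time and produces every output for that slice before moving on, so the slice can stay in cache. In pure Python this only shows the loop order and checks that the math agrees; any real speedup would have to be measured in the C code.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by 0x11d (an assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result


def encode_naive(matrix, inputs):
    """For each output row, sweep the full length of every input block."""
    size = len(inputs[0])
    outputs = []
    for row in matrix:
        out = bytearray(size)
        for coeff, block in zip(row, inputs):
            for i in range(size):
                out[i] ^= gf_mul(coeff, block[i])
        outputs.append(bytes(out))
    return outputs


def encode_chunked(matrix, inputs, chunk=4096):
    """Same result, but the chunk loop is outermost: finish all outputs
    for one slice of the inputs before touching the next slice."""
    size = len(inputs[0])
    outputs = [bytearray(size) for _ in matrix]
    for start in range(0, size, chunk):
        end = min(start + chunk, size)
        for row, out in zip(matrix, outputs):
            for coeff, block in zip(row, inputs):
                for i in range(start, end):
                    out[i] ^= gf_mul(coeff, block[i])
    return [bytes(o) for o in outputs]
```

A quick sanity check for any reordering like this is that it matches the naive encoder byte for byte on small inputs, including a chunk size that does not divide the block length.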