Add an extra tuning step to Level 2, in which we tune a NoTrans GEMV specifically to pair with the Trans GEMV. Assume the Trans GEMV is called first, so the data is already in L1 when the NoTrans GEMV runs (no need for prefetch); in that case you'll want an axpy-based algorithm . . .
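For reference, a minimal sketch of what an axpy-based NoTrans GEMV looks like, assuming column-major storage and double precision. The names axpy and gemv_notrans_axpy are hypothetical and this is not the tuned kernel itself; it only shows the loop structure, where each column of A is scaled by one element of x and accumulated into y, so y is the vector that benefits from staying in L1.

```c
#include <assert.h>
#include <stddef.h>

/* y += a * x over n elements: the axpy building block. */
static void axpy(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/*
 * y += alpha * A * x, with A an m-by-n column-major matrix whose
 * leading dimension is lda (assumed layout, for illustration only).
 * One axpy per column: column j of A is scaled by alpha * x[j]
 * and added into y.
 */
void gemv_notrans_axpy(size_t m, size_t n, double alpha,
                       const double *A, size_t lda,
                       const double *x, double *y)
{
    assert(lda >= m);
    for (size_t j = 0; j < n; j++)
        axpy(m, alpha * x[j], A + j * lda, y);
}
```

The Trans case would instead take one dot product per column, which is why the two kernels are tuned differently.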
Logged In: YES user_id=182470
This is done. Will be in next developer release.