NEON GEMM for 3.9.x
Brought to you by: rwhaley, tonyc040457
Adds a single-precision NEON GEMM kernel (SGEMM+CGEMM) to 3.9.x. Note that NEON is not fully IEEE 754 compliant, and so should be used only when specifically requested. This patch does not contain the configure changes needed to make NEON opt-in; when applied, it uses NEON wherever NEON works.
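To illustrate what "used only when specifically requested" could look like at the source level, here is a minimal sketch in C. It is not the kernel from this patch and not how ATLAS configure selects kernels: the macro ATL_USE_NEON_SKETCH and the function sgemm_sketch are hypothetical names made up for this example. The NEON path is compiled only if the builder defines the opt-in macro and the compiler reports NEON support; otherwise the plain C loop is used.

```c
#include <assert.h>
#include <stddef.h>

/* ATL_USE_NEON_SKETCH is a hypothetical opt-in macro, not an ATLAS flag. */
#if defined(ATL_USE_NEON_SKETCH) && \
    (defined(__ARM_NEON) || defined(__ARM_NEON__))
#include <arm_neon.h>
#define HAVE_NEON_PATH 1
#else
#define HAVE_NEON_PATH 0
#endif

/* C += A * B, all column-major and densely packed:
 * A is m x k, B is k x n, C is m x n. */
void sgemm_sketch(size_t m, size_t n, size_t k,
                  const float *A, const float *B, float *C)
{
    assert(A != NULL && B != NULL && C != NULL);

    for (size_t j = 0; j < n; j++) {
        for (size_t p = 0; p < k; p++) {
            const float b = B[p + j * k];
            size_t i = 0;
#if HAVE_NEON_PATH
            /* Four rows of one column of C at a time: c += a * b. */
            const float32x4_t vb = vdupq_n_f32(b);
            for (; i + 4 <= m; i += 4) {
                float32x4_t vc = vld1q_f32(&C[i + j * m]);
                float32x4_t va = vld1q_f32(&A[i + p * m]);
                vc = vmlaq_f32(vc, va, vb);
                vst1q_f32(&C[i + j * m], vc);
            }
#endif
            /* Scalar path: all rows without NEON, leftover rows with it. */
            for (; i < m; i++)
                C[i + j * m] += A[i + p * m] * b;
        }
    }
}
```

Built without -DATL_USE_NEON_SKETCH, this never touches NEON, so the results follow the compiler's ordinary floating-point behavior. A tuned kernel would block for registers and cache rather than update C one column segment at a time; the sketch only shows the opt-in guard.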
The first patch failed to properly update scases.flg and ccases.flg to use the NEON kernel because the author of the patch had failed to consume sufficient caffeine to activate his brain. The corrected patch is neon-3.9.x-corrected.diff.
Corrected NEON patch
Tom,
I notice you submitted your kernel with a non-standard license header that you hacked up. I'd like to use the standard ATLAS one instead. I attach what that would look like below; let me know if this is OK with you.
Thanks,
Clint
Yes, that's the same file as the one I've been distributing, so it still has my header. I'll get you a revised copy next week when I get a few minutes. I also hope to have a few more NEON kernels ready in the next month or so.
Tom,
This is currently blocking the next developer release. Can I just swap your header with the one I posted below? Otherwise, I will remove the kernel so I am free to do the next release.
Thanks,
Clint
Please swap the header; sorry for delaying things.