Name | Modified | Size | Downloads / Week |
---|---|---|---|
fvml-2012-11-19.tgz | 2012-11-19 | 24.8 kB | |
fvml-2012-11-07.tgz | 2012-11-07 | 21.3 kB | |
README | 2012-11-07 | 1.4 kB | |
Totals: 3 Items | | 47.5 kB | 0 |
This is an attempt to provide SSE (and eventually AVX) vectorized forms of the C math functions. Progress is tracked in the todo file, but suffice it to say that most of the simple functions have been written for double precision. The difficult work is still ahead.

The goal is to provide 2 ulp or better precision. I think all functions currently do so, with most completely accurate, except that the exp code is broken. I hope to support qNaNs and +-inf, and I think I do so far, but denormals would be very difficult for the vectorized code.

I use gcc's __builtin_ia32 functions a lot, so there is no guarantee that gcc will continue to support them or that the code will compile on any other system.

The code has been tested on an AMD Opteron system and an Intel Core 2 Duo system, and most functions were found to be faster than the standard C math library.

Some early code exists in vfml_inlines.h and vfml_log.c for performing faster computations on a single double-precision number, as opposed to the v2df functions, which work on two packed doubles at once.

Functions will probably be renamed in the future to use a trailing _v2df suffix, which is closer to the existing C names, instead of the current v2df_ prefix.

Some example code that has been used for timing and testing the routines is in the test directory.

Send comments to:
Daniel Davis
ddesics@gmail.com
201 Buckingham Pl
Blacksburg, VA, 24060