From: Julian S. <js...@ac...> - 2011-04-20 11:46:20
> These instructions have a startup cost, so just doing 8 or
> 16 bytes at a time is slower but it probably should work
> reasonably enough.

Yeah. This is kind of similar to the way CRC32 works on amd64.

> Then I have to find a good way to let a clean
> helper return a 128 bit value.

I don't know of any good way using the existing IR infrastructure.
One bad way I have used in the past (try not to die laughing) is to
call the helper twice, passing it a bool that indicates whether it
should return the lower or upper 64 bits of the result. Yes, it's
brain-dead-moron stuff. But yes, it works. See amd64g_calculate_RCR().

In fact it is very annoying not to be able to pass 128 bit values
to/from clean helpers -- it has caused a lot of extra complexity in
the SSEx implementations. And in future for AVX I suspect I will want
to pass 256 bit vectors to/from helpers.

It would be good perhaps to contemplate how to extend the clean helper
IR stuff just a little bit, so it is possible to pass 128 bit values
in/out by reference. Maybe the cleanest solution is to allow args
and/or result type to be Ity_V128, and put the burden on the
instruction selectors, so that when they see such an arg type or
result type, they generate code to allocate space on the (host) stack,
and pass the address of that instead to the helper.

Of course this means we'd also have to document that a helper function
expecting to deal with such arguments must pass/return them by
reference. IOW specify how to write such a function in a way that is
compatible with the proposed instruction selector changes.

> The message digest functions (e.g. SHA512) will be a little more
> tricky, since they have up to 128 bytes as data block size. Here
> we might need a dirty helper instead of a clean one. And I still
> don't know how to pass back 512 bits without going over memory.

I have no suggestions for that.

J
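
For illustration only, here is a rough sketch in plain C of the two shapes discussed above: a helper that writes a 128 bit result through a pointer (the by-reference style), and a wrapper that is called twice with a flag to fetch each 64 bit half. This is not VEX code. The names ExampleU128, example_mul128_byref and example_mul128_half are made up for this sketch, and a 64x64->128 multiply merely stands in for whatever the real helper would compute.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit result type; not a VEX type. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} ExampleU128;

/* By-reference style: the caller supplies the storage for the
   128-bit result and passes its address to the helper. */
void example_mul128_byref(ExampleU128 *res, uint64_t a, uint64_t b)
{
    const uint64_t mask32 = UINT64_C(0xFFFFFFFF);
    uint64_t a_lo = a & mask32, a_hi = a >> 32;
    uint64_t b_lo = b & mask32, b_hi = b >> 32;

    /* Four 32x32->64 partial products. */
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;

    /* Sum of everything that lands in bits 32..63; the bits above
       bit 31 of this sum carry into the upper 64-bit half. */
    uint64_t mid = (p0 >> 32) + (p1 & mask32) + (p2 & mask32);

    res->lo = (mid << 32) | (p0 & mask32);
    res->hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* Call-twice style: the helper can only return 64 bits, so want_hi
   selects which half comes back. The full result is recomputed on
   every call, which is the price of the trick. */
uint64_t example_mul128_half(uint64_t a, uint64_t b, uint64_t want_hi)
{
    assert(want_hi == 0 || want_hi == 1);
    ExampleU128 r;
    example_mul128_byref(&r, a, b);
    return want_hi ? r.hi : r.lo;
}
```

A caller using the second style would invoke example_mul128_half(a, b, 0) and example_mul128_half(a, b, 1) and reassemble the two halves. With the first style the work is done once, but whoever emits the call has to provide the stack slot and pass its address.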