Small fix to OpenCL dot product test.
Working on OpenCL implementation.
Making the code more portable.
Adding volume tiling test.
Working on OpenCL testing.
Preparing for future OpenCL experimentation.
Updating dot product test.
FAST! FAST! FAST!
Minor updates to README.
Adding AVXCopyRelu for amd64; i386 is still missing.