ispc
Intel SPMD Program Compiler
... on architectures with 4-wide vector SSE units and 5x-6x on architectures with 8-wide AVX vector units, without any of the difficulty of writing intrinsics code. ispc also supports parallelization across multiple cores, making it possible to write programs whose performance scales with both the number of cores and the width of the vector units. A goal of ispc is to build a small set of extensions to the C language that deliver excellent performance to performance-oriented programmers.
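To make the idea of scaling with vector width concrete, here is a minimal plain-C sketch of how an SPMD-style loop can be pictured as running in fixed-size groups of lanes. It is an illustration only, not ispc syntax and not a description of the code the compiler emits. The names `scale` and `GANG_WIDTH` are hypothetical, and the width of 8 is an assumption chosen to match an 8-wide AVX unit holding single-precision floats.

```c
#include <assert.h>

/* Assumed lane count: 8 floats fit in one 8-wide AVX register. */
#define GANG_WIDTH 8

/* Multiply `count` floats by `factor`.
 * The outer loop advances one group of lanes at a time; each inner
 * iteration stands in for one lane that a vectorizing compiler
 * could execute in lockstep with its neighbours. */
void scale(const float *in, float *out, int count, float factor) {
    assert(count >= 0);
    for (int base = 0; base < count; base += GANG_WIDTH) {
        for (int lane = 0; lane < GANG_WIDTH; ++lane) {
            int i = base + lane;
            if (i < count) {
                /* Lanes past the end of the array do no work. */
                out[i] = in[i] * factor;
            }
        }
    }
}
```

Written this way, the inner loop is the part that maps onto a vector unit, while the outer loop's groups are independent of one another and could in principle be divided among cores.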