add plugins now here
added an SSE-based implementation
add SSE implementation
check close float
optimisation
do some optimisation
add some tests and improve kernel creation code
add actual number of overlapping pixels
add image combination