I have optimized GDL's "smooth" implementation. The computation time of "foo=smooth(dist(1024),50)" has dropped from 3.5 to 0.37 seconds.
I have attached the whole file because I don't know how to prepare a patch.
The point of the optimization is to use several successive 1D convolutions, one along each dimension, instead of a single multidimensional convolution. This approach is mathematically exact for a boxcar filter, i.e. smooth, because the boxcar kernel is separable.
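
To illustrate the idea (this is not the attached GDL code, only a minimal pure-Python sketch): a 2D boxcar kernel of width w has every weight equal to 1/w^2 = (1/w)*(1/w), so averaging along rows and then along columns gives the same interior values as the direct w x w average. The function names, the odd-width restriction, and the "copy edge points unchanged" edge handling are assumptions made for this example.

```python
def smooth_rows(a, w):
    """Running mean of odd width w along each row; edge points are copied unchanged."""
    h = w // 2
    out = [row[:] for row in a]
    for r, row in enumerate(a):
        for c in range(h, len(row) - h):
            out[r][c] = sum(row[c - h:c + h + 1]) / w
    return out


def transpose(a):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*a)]


def smooth_separable(a, w):
    """Two successive 1D passes: along rows, then along columns."""
    tmp = smooth_rows(a, w)
    return transpose(smooth_rows(transpose(tmp), w))


def smooth_direct(a, w):
    """Direct 2D boxcar: mean of the w x w neighbourhood, interior points only."""
    h = w // 2
    out = [row[:] for row in a]
    for r in range(h, len(a) - h):
        for c in range(h, len(a[0]) - h):
            total = sum(a[i][j]
                        for i in range(r - h, r + h + 1)
                        for j in range(c - h, c + h + 1))
            out[r][c] = total / (w * w)
    return out


def max_interior_diff(x, y, w):
    """Largest absolute difference between x and y over the interior region."""
    h = w // 2
    return max(abs(x[r][c] - y[r][c])
               for r in range(h, len(x) - h)
               for c in range(h, len(x[0]) - h))


if __name__ == "__main__":
    data = [[float((3 * r + 5 * c) % 7) for c in range(9)] for r in range(8)]
    print(max_interior_diff(smooth_separable(data, 3), smooth_direct(data, 3), 3))
```

Per output point, the direct version sums w^2 values while the two passes sum 2*w values, which is where the saving comes from. In this sketch the two versions agree only on interior points; near the borders the result depends on the edge handling chosen.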
P.S. Sorry for my English.