Ivan-Assen Ivanov wrote:
Second, you can calculate the Gaussian blur in a single post-processing
pass if you want. It's no different from doing the horizontal and
vertical in two different passes, except for the amount of math
involved. If you do that, then you'll be using a large number of pixel
shader instructions compared to framebuffer writes, which will generally
be slower than just doing two separate passes on render targets. All
hardware is decent enough at fill rate compared to shader instructions
that it doesn't make sense to try to do it in a single post-process
pass. When we get a 1000:1 ratio of shader ops:framebuffer writes, that
determination may change, but we're nowhere near there just yet :-)
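
To make the "no different except for the amount of math" point concrete, here is a minimal CPU-side sketch in Python, not shader code. It assumes a grayscale image stored as a list of rows and clamp-to-edge addressing; the function names (gaussian_1d, sample, blur_single_pass, blur_two_pass) are made up for illustration. The single-pass version weights each tap by the product of two 1D weights, and the two-pass version applies the same 1D weights horizontally and then vertically.

```python
import math

def gaussian_1d(radius, sigma):
    """Normalized 1D Gaussian weights for taps -radius..radius."""
    w = [math.exp(-(i * i) / (2.0 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    total = sum(w)
    return [x / total for x in w]

def sample(img, x, y):
    """Clamp-to-edge fetch (an assumed addressing mode)."""
    h, w = len(img), len(img[0])
    return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

def blur_single_pass(img, k):
    """One pass: n*n taps per pixel, weight = k[j] * k[i]."""
    r = len(k) // 2
    h, w = len(img), len(img[0])
    return [[sum(k[j + r] * k[i + r] * sample(img, x + i, y + j)
                 for j in range(-r, r + 1)
                 for i in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def blur_two_pass(img, k):
    """Two passes: n horizontal taps, then n vertical taps per pixel."""
    r = len(k) // 2
    h, w = len(img), len(img[0])
    # First pass writes an intermediate image (the extra render target).
    horiz = [[sum(k[i + r] * sample(img, x + i, y)
                  for i in range(-r, r + 1))
              for x in range(w)] for y in range(h)]
    # Second pass reads the intermediate image back.
    return [[sum(k[j + r] * sample(horiz, x, y + j)
                 for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]
```

Both functions should agree up to floating-point rounding; the difference is that the two-pass version has to write the intermediate image and read it back.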
    

I beg to disagree - it still depends on the particular hardware and
filter sizes.

There's this weird piece of hardware that has
sold in the mid-two-digit millions, yet refuses to sample from its framebuffer,
so if you're doing two passes, you have to transfer the results from the
first pass into RAM, and 5x5 and 9x9 Gaussian kernels turn out faster
in one pass than separated into H and V.
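
A rough way to reason about this trade-off, sketched in Python under a deliberately crude cost model: count one unit per texture tap, and express the per-pixel cost of moving the first pass's output into RAM as a hypothetical number of "tap-equivalents". Nothing here is measured on real hardware; the function names and the cost unit are assumptions for illustration only.

```python
def taps_single_pass(n):
    """Taps per pixel for an n x n kernel evaluated in one pass."""
    return n * n

def taps_two_pass(n):
    """Taps per pixel for the separated version (n horizontal + n vertical)."""
    return 2 * n

def break_even_transfer_cost(n):
    """Per-pixel transfer overhead (in tap-equivalents, a made-up unit)
    at which the two approaches cost the same under this model."""
    return taps_single_pass(n) - taps_two_pass(n)

def single_pass_wins(n, transfer_cost):
    """True if one pass is cheaper once the intermediate copy is charged
    to the two-pass version."""
    return taps_single_pass(n) < taps_two_pass(n) + transfer_cost

for n in (5, 9):
    print(n, taps_single_pass(n), taps_two_pass(n), break_even_transfer_cost(n))
```

Under this model a 5x5 kernel needs 25 taps in one pass against 10 when separated, and a 9x9 kernel needs 81 against 18, so the single pass only comes out ahead when the copy between passes costs more than the difference.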

  
I'm curious, what hardware is that? This is the first I've heard of this and I'm wondering if it's something I'd need to worry about.