BrookGPU programs that create streams larger than 1024 in any
dimension freeze on my GeForce FX under Linux when using the
nvgl30 runtime. With the cpu runtime, the same program
runs fine.
Running the same program with the nvgl30 runtime, but with
streams of <= 1024 elements in every dimension, works fine too.
I tried compiling with brcc -w 2048, but it still freezes.
BTW, main.c (which builds the brcc binary) has a bug in the
option parsing code: a colon (:) is missing after 'w' in the
option string passed to getopt.
Logged In: YES
user_id=40974
I've tracked the freeze for streams > 1024 elements (in any
dimension) down to the StreamWrite() routine. More
specifically, the card freezes in the following GL call:
glReadPixels (0, 0, width, height, GLformat[ncomp[i]],
GL_FLOAT, t);
(nv30glstream.cpp:169)
Logged In: YES
user_id=927204
-w is deprecated :-)
If you use reductions, the maximum supported dimension is halved.
You can try making a bigger pbuffer in the code by replacing
all 2048 with 4096 and all 1024 with 2048.
Since DirectX doesn't support 4096x4096 streams, this may be
a lost cause.
What are you trying to do, BTW?
I run Linux and I run it on big streams (2048x2048 w/o
reductions). What version of the nvidia driver are you
using? Perhaps this is an old nvidia bug.
Logged In: YES
user_id=40974
Actually, I tracked it down. The default Brook code (version
0.2) uses the following code for readback:
glViewport(.., width, height, ...);
... render quad ..
glReadPixels(..., width, height, ...);
My X display is 1280 x 1024, so this doesn't work for
width or height > 1024! I changed the Brook code to use
glGetTexImage() instead. Works fine now. Thanks!
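The diagnosis above amounts to a bounds check: the rectangle passed to glReadPixels has to fit inside the area that can actually be read back. A rough sketch in C, assuming (as described in this comment) that the readable area is limited by the display size; readback_fits is a hypothetical helper and not part of Brook or OpenGL:

```c
#include <stdbool.h>

/* Hypothetical helper: true when a glReadPixels rectangle starting at (0, 0)
   with the given size lies entirely inside a window_w x window_h area. */
static bool readback_fits(int width, int height, int window_w, int window_h)
{
    return width > 0 && height > 0 &&
           width <= window_w && height <= window_h;
}
```

With a 1280 x 1024 display, a 1024 x 1024 stream passes this check while a 2048 x 2048 stream does not, which matches the behaviour reported here.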
Logged In: YES
user_id=927204
Thanks! We really appreciate your help.
The problem is that glGetTexImage is slower than reading
back from the framebuffer under GL. We're trying to find a
compromise here, but it really shouldn't be
display-dependent. I'd like to know what version of the
nvidia drivers you are using so we could potentially file a
bug against them.