From: Jeff E. <je...@ds...> - 2001-05-07 20:26:48
Attachments:
mesa-scanline-optimization.patch
|
I'm currently trying to move an older piece of software from using PEX/X11 to Mesa/X11. (We use(d) a modified version of PEX with zbuffer support, and it's becoming too much of a chore to propagate these patches to new versions of XFree86.)

Our software draws exclusively flat-shaded, zbuffered polygons.

Out of the box (Mesa 3.2/RedHat 6.2) or using Mesa 3.4.1, the Mesa renderer is rather slower than our PEX-based renderer. However, a simple change to write_span_mono_pixmap brings the two renderers to nearly equal speed.

Basically, we discovered in past iterations of software renderers that coalescing as many pixels as possible into single X calls is a very important optimization (in fact, another dead renderer actually unifies multiple single-color spans into a single XDrawSegments call). write_span_mono_pixmap performs this optimization using XMesaFillRectangle when no pixels in the span were excluded (due to depth check or other factors).

However, if any pixels were excluded, then it falls back to XMesaDrawPoint for all points.

This patch accumulates consecutive "drawn" pixels into a single XMesaFillRectangle call. This is a big speed boost in our application, and a slight pessimization in the case of, e.g., a 50% stipple. (XFillRectangle is likely to be optimized for the 1x1 case, as well as the Nx1 case.) A slightly more intelligent patch would probably disable the optimization for very short spans, or spans with large numbers of sub-spans.

I'll include the patch in my message. Is submitting it to the sourceforge patch tracker appropriate? It doesn't look like that SF feature is well-used.

Jeff |
From: Keith W. <ke...@va...> - 2001-05-09 14:42:34
|
How about this as a cleaner (untested) version of the loop:

   for (i = 0 ; i < n ;) {
      GLuint start = i;

      /* Identify and emit contiguous rendered pixels */
      for( ; i < n && mask[i]; i++)
         ;

      if (start < i)
         XMesaFillRectangle( dpy, buffer, gc,
                             (int)(x+start), (int) y,
                             (int)(i-start), 1);

      /* Eat up non-rendered pixels */
      for( ; i < n && !mask[i]; i++)
         ;
   }

Keith |
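One way to check a loop like this for fencepost errors without an X server is to record the runs it would emit instead of drawing them. The following is a minimal sketch under that assumption: `Run` and `coalesce_runs` are hypothetical names and not part of Mesa, `unsigned char` stands in for the mask element type, and the point where the real code would call XMesaFillRectangle is marked in a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one coalesced run. It stands in for the x and
   width arguments that would be passed to XMesaFillRectangle. */
typedef struct {
    int x;      /* window x coordinate of the first pixel in the run */
    int width;  /* number of consecutive drawn pixels */
} Run;

/* Walk mask[0..n-1] using the same loop structure as the message above and
   store each run of consecutive set entries in out[]. Returns the number of
   runs. out must have room for (n + 1) / 2 entries, which is the most runs
   a span of n pixels can contain. */
static size_t coalesce_runs(const unsigned char *mask, size_t n, int x, Run *out)
{
    size_t i = 0, count = 0;

    assert(mask != NULL || n == 0);

    while (i < n) {
        size_t start = i;

        /* Identify contiguous rendered pixels */
        while (i < n && mask[i])
            i++;

        if (start < i) {
            /* The real code would call XMesaFillRectangle here */
            out[count].x = x + (int)start;
            out[count].width = (int)(i - start);
            count++;
        }

        /* Skip non-rendered pixels */
        while (i < n && !mask[i])
            i++;
    }
    return count;
}
```

With this stub, a fully set mask produces exactly one run covering the whole span, and a fully clear mask produces none, so both boundary cases can be checked with plain assertions.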
From: Brian P. <br...@va...> - 2001-05-09 14:54:56
|
Jeff Epler wrote:
>
> I'm currently trying to move an older piece of software from using
> PEX/X11 to Mesa/X11. (We use(d) a modified version of PEX with zbuffer
> support, and it's becoming too much of a chore to propagate these
> patches to new versions of XFree86)
>
> Our software draws exclusively flat-shaded, zbuffered polygons.
>
> Out of the box (Mesa 3.2/RedHat 6.2) or using Mesa 3.4.1, the Mesa
> renderer is rather slower than our PEX-based renderer. However, a
> simple change to write_span_mono_pixmap brings the two renderers to
> nearly equal speed.
>
> Basically, we discovered in past iterations of software renderers that
> coalescing as many pixels as possible into single X calls is a very
> important optimization (in fact, another dead renderer actually unifies
> multiple single-color spans into a single XDrawSegments call).
> write_span_mono_pixmap performs this optimization using
> XMesaFillRectangle when no pixels in the span were excluded (due to
> depth check or other factors)
>
> However, if any pixels were excluded, then it falls back to
> XMesaDrawPoint for all points.
>
> This patch accumulates consecutive "drawn" pixels into a single
> XMesaFillRectangle call. This is a big speed boost in our application,
> and a slight pessimization in the case of e.g., a 50% stipple.
> (XFillRectangle is likely to be optimized for the 1x1 case, as well as
> the Nx1 case) A slightly more intelligent patch would probably
> disable the optimization for very short spans, or spans with large
> numbers of sub-spans.
>
> I'll include the patch in my message. Is submitting it to the
> sourceforge patch tracker appropriate? It doesn't look like that SF
> feature is well-used.

Sending small patches to this list is fine. Larger patches should be filed on SF.

I'll look this over and probably apply it soon. Thanks!

-Brian |
From: Jeff E. <je...@us...> - 2001-05-09 22:32:31
|
Jeff Epler wrote:
> > This patch accumulates consecutive "drawn" pixels into a single
> > XMesaFillRectangle call. [...]

On Wed, May 09, 2001 at 09:00:38AM -0600, Brian Paul wrote:
> Sending small patches to this list is fine. Larger patches should
> be filed on SF.
>
> I'll look this over and probably apply it soon. Thanks!

Would you like to incorporate the patch as-is, or incorporate modified logic as Keith has suggested?

Another modification you might want before inclusion is a way to switch back to the old per-pixel code when it's appropriate, such as when drawing with a 50% stipple. (The code as submitted draws this ~20% slower than 3.4.1 in my test app, IIRC)

Jeff |
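The switch back to per-pixel drawing mentioned above could be driven by how fragmented the mask is. The following is only a sketch of one possible heuristic, not anything from the submitted patch: `count_runs`, `should_coalesce`, and the `min_avg_run` threshold are hypothetical, and a sensible threshold would have to be measured, since the extra pass over the mask has a cost of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Count the runs of consecutive set entries in mask[0..n-1].
   The total number of set entries is stored in *drawn. */
static size_t count_runs(const unsigned char *mask, size_t n, size_t *drawn)
{
    size_t runs = 0, set = 0, i;

    for (i = 0; i < n; i++) {
        if (mask[i]) {
            set++;
            /* A run starts at index 0 or right after a clear entry */
            if (i == 0 || !mask[i - 1])
                runs++;
        }
    }
    *drawn = set;
    return runs;
}

/* Hypothetical policy: return nonzero when the average run length is at
   least min_avg_run pixels, i.e. when coalescing into rectangles looks
   worthwhile. Returns zero when nothing is drawn at all. */
static int should_coalesce(const unsigned char *mask, size_t n, size_t min_avg_run)
{
    size_t drawn, runs;

    assert(min_avg_run > 0);

    runs = count_runs(mask, n, &drawn);
    if (runs == 0)
        return 0;

    /* drawn / runs >= min_avg_run, written without division */
    return drawn >= runs * min_avg_run;
}
```

With a threshold of 2, an alternating 50% stipple mask (average run length 1) would take the per-pixel path, while a span with a few long depth-tested runs would take the rectangle path.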
From: Brian P. <br...@va...> - 2001-05-09 22:42:05
|
Jeff Epler wrote:
>
> Jeff Epler wrote:
> > > This patch accumulates consecutive "drawn" pixels into a single
> > > XMesaFillRectangle call. [...]
>
> On Wed, May 09, 2001 at 09:00:38AM -0600, Brian Paul wrote:
> > Sending small patches to this list is fine. Larger patches should
> > be filed on SF.
> >
> > I'll look this over and probably apply it soon. Thanks!
>
> Would you like to incorporate the patch as-is, or incorporate modified
> logic as Keith has suggested?

Keith's looked a little cleaner. Did you test it?

> Another modification you might want before inclusion is a way to switch
> back to the old per-pixel code when it's appropriate, such as when drawing
> with a 50% stipple. (The code as submitted draws this ~20% slower than
> 3.4.1 in my test app, IIRC)

I might just #ifdef-out the old code but leave it there. I think the Z buffer scenario you're using is more typical than the 50% stipple scenario.

-Brian |
From: Jeff E. <je...@us...> - 2001-05-10 00:02:23
|
> Jeff Epler wrote:
> > Would you like to incorporate the patch as-is, or incorporate modified
> > logic as Keith has suggested?
> Brian Paul wrote:
> Keith's looked a little cleaner. Did you test it?

No, I didn't get a chance today. I hope to test it tomorrow morning. If it were my code, I'd have written a fencepost error in just to make sure I had something to debug.

> > Another modification you might want before inclusion is a way to switch
> > back to the old per-pixel code when it's appropriate, such as when drawing
> > with a 50% stipple. (The code as submitted draws this ~20% slower than
> > 3.4.1 in my test app, IIRC)
>
> I might just #ifdef-out the old code but leave it there. I think
> the Z buffer scenario you're using is more typical than the 50% stipple
> scenario.

I have the same suspicion, but wanted to include this information for completeness.

Jeff
-- |
From: Jeff E. <je...@ds...> - 2001-05-10 14:12:06
Attachments:
mesa-scanline-optimization-v2.patch
|
I tried Keith's code this morning. It appears to be the same speed as my original code.

I wanted to verify that it gives the same results as the old code, but chose a rather low-tech way to do it -- grab a screenshot with gimp from each version, and use the "difference" mode to find any differences.

The version I first compared against was actually Mesa 3.2-2 from RedHat 6.2, and I noticed something weird -- with 3.2-2 the background was cleared to RGB (0,4,0), rather than the (0,0,0) I get with 3.4.1 + patch. Is this just a bug that was fixed in the meantime? When I compare 3.4.1 vs 3.4.1 + patch, the results are identical in all pixels.

One question -- if we use a Mesa with this patch in XFree86 4.0 glx, will the optimization apply when not using hardware acceleration?

Jeff |
From: Brian P. <br...@va...> - 2001-05-10 14:16:30
|
Jeff Epler wrote:
>
> I tried Keith's code this morning. It appears to be the same speed
> as my original code.
>
> I wanted to verify that it gives the same results as the old code, but
> chose a rather low-tech way to do it -- grab a screenshot with gimp from
> each version, and use the "difference" mode to find any differences.
>
> The version I first compared against was actually Mesa 3.2-2 from
> RedHat 6.2, and I noticed something weird -- The background was
> cleared to RGB (0,4,0) rather than (0,0,0) in 3.4.1 + patch. Is
> this just a bug that was fixed in the meantime?

What is your glClearColor() and what depth is your rendering window? I seem to recall making a change to clear color months ago but I don't remember the details.

> When I compare
> 3.4.1 vs 3.4.1 + patch, the results are identical in all pixels.

OK, I'll apply Keith's version of the patch.

> One question -- if we use a Mesa with this patch in XFree86 4.0 glx,
> will the optimization apply when not using hardware acceleration?

Yes. The indirect GLX renderer will also benefit from this.

-Brian |
From: Jeff E. <je...@ds...> - 2001-05-10 15:12:07
|
On Thu, May 10, 2001 at 08:22:04AM -0600, Brian Paul wrote:
> > The version I first compared against was actually Mesa 3.2-2 from
> > RedHat 6.2, and I noticed something weird -- The background was
> > cleared to RGB (0,4,0) rather than (0,0,0) in 3.4.1 + patch. Is
> > this just a bug that was fixed in the meantime?
>
> What is your glClearColor() and what depth is your rendering window?
> I seem to recall making a change to clear color months ago but I
> don't remember the details.

16 bit (RGB 565), and glClearColor(0.0, 0.0, 0.0, 0.0)

This isn't important, since it looks fixed, and I never even noticed the problem visually. I'm guessing that it happened around the time that a change to xmesa1.c was committed with the log message

    Pass pixel format to xmesa_color_to_pixel().
    Compute clearpixel without dither

since this certainly touches that area ..

Jeff |
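For context on why a stray value would show up as exactly 4 in the green channel: an RGB 565 pixel stores green in 6 bits, so the smallest nonzero green step reads back as 4 once it is widened to 8 bits. The sketch below illustrates only that arithmetic. The helper names are hypothetical and not taken from Mesa, the widening by bit replication is an assumption about how a screenshot tool expands the channels, and whether dithering was actually the cause remains the guess made above.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 5-bit red, 6-bit green and 5-bit blue into one RGB 565 pixel. */
static uint16_t pack_rgb565(unsigned r5, unsigned g6, unsigned b5)
{
    assert(r5 < 32 && g6 < 64 && b5 < 32);
    return (uint16_t)((r5 << 11) | (g6 << 5) | b5);
}

/* Widen the 6-bit green field to 8 bits by bit replication
   (assumed expansion method; a plain left shift gives the same
   result for small values). */
static unsigned green8_from_rgb565(uint16_t pixel)
{
    unsigned g6 = (pixel >> 5) & 0x3Fu;
    return (g6 << 2) | (g6 >> 4);
}

/* Widen the 5-bit red field to 8 bits the same way. */
static unsigned red8_from_rgb565(uint16_t pixel)
{
    unsigned r5 = (pixel >> 11) & 0x1Fu;
    return (r5 << 3) | (r5 >> 2);
}
```

Under these assumptions a green field of 1 widens to 4, while the smallest nonzero red step widens to 8, so a background of (0,4,0) is consistent with green sitting one 6-bit step above zero.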
From: Keith W. <ke...@va...> - 2001-05-10 15:27:35
|
Jeff Epler wrote:
>
> I tried Keith's code this morning. It appears to be the same speed
> as my original code.
>
> I wanted to verify that it gives the same results as the old code, but
> chose a rather low-tech way to do it -- grab a screenshot with gimp from
> each version, and use the "difference" mode to find any differences.
>
> The version I first compared against was actually Mesa 3.2-2 from
> RedHat 6.2, and I noticed something weird -- The background was
> cleared to RGB (0,4,0) rather than (0,0,0) in 3.4.1 + patch. Is
> this just a bug that was fixed in the meantime? When I compare
> 3.4.1 vs 3.4.1 + patch, the results are identical in all pixels.
>
> One question -- if we use a Mesa with this patch in XFree86 4.0 glx,
> will the optimization apply when not using hardware acceleration?
>

Note that the 'write_all' optimization higher up in the function is now redundant as well.

Keith |