From: Thomas H. <th...@br...> - 2000-06-11 22:36:45

Hi, all!

I'm developing a computational fluid dynamics mesh visualizer using OpenGL and Qt's OpenGL extension, which I've tried with the mga DRI module (latest (June 12th) trunk, kernel 2.2.15 with backported agpgart and rawhide XFree86-4.0-0.12, MGA G400 MAX 32MB).

The code draws a large number (20000+) of filled triangles and quads, and each of them is outlined so that each individual triangle or quad can be distinguished. The drawing code looks something like

    glLineWidth( 1.0 );
    glEnable(GL_DEPTH_TEST);
    glPolygonMode(GL_FRONT_AND_BACK,GL_LINE);
    qglColor( gray );
    glCallList( object );

    glPolygonMode(GL_FRONT_AND_BACK,GL_FILL);
    glEnable(GL_POLYGON_OFFSET_FILL);
    glPolygonOffset(1.0,1.0);
    qglColor( red );
    glCallList( object );
    glDisable(GL_POLYGON_OFFSET_FILL);

It turns out that when the lines are drawn using the first part of the code, rendering is very slow compared to when the filled polygons are drawn (about a factor of 10 or so). The filled polygon rendering seems quite fast. As far as I can see, direct rendering is enabled (the X log says so, and my application uses all the CPU).

So my question is: is glPolygonMode(..,GL_LINE) not hardware accelerated, and if not, are there any line-drawing techniques that are?

Any answer would be greatly appreciated.

/Thomas Hellström

--
Thomas Hellström, Fyrmästaregången 8, S-413 18 Göteborg, Sweden
Email: th...@br...
Tel: +46 31 244077, +46 31 663295, +46 704 976916 // Fax: +46 31 546710
From: Keith W. <ke...@pr...> - 2000-06-12 16:14:48

Thomas Hellstrom wrote:
> So my question is: is glPolygonMode(..,GL_LINE) not hardware
> accelerated and if not, are there any line-drawing techniques that are?

It should be accelerated, and this has been tested fairly recently. I'll try to retest and send you my results.

Keith
From: Keith W. <ke...@va...> - 2000-06-14 23:59:50
Attachments:
soprof-0.0.tar.gz

Keith Whitwell wrote:
> Thomas Hellstrom wrote:
> > So my question is: is glPolygonMode(..,GL_LINE) not hardware
> > accelerated and if not, are there any line-drawing techniques that are?

I've been looking at it this afternoon. GL_LINE polygons are definitely accelerated, but they are a lot slower than regular polygons for the following reason: we have to draw lines on MGA hardware with two triangles, because we haven't got warp code to draw lines from Matrox.

Thus, to draw a GL_LINE triangle, we have to draw 3 lines, each of which is two thin triangles, each of which has 3 vertices. So, to draw a GL_LINE triangle we need to emit 3 * 2 * 3 == 18 vertices to AGP memory. Compare this to a normal triangle, where we just emit 3 vertices.

This is a big increase, but it isn't enough to account for a 10x factor, so maybe something else is at work. Consider using Josh Vanderhoof's excellent somon/soprof utilities to do some very quick profiling of your application and our driver in action. I've attached a source tarball.

Usage:

    somon ./gears
    soprof < somon.out | sort -n

Keith
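As a rough check of the arithmetic above, the C sketch below multiplies the per-triangle vertex counts out for a whole mesh. The function and constant names are made up for illustration; only the numbers (3 vertices per filled triangle, 3 * 2 * 3 per outlined one, 20000 triangles) come from the thread.

```c
/* Vertices emitted per triangle, following the arithmetic in the message. */
enum {
    VERTS_PER_FILLED_TRI = 3,   /* one hardware triangle */
    EDGES_PER_TRI        = 3,   /* a GL_LINE triangle is three lines */
    TRIS_PER_LINE        = 2,   /* each line is drawn as two thin triangles */
    VERTS_PER_THIN_TRI   = 3,
    VERTS_PER_LINE_TRI   = EDGES_PER_TRI * TRIS_PER_LINE * VERTS_PER_THIN_TRI
};

/* Total vertices written for n triangles drawn filled. */
static long filled_vertices(long n)
{
    return n * VERTS_PER_FILLED_TRI;
}

/* Total vertices written for n triangles drawn in GL_LINE polygon mode,
   assuming every edge is expanded into two thin triangles. */
static long outline_vertices(long n)
{
    return n * VERTS_PER_LINE_TRI;
}
```

For the 20000-triangle mesh mentioned at the start of the thread, filled_vertices(20000) is 60000 and outline_vertices(20000) is 360000, a factor of 6. That is consistent with the remark that vertex volume alone does not explain a 10x slowdown.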
From: Thomas H. <th...@br...> - 2000-06-16 15:37:36

Hi, again!

Keith Whitwell wrote:

> Thomas Hellstrom wrote:
>
> > Hi, Keith.
> >
> > I did some timings with somon/soprof. I tested the following three cases:
> >
> > 1:
> > glPolygonMode(GL_FRONT_AND_BACK,GL_FILL)
> > glCallList(object)
> >
> > The display list is composed of GL_TRIANGLES,
> > one at a time between each glBegin and glEnd call.
> > The run was quite fast.
> > Most time is spent in a select call which, from strace, seems to be on
> > fd 4, which, again from strace, seems to be /dev/dri/card0.
> > Timing in attached file 'filledtiming'.
>
> Try to track this down. I've only seen select take this sort of time in
> applications which call it too many times - it really should just sleep for
> most of the time, and shouldn't show up (very high) in profiling. If it is
> being used to poll the fd's without a reasonable timeout, and called often
> (e.g. in an event-dispatch loop), you can get this type of result. Select is
> probably being called with a few fd's, ours being just one of them.

Yep, you're right, but I think the long time spent in the select call stems from the Qt main event loop, while the application loads the volume tetrahedral mesh using a child thread. This actually takes quite some time, and the main thread is idle waiting for Qt events. From what I saw in the soprof / somon README it counts elapsed time and not CPU time, so I don't think this is an important issue. At least it does not seem related to the DRI / DRM driver.

> > 2:
> > glCallList(object)
> >
> > Same geometry as above, but GL_TRIANGLES was replaced with GL_LINE_LOOP.
> > Reasonably fast, and what is stated above in your mail seems consistent
> > with the time increase.
> > Too bad Matrox doesn't have / give away the microcode for lines.
> > (I have an HP-UX fx6 Visualize workstation at work that outperforms the
> > G400 by a factor of 20 on this test.)
> > Most time is spent in mga_dri.so:line_flat. Timing in attached file
> > 'linelooptiming'.
> >
> > 3:
> > glPolygonMode(GL_FRONT_AND_BACK,GL_LINE)
> > glCallList(object)
> >
> > Same display list as case 1. This is where the problems occur. It's very
> > slow, and even more so when "somon" is used. Most of the time (almost all)
> > is spent in ioctl() which, from strace, is on fd 4 which, again from
> > strace, _seems_ to be /dev/dri/card0. There are also some rendering errors
> > here if GL_QUADS are used and a quad intersects the window border. Then it
> > seems to be split up into multiple polygons and sometimes a line is drawn
> > around each of them, instead of around all of them as a group, introducing
> > extra lines in the model.
> > (I can send an image on request).
>
> I was pretty sure we'd nailed these problems. Can you try to make a small
> demo that demonstrates the problem? This is the most helpful way to report a
> bug.

Hmm. I'll try to make a short glut application that renders a single triangle / quad and reproduces the problem. If I'm successful I'll submit it to the bug tracking system.

> > Timing in attached file 'linetiming'. Maybe one
> > could have glPolygonMode(...,GL_LINE) partly sharing the rendering code
> > with GL_LINE_LOOP?
> >
> > BUT the increasing slowdown in case 3 seems to occur _only_ when there is
> > a glBegin() and glEnd() around each triangle in the display list.
> > If the display list includes only a glBegin() at the beginning and a
> > glEnd() at the end, it is as fast as case 2,
> > which may indicate that the problem is partly with my application.
>
> The issue seems to be that internal buffers are getting flushed for each
> polygon. I've tracked this to the way we handle switching between drawing
> triangles as triangles and drawing lines with triangles (slightly different
> hardware state is required for each operation). We are switching between
> these for each triangle (at the glBegin), and then back again when we realize
> we're really drawing lines.
>
> I've got a patch for extras/Mesa/src/vbrender.c that you can try out,
> attached. It has some other stuff mixed in, but should be ok. Let me know if
> there are problems - I've just dashed it off.
>
> Keith
>
> ------------------------------------------------------------------------
> Index: vbrender.c

Great!!! This patch really made a difference. Now cases two and three are approximately equally fast. Thanks a lot.

Best regards,
Thomas
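The flushing behaviour described in the quoted explanation can be illustrated with a toy model. The C sketch below is not the Mesa or mga driver code: the batcher structure, the 1024-vertex buffer capacity, and all function names are assumptions made up for illustration. It only models the stated idea that each glBegin selects triangle state, that the driver then switches to line-as-triangle state, and that every state switch forces buffered vertices to be flushed.

```c
#include <stddef.h>

/* Two hardware setups, as described in the thread (names are hypothetical). */
enum hw_state { HW_TRIANGLES, HW_LINES_AS_TRIANGLES };

struct batcher {
    enum hw_state state;  /* state the buffered vertices were built for */
    size_t pending;       /* vertices buffered but not yet submitted */
    size_t capacity;      /* buffer size in vertices (assumed value) */
    size_t flushes;       /* submissions made so far */
};

/* Submit whatever is buffered; an empty buffer costs nothing. */
static void flush(struct batcher *b)
{
    if (b->pending > 0) {
        b->flushes++;
        b->pending = 0;
    }
}

/* Changing hardware state forces buffered vertices out first. */
static void set_state(struct batcher *b, enum hw_state s)
{
    if (b->state != s) {
        flush(b);
        b->state = s;
    }
}

/* Buffer nverts vertices, flushing first if they would not fit. */
static void emit(struct batcher *b, size_t nverts)
{
    if (b->pending + nverts > b->capacity)
        flush(b);
    b->pending += nverts;
}

/* Count flushes needed to outline ntris triangles, with either one
   glBegin/glEnd pair per triangle or a single pair around all of them. */
static size_t outline_flushes(size_t ntris, int begin_per_triangle)
{
    struct batcher b = { HW_TRIANGLES, 0, 1024, 0 };
    size_t i;

    for (i = 0; i < ntris; i++) {
        if (begin_per_triangle)
            set_state(&b, HW_TRIANGLES);       /* glBegin(GL_TRIANGLES) */
        set_state(&b, HW_LINES_AS_TRIANGLES);  /* polygon mode is GL_LINE */
        emit(&b, 18);                          /* 3 lines * 2 tris * 3 verts */
    }
    flush(&b);
    return b.flushes;
}
```

In this model, outline_flushes(20000, 1) returns one flush per triangle, while outline_flushes(20000, 0) needs only a few hundred because whole buffers are submitted at once. The absolute numbers depend entirely on the assumed capacity; the point is the difference in order of magnitude, which matches the observation that a single glBegin/glEnd pair around the whole list avoided the slowdown.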
From: Keith W. <ke...@va...> - 2000-06-16 15:52:50

Thomas Hellstrom wrote:
> Yep, you're right, but the long time spent in the select call I think
> stems from the QT main event loop, while the application loads the volume
> tetrahedal mesh using a child thread. This actually takes quite some time
> and the main thread is idle waiting for QT events. From what I saw in the
> soprof / somon README it counts elapsed time and not CPU time so I don't
> think this is an important issue. At least it does not seem related to the
> dri / drm driver.

It counts CPU time, I'm pretty sure.

Keith
From: Thomas H. <th...@br...> - 2000-06-16 20:23:57

Keith Whitwell wrote:
> Thomas Hellstrom wrote:
> > From what I saw in the soprof / somon README it counts elapsed time and
> > not CPU time so I don't think this is an important issue.
>
> It counts CPU time, I'm pretty sure.

Excerpt from the somon README:

"Since the monitor program, somon, uses the real time timer, you will get inaccurate results if there are other programs using significant CPU time."

And, for example,

    $ somon kdat    (or another event-driven program)

and letting kdat idle for some time will give a very high select count and thus percentage.

Regards,
Thomas
From: Keith W. <ke...@va...> - 2000-06-16 21:58:37

Thomas Hellstrom wrote:
> Excerpt from the somon README:
>
> "Since the monitor program, somon, uses the real time timer, you will
> get inaccurate results if there are other programs using significant
> CPU time."
>
> And, for example
> $ somon kdat (or another event driven program)
>
> and letting kdat idle for some time will give a very high select count
> and thus percentage.

How odd, and at the same time interesting. I'll have to look through the code more closely.

Keith
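The difference between elapsed time and CPU time discussed above is easy to observe directly. The C sketch below is an independent illustration, not part of somon or soprof; the helper names are made up. It blocks in select() with no descriptors, roughly as an idle event loop would, and measures how much wall-clock time and how much CPU time pass while it waits.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

/* Wall-clock seconds since the epoch, with microsecond resolution. */
static double wall_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* CPU seconds consumed by this process so far. */
static double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Block in select() for ms milliseconds without watching any descriptor,
   then report the wall-clock and CPU time that passed in the meantime. */
static void idle_in_select(long ms, double *wall, double *cpu)
{
    struct timeval timeout;
    double w0 = wall_seconds();
    double c0 = cpu_seconds();

    timeout.tv_sec = ms / 1000;
    timeout.tv_usec = (ms % 1000) * 1000;
    select(0, NULL, NULL, NULL, &timeout);  /* sleeps until the timeout */

    *wall = wall_seconds() - w0;
    *cpu = cpu_seconds() - c0;
}
```

Calling idle_in_select(300, &wall, &cpu) should report a wall time of about 0.3 seconds and a CPU time close to zero. A profiler that samples on wall-clock ticks, as the README excerpt suggests somon does, would therefore attribute most samples of an idle event-driven program to select, even though the process is using almost no CPU there.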
From: Keith W. <ke...@va...> - 2000-06-19 14:11:50
Attachments:
clip_funcs.h

Thomas Hellstrom wrote:
> Keith Whitwell wrote:
> > Thomas Hellstrom wrote:
> > > There is also some rendering errors here if GL_QUADS are used
> > > and a quad intersects the window border. Then it seems to be split up
> > > in multiple polygons and sometimes a line is drawn around each of them,
> > > instead of around all of them as a group, introducing extra lines in
> > > the model.
> > > (I can send an image on request).
> >
> > I was pretty sure we'd nailed these problems. Can you try to make a small
> > demo that demonstrates the problem. This is the most helpful way to
> > report a bug.
>
> Hmm. I'll try to make a short glut application that renders a single
> triangle / quad and reproduces the problem. If I'm successful I'll submit
> it to the bug tracking system.

OK,

Here's a replacement for extras/Mesa/src/clip_funcs.h which solves the clipping problem on my machine. Let me know how you go. (The changes are small but have a lot of spurious whitespace stuff that confuses diff, so I'm attaching a replacement file.)

Keith
From: Thomas H. <th...@br...> - 2000-06-19 20:00:02

Keith Whitwell wrote:
> OK,
>
> Here's a replacement for extras/Mesa/src/clip_funcs.h which solves the
> clipping problem on my machine. Let me know how you go. (The changes are
> small but have a lot of spurious whitespace stuff that confuses diff, so I'm
> attaching a replacement file.)
>
> Keith

Yes. This patch fixes the most annoying lines. However, there is still a line present exactly at the window border on _some_ of the clipped polygons in my application. This type of line can also be seen if you run the example program (the thin red line at the top of the screen):

http://www.geocrawler.com/lists/3/SourceForge/680/25/3901584/

But many OpenGL implementations seem to do the same (Nvidia's drivers, Xig's new MGA G400 drivers and HP's OpenGL for HP-UX). I'm not totally sure the line shouldn't be there, but if it really should be present, there should be a similar line on _all_ clipped polygons, not just some.

Anyway, a big improvement!!!

Best regards!
Thomas
From: Allen A. <ak...@va...> - 2000-06-19 21:19:08

On Mon, Jun 19, 2000 at 09:56:13PM +0200, Thomas Hellstrom wrote:

| Yes. This patch fixes the most annoying lines, However there is
| still a line present exactly at the window border on _some_ of the
| clipped polygons in my application.

One of the main differences between drawing a polygon's vertices as a line loop and drawing them as a polygon in glPolygonMode(GL_LINE) is that the latter displays a line where the polygon is clipped, and the former doesn't.

This allows you to get a visual cue when clipping has occurred, if you want one.

If you're not getting the line at the window border on *all* the clipped polygons, it's possible that (a) the OpenGL implementation isn't clipping correctly, (b) the viewport transformation is off by a pixel or so, causing the clipped edge to fall just outside the window, or (c) something else is wrong. :-)

You could check (b) by squeezing your viewport transformation down to a slightly smaller rectangle inset in the window. The other possibilities are good fodder for bug reports.

Allen
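The viewport check suggested above amounts to shrinking the viewport rectangle by a few pixels on each side, so that clipped edges land visibly inside the window. A minimal C sketch of that computation follows; the struct and function names are hypothetical, and the result would be handed to the standard glViewport(x, y, width, height) call by the application.

```c
/* A viewport rectangle in window coordinates (origin at lower left). */
struct viewport_rect {
    int x, y, width, height;
};

/* Shrink a window-sized viewport by margin pixels on every side.
   The margin is clamped so the rectangle never gets a negative size. */
static struct viewport_rect inset_viewport(int win_width, int win_height,
                                           int margin)
{
    struct viewport_rect r;

    if (margin < 0)
        margin = 0;
    if (2 * margin > win_width)
        margin = win_width / 2;
    if (2 * margin > win_height)
        margin = win_height / 2;

    r.x = margin;
    r.y = margin;
    r.width = win_width - 2 * margin;
    r.height = win_height - 2 * margin;
    return r;
}
```

For a 640x480 window and a margin of 8, this gives the rectangle (8, 8, 624, 464). If every clipped polygon then shows its edge along the inset boundary, the missing lines at the window border point towards an off-by-one in the viewport mapping or rasterization rather than a clipping error.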
From: Keith W. <ke...@va...> - 2000-06-19 22:32:28

Allen Akin wrote:
> If you're not getting the line at the window border on *all* the
> clipped polygons, it's possible that (a) the OpenGL implementation
> isn't clipping correctly, (b) the viewport transformation is off by a
> pixel or so, causing the clipped edge to fall just outside the window,
> or (c) something else is wrong. :-)
>
> You could check (b) by squeezing your viewport transformation down to
> a slightly smaller rectangle inset in the window. The other
> possibilities are good fodder for bug reports.

I think the answer is probably (c), and related to pixelization of the line. The software rasterizer might be doing some late, viewport-space primitive rejection, but Thomas is testing with the G400, so that shouldn't be an issue.

The quads are closed (per the spec) when clipped by userclip planes, which use basically the same clipping algorithm.

Keith
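Why a clipped polygon gains an edge along the clip boundary can be seen from a single pass of polygon clipping against one plane. The C sketch below is a generic Sutherland-Hodgman style pass in 2D against the half-plane x <= xmax; it is an illustration only, not the code in clip_funcs.h, and all names are made up.

```c
#include <stddef.h>

struct vec2 {
    double x, y;
};

/* Clip a polygon against the half-plane x <= xmax. Vertices are read from
   in[0..n-1] in order; the clipped polygon is written to out, which must
   have room for 2 * n vertices. Returns the number of vertices written. */
static size_t clip_to_xmax(const struct vec2 *in, size_t n, double xmax,
                           struct vec2 *out)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        struct vec2 a = in[i];
        struct vec2 b = in[(i + 1) % n];
        int a_in = a.x <= xmax;
        int b_in = b.x <= xmax;

        if (a_in)
            out[count++] = a;           /* keep vertices that are inside */
        if (a_in != b_in) {             /* edge crosses the boundary */
            double t = (xmax - a.x) / (b.x - a.x);
            struct vec2 p;
            p.x = xmax;
            p.y = a.y + t * (b.y - a.y);
            out[count++] = p;           /* new vertex on the boundary */
        }
    }
    return count;
}

/* Count edges of the closed outline that lie exactly on the boundary. */
static size_t edges_on_boundary(const struct vec2 *poly, size_t n,
                                double xmax)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        struct vec2 a = poly[i];
        struct vec2 b = poly[(i + 1) % n];
        if (a.x == xmax && b.x == xmax)
            count++;
    }
    return count;
}
```

Clipping the triangle (0,0), (2,0), (0,2) at xmax = 1 yields the quad (0,0), (1,0), (1,1), (0,2). The edge from (1,0) to (1,1) did not exist in the original triangle; it closes the polygon along the clip boundary, and it is this edge that is outlined when the clipped polygon is drawn in GL_LINE polygon mode. Clipping the segments of a line loop individually produces no such edge.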