Thread: [Plib-devel] [PATCH] crease for ac3d files and speedup
From: Mathias <Mat...@gm...> - 2004-10-08 06:24:05
Attachments:
crease.diff
|
Hi all,

During the past few days I have done some profiling on flightgear. One final outcome of that work was that most of the time is spent in ssg routines (yes, ssg, not OpenGL!!). The problem is the huge number of small leaf nodes we get from the ac file loader. The vertex optimization pass makes this a bit better, but there was still room for improvement.

I have now changed the ac3d file loader to first collect all surface elements of a leaf object and then split them into a minimum number of leaf nodes according to material and colour properties.

For my local setup (Radeon 9100, Athlon XP 2400+) this gave me a framerate speedup of up to 40% (16 fps -> 26 fps, for the c172 rendered in a window so as not to hit the ancient Radeon's fill rate limit). I would guess that the average speedup is about 25%-30%.

As a /side effect/ it was now easy to implement the crease tag for ac3d files. Ac3d models now look the same as in ac3d.

Attached is a patch against today's plib anoncvs.

Is somebody there with cvs write access to plib? Can somebody apply this patch, please?

Greetings

Mathias

-- 
Mathias Fröhlich, email: Mat...@gm...
|
From: Steve B. <sjb...@ai...> - 2004-10-08 13:44:42
|
Mathias Fröhlich wrote:
> Hi all,
>
> During the past few days I have done some profiling on flightgear. One final
> outcome of that work was that most of the time is spent in ssg routines (yes, ssg,
> not OpenGL!!).

That's because you have a graphics card that executes almost 100% of OpenGL functions off on separate hardware. When you eliminate OpenGL, what other functions could possibly consume much time? It has to be SSG, so that comes as no surprise at all.

> The problem is the huge number of small leaf nodes we get
> from the ac file loader. The vertex optimization pass makes this a bit
> better, but there was still room for improvement.

Yep. It's a perennial problem for flight simulators.

---------------------------- Steve Baker -------------------------
HomeEmail: <sjb...@ai...>    WorkEmail: <sj...@li...>
HomePage : http://www.sjbaker.org
Projects : http://plib.sf.net    http://tuxaqfh.sf.net
           http://tuxkart.sf.net http://prettypoly.sf.net
-----BEGIN GEEK CODE BLOCK-----
GCS d-- s:+ a+ C++++$ UL+++$ P--- L++++$ E--- W+++ N o+ K? w--- !O M-
V-- PS++ PE- Y-- PGP-- t+ 5 X R+++ tv b++ DI++ D G+ e++ h--(-) r+++ y++++
-----END GEEK CODE BLOCK-----
|
From: Wolfram K. <w_...@rz...> - 2004-10-08 15:51:16
|
Mathias wrote:
>
>Hi all,
>
>During the past few days I have done some profiling on flightgear. One final
>outcome of that work was that most of the time is spent in ssg routines (yes, ssg,
>not OpenGL!!). The problem is the huge number of small leaf nodes we get
>from the ac file loader.

Yes. Actually some loaders, like the MDL loader, are even worse: if you save a normal CFS1 model to an ASCII format directly after reading it from MDL, it can well be a GB (not a MB) in size! My fix is simply to call the function that merges all hierarchy nodes and then save as ssg. The file size is then normally under a MB. The main reason is simply that there are about 1000 times more nodes than needed. Of course this HUGE overhead slows down rendering a lot.

>As a /side effect/ it was now easy to implement the crease tag for ac3d files.
>Ac3d models now look the same as in ac3d.

Great!

>Is somebody there with cvs write access to plib? Can somebody apply this patch,
>please?

If no one else does it, remind me again in a week.

> Greetings
>
> Mathias

Bye bye,
Wolfram.
|
From: Mathias <Mat...@gm...> - 2004-10-18 18:44:45
Attachments:
crease-updated.diff
|
Hi Wolfram,

On Friday 08 October 2004 17:52, Wolfram Kuss wrote:
> If no one else does it, remind me again in a week.

So, I will do this now :)

Attached is the latest patch again, just so that you don't need to search for it! :)

Thanks in advance.

Mathias

-- 
Mathias Fröhlich, email: Mat...@gm...
|
From: Mathias <Mat...@gm...> - 2004-10-08 16:55:09
|
On Friday 08 October 2004 15:43, Steve Baker wrote:
> Mathias Fröhlich wrote:
> > During the past few days I have done some profiling on flightgear. One final
> > outcome of that work was that most of the time is spent in ssg routines (yes,
> > ssg, not OpenGL!!).
>
> That's because you have a graphics card that executes almost 100% of OpenGL
> functions off on separate hardware. When you eliminate OpenGL, what other
> functions could possibly consume much time? It has to be SSG, so that
> comes as no surprise at all.

Yep, this is right for functions like ssgLeaf::draw_geometry or ssgLeaf::draw. I had expected to see them at the top of the profile list.

But not for functions that only operate on ssg's own data structures, like ssgVtxTable::getNumColours(), ssgBranch::recalcBSphere(), ssgEntity::dirtyBSphere(), ssgVtxTable::getNumTexCoords(), ssgVtxTable::getNumVertices() and ssgVtxTable::getNumNormals(), which are all in the top ten of that profile run.

They account for that much time because of the number of leaf nodes with only a few vertices in each node.

Greetings

Mathias

-- 
Mathias Fröhlich, email: Mat...@gm...
|
From: Steve B. <sjb...@ai...> - 2004-10-08 19:09:26
|
Mathias Fröhlich wrote:

> But not for functions that only operate on ssg's own data structures, like
> ssgVtxTable::getNumColours(), ssgBranch::recalcBSphere(),

If recalcBSphere is being called in an otherwise static scene, there is something very seriously wrong, either with the application or with SSG. The system is only supposed to recalc the bounding sphere if it's been 'dirtied' by changing a low level vertex or moving a Transform node.

The fact that:

> ssgEntity::dirtyBSphere(),

...is also in your top ten list says there is certainly some kind of a bug. The only time the bsphere should be dirtied is for the (presumably rare) moving models and things that move vertices around outside of SSG. Since moving models are usually put into the tree very near the top in most applications, those are very unlikely to cause much CPU time consumption.

This needs to be tracked down. Recalculating bounding spheres needlessly will *cripple* performance.

---------------------------- Steve Baker -------------------------
|
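To illustrate the behaviour Steve describes, here is a minimal sketch of a lazily recomputed bounding sphere with a dirty flag. It is not SSG's implementation: the `Node` class, its members and the `recalcCount()` counter are hypothetical and exist only to show that a static scene should trigger one recalculation, and that dirtying a leaf invalidates its ancestors.

```cpp
#include <cmath>
#include <vector>

struct Sphere { float x, y, z, radius; }; // radius < 0 means "empty"

// Grow sphere a so that it also encloses sphere b.
static void extend(Sphere& a, const Sphere& b) {
    if (b.radius < 0) return;
    if (a.radius < 0) { a = b; return; }
    float dx = b.x - a.x, dy = b.y - a.y, dz = b.z - a.z;
    float d = std::sqrt(dx * dx + dy * dy + dz * dz);
    if (d + b.radius <= a.radius) return;        // a already contains b
    if (d + a.radius <= b.radius) { a = b; return; }
    float r = (d + a.radius + b.radius) * 0.5f;  // radius of enclosing sphere
    float t = (r - a.radius) / d;                // shift centre towards b
    a.x += dx * t; a.y += dy * t; a.z += dz * t; a.radius = r;
}

// Hypothetical scene-graph node, for illustration only.
class Node {
public:
    void addChild(Node* child) {
        child->parent_ = this;
        children_.push_back(child);
        markDirty();
    }
    // A leaf would call this when its vertices change.
    void setLocalSphere(const Sphere& s) { local_ = s; markDirty(); }

    // Invalidate this node and its ancestors. Stops at the first
    // ancestor that is already dirty, because its ancestors are too.
    void markDirty() {
        for (Node* n = this; n != nullptr && !n->dirty_; n = n->parent_)
            n->dirty_ = true;
    }
    // Recalculate only when something below was dirtied.
    const Sphere& sphere() {
        if (dirty_) {
            ++recalcs_;
            cached_ = local_;
            for (Node* c : children_) extend(cached_, c->sphere());
            dirty_ = false;
        }
        return cached_;
    }
    int recalcCount() const { return recalcs_; }

private:
    Node* parent_ = nullptr;
    std::vector<Node*> children_;
    Sphere local_{0.f, 0.f, 0.f, -1.f};
    Sphere cached_{0.f, 0.f, 0.f, -1.f};
    bool dirty_ = true;
    int recalcs_ = 0;
};
```

In a scheme like this, a recalculation routine that shows up high in a profile of a static scene suggests that something keeps calling the equivalent of `markDirty()` every frame, which is the kind of bug being suspected here.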
From: Mathias <Mat...@gm...> - 2004-10-09 08:11:53
|
On Friday 08 October 2004 21:08, Steve Baker wrote:
> Mathias Fröhlich wrote:
> > But not for functions that only operate on ssg's own data structures, like
> > ssgVtxTable::getNumColours(), ssgBranch::recalcBSphere(),
>
> If recalcBSphere is being called in an otherwise static scene, there
> is something very seriously wrong, either with the application or
> with SSG.

Very true. But one thing at a time ... :)

> The fact that:
> > ssgEntity::dirtyBSphere(),
>
> ...is also in your top ten list says there is certainly some kind of
> a bug. The only time the bsphere should be dirtied is for the
> (presumably rare) moving models and things that move vertices
> around outside of SSG. Since moving models are usually put into
> the tree very near the top in most applications, those are very
> unlikely to cause much CPU time consumption.
>
> This needs to be tracked down. Recalculating bounding spheres
> needlessly will *cripple* performance.

Yep.

Better support for the crease tag and the speedup gained from that change make the patch worthwhile anyway.

Greetings

Mathias

-- 
Mathias Fröhlich, email: Mat...@gm...
|
From: Steve B. <sjb...@ai...> - 2004-10-09 14:38:15
|
Mathias Fröhlich wrote:

>>This needs to be tracked down. Recalculating bounding spheres
>>needlessly will *cripple* performance.
>
> Yep.
>
> Better support for the crease tag and the speedup gained from that change
> make the patch worthwhile anyway.

Does this patch take care not to merge things if (for example) there is a comment or object name attached to one or other of the nodes (or to a node that is the parent of one leaf node and not the other)?

Some applications use the comment or model name to tell them to turn that node in the scene graph into an LOD or an animation node.

---------------------------- Steve Baker -------------------------
|
From: Mathias <Mat...@gm...> - 2004-10-11 06:15:51
|
Hi,

Sorry for the long delay, I was not online that weekend ...

On Saturday 09 October 2004 16:37, Steve Baker wrote:
> Does this patch take care not to merge things if (for example) there is
> a comment or object name attached to one or other of the nodes (or to a
> node that is the parent of one leaf node and not the other)?
>
> Some applications use the comment or model name to tell them to turn
> that node in the scene graph into an LOD or an animation node.

Yep.

All of this is done per ac3d object, so different objects are never merged together. Stripping away unused transforms and similar things is left to the old ssgFlatten code and its helpers.

One part of the patch is a slight modification to the strip function that prevents branches with names from being stripped away. This could happen before, and it happened occasionally with a still unpublished flightgear model I have on my private disk. Now fixed ...

I did that patch with flightgear in mind. Flightgear uses the object names for animations. Looking at all the happy people who tried that patch with flightgear during that weekend and saw improved framerates, I think this is ok :)

Greetings

Mathias

-- 
Mathias Fröhlich, email: Mat...@gm...
|
From: Mathias <Mat...@gm...> - 2004-10-09 08:30:09
Attachments:
crease-updated.diff
|
Hi,

While digging into that adf.ac stuff I have made some minor adjustments to that patch, such as not storing texture coordinate arrays when there is no texture attached to the object.

The updated patch is attached.

Wolfram, this is the current one :)

Greetings and Thanks

Mathias

-- 
Mathias Fröhlich, email: Mat...@gm...
|
From: Erik H. <er...@eh...> - 2004-11-30 18:18:46
|
Mathias Fröhlich wrote:
> Hi Wolfram,
>
> On Friday 08 October 2004 17:52, Wolfram Kuss wrote:
>
>>If no one else does it, remind me again in a week.
>
> So, I will do this now :)
>
> Attached is the latest patch again, just so that you don't need to search for it!

Hmm, please?

http://www.a1.nl/~ehofman/fgfs/download/crease-updated.diff

Erik
|
From: Wolfram K. <w_...@rz...> - 2004-12-01 02:37:53
|
Erik wrote:

>Hmm, please?
>
>http://www.a1.nl/~ehofman/fgfs/download/crease-updated.diff
>
>Erik

Sorry it took so long, I finally tried to do it. When I use the options c, n or e, it says the diff file is "garbage". When I use u, it says that line 53 does not contain a valid file name and asks me which file to patch. So:

- Is "u" the correct option?
- Do I assume correctly that it should find all file names from the diff file?
- I do this in the directory PLIB, one level above src. Is this correct?

Bye bye,
Wolfram.
|
From: Melchior F. <mf...@ao...> - 2004-12-01 06:47:59
|
* Wolfram Kuss -- Wednesday 01 December 2004 03:38:
> Erik wrote:
> > http://www.a1.nl/~ehofman/fgfs/download/crease-updated.diff

> - I do this in the directory PLIB, one level above src. Is this
> correct?

Yes, you go to the outermost plib directory, the one that contains src/. Then I always do a test run that doesn't alter the code:

  $ patch -p0 --dry-run < /path/to/the.diff

If everything looks ok, remove the --dry-run:

  $ patch -p0 < /path/to/the.diff

m.
|
From: Wolfram K. <w_...@rz...> - 2004-12-01 08:29:29
|
Thanks, "-p0" works. I have now committed the patch. Bye bye, Wolfram. |
From: Erik H. <er...@eh...> - 2004-12-01 08:37:58
|
Wolfram Kuss wrote:
> Thanks, "-p0" works. I have now committed the patch.

This is really great! Thanks Wolfram.

Erik
|
From: Frederic B. <fre...@fr...> - 2004-12-01 08:59:07
|
Wolfram Kuss wrote:

> Thanks, "-p0" works. I have now committed the patch.

Thank you Wolfram, we are eagerly waiting for the backup CVS server to be updated.

Can I abuse your patience and ask you to commit this joystick patch?

Thanks,
-Fred

cvs -z4 -q diff -u jsWindows.cxx (in directory I:\FlightGear\cvs\plib\src\js\)
Index: jsWindows.cxx
===================================================================
RCS file: /cvsroot/plib/plib/src/js/jsWindows.cxx,v
retrieving revision 1.5
diff -u -r1.5 jsWindows.cxx
--- jsWindows.cxx 21 Sep 2004 11:45:55 -0000 1.5
+++ jsWindows.cxx 7 Oct 2004 21:47:48 -0000
@@ -126,7 +126,7 @@
 // X,Y,Z,R,U,V,POV - not necessarily the first n of these.
 if ( os->jsCaps.wCaps & JOYCAPS_HASPOV )
 {
- num_axes = _JS_MAX_AXES ;
+ num_axes = _JS_MAX_AXES_WIN ;
 min [ 7 ] = -1.0 ; max [ 7 ] = 1.0 ; // POV Y
 min [ 6 ] = -1.0 ; max [ 6 ] = 1.0 ; // POV X
 }
|
From: Wolfram K. <w_...@rz...> - 2004-12-01 22:31:48
|
On Wed, 1 Dec 2004 09:57:47 +0100, you wrote:

>- num_axes = _JS_MAX_AXES ;
>+ num_axes = _JS_MAX_AXES_WIN ;

It is
#define _JS_MAX_AXES_WIN 8
and _JS_MAX_AXES is 16. I have just been told that Windows theoretically supports 128 axes and that Microsoft has tested 40+. To be honest, I think 8 is really restrictive; there is almost no reserve left.

Why do you think the change is a good one?
Maybe we should change to _JS_MAX_AXES_WIN, but up the number?

Bye bye,
Wolfram.
|
From: Frederic B. <fre...@fr...> - 2004-12-02 06:51:30
|
Wolfram Kuss wrote:

>On Wed, 1 Dec 2004 09:57:47 +0100, you wrote:
>
>>- num_axes = _JS_MAX_AXES ;
>>+ num_axes = _JS_MAX_AXES_WIN ;
>
>It is
>#define _JS_MAX_AXES_WIN 8
>and _JS_MAX_AXES is 16. I have just been told that Windows
>theoretically supports 128 axes and that Microsoft has tested 40+. To
>be honest, I think 8 is really restrictive; there is almost no reserve
>left.
>
>Why do you think the change is a good one?
>Maybe we should change to _JS_MAX_AXES_WIN, but up the number?

Short answer: the whole Windows plib code doesn't actually work with 16, but works with 8.

Longer answer: already posted on 8 October 2004 ( http://sourceforge.net/mailarchive/forum.php?thread_id=5726756&forum_id=4479 )

> the last change to jsWindows.cxx by /smokydiamond/ here :
> http://cvs.sourceforge.net/viewcvs.py/plib/plib/src/js/jsWindows.cxx?r1=1.4&r2=1.5
>
> is giving me an error when running FlightGear. I am getting tons of
> messages like this :
>
> PLIB_JS: Wrong num_axes. Joystick input is now invalid
>
> indicating that the program reaches line 251. This is the default case of a
> switch on the number of axes, and 16, which is the value num_axes is
> initialized with from _JS_MAX_AXES on line 129, is not listed as a valid case.
>
> Could someone restore line 129 with its previous wording :
>
> num_axes = _JS_MAX_AXES_WIN
>
> or include 'case 16:' ( or 'case _JS_MAX_AXES:' ) in the switch beginning
> at line 204

So the problem is not what Windows can do, but how the plib code is actually written, with numeric literals in it rather than defined symbols and code that can use them.

-Fred
|
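The failure mode Fred describes can be sketched as follows. This is a hypothetical reduction, not the code in jsWindows.cxx: `fillAxes`, `kMaxAxes` and `kMaxAxesWin` are invented names, and only the two values (16 and 8) and the "default case rejects unknown counts" behaviour are taken from the thread.

```cpp
// Values quoted in the thread for _JS_MAX_AXES and _JS_MAX_AXES_WIN.
constexpr int kMaxAxes    = 16;
constexpr int kMaxAxesWin = 8;

// Copies num_axes raw values into out using a fall-through switch with
// literal case labels. Returns the number of axes copied, or -1 when
// num_axes has no matching case (the "Wrong num_axes" path).
int fillAxes(int num_axes, const float* raw, float* out) {
    switch (num_axes) {
    case 8: out[7] = raw[7];
        // fall through
    case 7: out[6] = raw[6];
        // fall through
    case 6: out[5] = raw[5];
        // fall through
    case 5: out[4] = raw[4];
        // fall through
    case 4: out[3] = raw[3];
        // fall through
    case 3: out[2] = raw[2];
        // fall through
    case 2: out[1] = raw[1];
        // fall through
    case 1: out[0] = raw[0];
        return num_axes;
    default:
        // 16 lands here because no 'case 16:' exists.
        return -1;
    }
}
```

With a switch written like this, raising the constant alone cannot help: either every new count needs its own case label, or the switch has to be replaced by a loop bounded by the constant, which is the point of the remark about numeric literals.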
From: Wolfram K. <w_...@rz...> - 2004-12-03 09:04:29
|
Ok, committed. Bye bye, Wolfram. |
From: Norman V. <nh...@ca...> - 2004-12-03 09:36:01
|
> -----Original Message-----
> From: pli...@li...
> [mailto:pli...@li...]On Behalf Of Wolfram Kuss
> Sent: Friday, December 03, 2004 4:05 AM
> To: pli...@li...
> Subject: Re: [Plib-devel] Re: [PATCH] crease for ac3d files and speedup
>
>
> Ok, committed.
>
> Bye bye,
> Wolfram.

The current PLIB tarball has been updated to reflect these changes:

http://plib.sourceforge.net/dist/current.tgz
638429 Dec 3 01:19 current.tgz

Norman
|