From: Renk Thorsten <email@example.com>
To: FlightGear developers discussions <firstname.lastname@example.org>
Sent: Wednesday, 19 September 2012, 3:59 AM
[Flightgear-devel] Some ideas for better performance
Unfortunately, due to personal reasons, I won't have much coding time for the rest of the year, which means that I probably won't have a chance to do some things I sort of promised to do (make more shaders work in the atmospheric light scattering framework, help make atmospheric light scattering work in Rembrandt...). If anyone else wants to have a go at converting model shaders or the runway shader, I can still try to help, but I've started up Flightgear just once during the last 4 weeks, so coding is just not something I can do right now.
Anyway - there are some ideas and observations which I made and wanted to follow up on, and I thought I'd just put them up here so that they can be discussed.
I'm interested in making the appearance of the scenery more realistic with improved light and procedural texturing, i.e. things that end up in the shaders. At least on my system, if I run any eye candy features, the bottleneck is shaders: I can render a normal scene from the ufo without any shader effects at about 200 fps, whereas rendering the urban effect on the whole screen drives me down to ~5 fps. I'm sure the various statements made on this list that our performance is CPU limited are based on something, it's just nothing I am able to experience; on my system it's shaders, more specifically the fragment shader. So, that's what I would like to address - making Flightgear faster while expensive shaders are on.
My general understanding of the situation
There's a tendency for all the high quality eye candy to end up in the fragment shaders - see urban effect, water sine wave shader or my procedural texturing. In some sense that's good, because that decouples performance from the visibility range (since the number of pixels doesn't change), but it may also be bad if that constant performance is too low. I don't know how it is for others, but when I switch urban effect on, it relies on there not being too many patches of urban terrain in my field of view - if ~1/5 of the scene is urban, I still get decent performance, but if all I see is urban it becomes an unusable, single-digit fps experience. I guess that's part of the reason Stuart spent so much time on random buildings... My point being, something like the urban effect doesn't really generalize: we can't put all the expensive stuff into the fragment shader if we can anticipate that it will fill a lot of the scene.
Shuffling some load into the vertex shader helps a lot in the near zone (where a triangle has many pixels, and they're all filled based on three expensive operations at the corners) but loses out at the far edge of the visual range (where several vertices may fall into a single pixel). At least on my system, some things only run with usable speed (provided the visibility doesn't exceed a limit) because there's load in the vertex shader, so there's virtue in making the vertex shader do stuff.
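The near-zone/far-zone trade-off can be illustrated with a deliberately crude cost model. This is only a sketch: the function name `cheaper_stage` and the idea of counting one "operation" per vertex or per pixel are assumptions made for illustration, and vertex sharing between neighbouring triangles is ignored.

```python
def cheaper_stage(pixels_per_triangle, op_cost=1.0):
    """Say where one expensive operation is cheaper to evaluate.

    Hypothetical cost model: done per vertex, a triangle costs three
    evaluations (the result is interpolated across its pixels); done per
    fragment, it costs one evaluation for every covered pixel.
    """
    vertex_cost = 3 * op_cost
    fragment_cost = pixels_per_triangle * op_cost
    return "vertex" if vertex_cost < fragment_cost else "fragment"


# A large triangle close to the viewer covers many pixels.
near = cheaper_stage(pixels_per_triangle=500)
# Far away, several triangles may fall into a single pixel.
far = cheaper_stage(pixels_per_triangle=0.3)
```

Under these assumptions the break-even point is a triangle covering about three pixels, which is why the same shader can be a win nearby and a loss near the edge of the visual range.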
My understanding of Rembrandt is that essentially all operations go to the fragment stage since geometry is rendered in one initial pass and then buffered. If the situation is fragment-shader limited in the default scheme, my guess is that it will be even more fragment-shader limited in deferred rendering. But admittedly what I have to say about deferred rendering is a bit of guesswork.
Clouds are a bit of an unrelated topic, as they can be very heavy on the vertex shader (all the billboard rotations need to be computed...).
So, ideally I'd like to
* make use of the speedup vertex shading operations can provide in the near zone without buying into their disadvantages in the far zone
* speed the fragment shaders up as much as possible
Schemes to speed up vertex shading (and what doesn't work)
The obvious solution to speed up a vertex shader is to drop/pass through vertices, or to reduce the number of vertices up front. I understand none of this is much of an issue for Rembrandt where the geometry pass doesn't take the largest share of time, but I believe deferred rendering isn't unconditionally superior - it definitely rules for opaque objects blocking each other and multiple light sources, but for a terrain mesh seen from above, I believe the ability to do work in the vertex shader and interpolate would under general conditions translate to a performance advantage.
I've looked into a few such schemes (in theory and for some in practice):
* Dropping vertices from the mesh to simplify it for large distances can, if at all, be done only with a geometry shader, because the vertex shader has no idea what the surroundings look like. Geometry shaders seem to be so expensive to run that this scheme is basically dead up-front.
* 'Passing through' vertices which are later going to be invisible works in some situations (in heavy ground fog I got a 20% performance boost) - the problem is coming up with a criterion to tag these vertices early on which isn't so expensive that the performance gain is eaten up by evaluating it
* The terrain mesh from above doesn't really respond well to standard techniques like depth buffering or backface culling - from high enough, we see the mesh, almost all of the mesh and almost no backface of the mesh, and the few back sides of hills don't really make the difference
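For the 'passing through' idea, one cheap tagging criterion is a plain distance comparison. The sketch below assumes an exponential fog model with the visibility used as the decay length, which may not match the fog model of the actual shaders; the names `fog_transmittance`, `full_shading_cutoff` and the 2% threshold are hypothetical.

```python
import math


def fog_transmittance(distance_m, visibility_m):
    # Assumed exponential fog model; the real shaders may use another one.
    return math.exp(-distance_m / visibility_m)


def full_shading_cutoff(visibility_m, threshold=0.02):
    """Distance beyond which less than `threshold` of the surface shows
    through the fog. It depends only on visibility, so it can be computed
    once per frame instead of once per vertex."""
    return -visibility_m * math.log(threshold)


def needs_full_shading(distance_m, cutoff_m):
    # Per-vertex test: a single comparison, no exp() or log() involved.
    return distance_m < cutoff_m


cutoff = full_shading_cutoff(visibility_m=5000.0)
```

The point of the design is that the per-vertex part is a single comparison, so evaluating the criterion costs far less than the work it saves.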
So, I believe any solution must come from outside the rendering pipeline: once vertices are in, it's too late. What makes most sense to me is a pre-computed scheme in which we have a high, medium and lowres version of terrain tiles (or just high and low) on disk and successively load the LOD stages of the same tile as we approach. Since the decision what vertex to cull and how to reconnect the mesh afterwards is probably computationally not cheap, a pre-computed scheme makes more sense to me than a runtime LOD scheme for the mesh.
75% of vertices are typically beyond 0.5 * visibility, but they're increasingly heavily fogged. We just see the outline of distant mountains and some topology. So in a lowres version of terrain, we could drop many/all landclass vertices and just give the whole tile a single texture, because it's going to be more than 75% fogged in any case. Topology could be reduced by dropping 80% of the vertices based on a criterion that they do not mark a sharp gradient in slope (there's lots of literature how to drop vertices in LOD calculations). We'd end up with the vertex shader load reduced by a huge margin, which means that the vertex shader does only work in the near zone where it's useful.
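The 75% figure is what simple geometry gives if vertices are assumed to be spread uniformly over the visible disc. The short sketch below works that out, along with the vertex count that would remain if a lowres version with 80% of the vertices dropped were used beyond half the visibility; the function name and that particular switch-over distance are assumptions for illustration.

```python
def fraction_beyond(fraction_of_visibility):
    """Share of the visible disc's area lying beyond
    r = fraction_of_visibility * R, assuming uniform vertex density."""
    return 1.0 - fraction_of_visibility ** 2


far_share = fraction_beyond(0.5)      # area ratio 1 - 0.5**2
near_share = 1.0 - far_share

# Keep every near vertex, but only 20% of the far ones.
remaining = near_share + far_share * (1.0 - 0.8)
```

Under these assumptions roughly 40% of the original vertices would remain, almost two thirds of them in the near zone.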
So, my question to scenery people - would it be possible to run a processing step in which we protect all vertices at the tile boundary (to avoid creating gaps) and cull a large number of vertices inside the tile, so that we pre-generate hires and lowres LOD levels? And could we structure the terrain tile manager such that it supports such a LOD scheme by loading first the lowres and then the hires version?
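A minimal sketch of such a culling step, under a strong simplifying assumption: the tile is treated as a regular height grid, whereas the real terrain tiles are irregular meshes. The function name `select_vertices` and the second-difference criterion are illustrative choices, and re-triangulating the surviving vertices is not shown.

```python
def select_vertices(heights, slope_change_threshold):
    """Return the set of (row, col) grid vertices to keep in the lowres tile.

    Boundary vertices are always protected so neighbouring tiles still
    match; interior vertices survive only where the slope changes sharply.
    """
    rows, cols = len(heights), len(heights[0])
    keep = set()
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                keep.add((r, c))  # protect the tile boundary
                continue
            # Discrete second differences measure the change in slope.
            d2_row = heights[r - 1][c] - 2 * heights[r][c] + heights[r + 1][c]
            d2_col = heights[r][c - 1] - 2 * heights[r][c] + heights[r][c + 1]
            if max(abs(d2_row), abs(d2_col)) > slope_change_threshold:
                keep.add((r, c))
    return keep


# A flat 5x5 tile with a single 10 m peak in the middle.
tile = [[0.0] * 5 for _ in range(5)]
tile[2][2] = 10.0
kept = select_vertices(tile, slope_change_threshold=5.0)
```

In this toy tile the peak and its four direct neighbours are kept along with the boundary, while the flat interior corners are dropped.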
Schemes to speed fragment shaders up
The obvious candidates to drop pixels are things obscuring each other. Partially this is automatically taken care of in Rembrandt I think, partially not because transparent objects are rendered the same way in default and Rembrandt.
Now, I've identified two most promising candidates: The instrument panel blocks view to basically anything else, and clouds block a lot of terrain when seen from above or a lot of each other when at layer altitude.
The instrument panel issue is tricky to get really right due to near and far camera issues, but I typically get 70% of its pixels with a simple rectangular mask. I see performance boosts of 50% and more, and the really neat thing is, the information that the instrument panel blocks all scenery can be used to drop fragments all over the place, independent of transparency issues - I can drop trees, scenery, clouds, you name it. It's easy. It requires 3 parameters to be defined for each airplane (the obscuring rectangle) and a per-frame routine translating that into screen coordinates based on current view, zoom and screen resolution.
Would it be worth coding this properly? Apparently any stencil buffer based solution is hugely complicated because the panel is always in the near camera whereas what it obscures is in the far camera - so a simple solution may be superior.
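The per-frame routine could look roughly like the sketch below. Everything about the parameterization is an assumption: here the rectangle is stored as four normalized device coordinates (-1..1) measured at a default field of view, which is not necessarily the 3-parameter form mentioned above, and only zoom and resolution are handled while view rotation is left out. The function names are hypothetical.

```python
import math


def panel_rect_pixels(rect_ndc, default_fov_deg, fov_deg, width, height):
    """Convert a panel rectangle (x0, y0, x1, y1) in normalized device
    coordinates at the default field of view into pixel coordinates at the
    current zoom, clamped to the screen. Origin is the lower left corner."""
    half_default = math.radians(default_fov_deg) / 2
    half_current = math.radians(fov_deg) / 2
    # Narrowing the field of view scales the image about the screen centre.
    zoom = math.tan(half_default) / math.tan(half_current)
    x0, y0, x1, y1 = (v * zoom for v in rect_ndc)

    def to_pixels(x, y):
        px = min(max((x + 1) / 2 * width, 0.0), float(width))
        py = min(max((y + 1) / 2 * height, 0.0), float(height))
        return px, py

    px0, py0 = to_pixels(x0, y0)
    px1, py1 = to_pixels(x1, y1)
    return px0, py0, px1, py1


def is_masked(px, py, rect_px):
    # A fragment shader would discard fragments for which this is true.
    x0, y0, x1, y1 = rect_px
    return x0 <= px <= x1 and y0 <= py <= y1


# Hypothetical panel covering the bottom 40% of a 1920x1080 screen.
rect = panel_rect_pixels((-1.0, -1.0, 1.0, -0.2), 55.0, 55.0, 1920, 1080)
zoomed = panel_rect_pixels((-1.0, -1.0, 1.0, -0.2), 55.0, 27.5, 1920, 1080)
```

In a shader the four pixel values would be passed in as uniforms and compared against the fragment's window position, so the per-fragment cost is a handful of comparisons.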
Clouds blocking terrain or other clouds runs into transparency issues which are a bit trickier.
* I've tried a scheme in which I render the opaque bits of clouds with depth buffering and the transparent bits without, but rendering two passes of clouds is too slow already, so that's not feasible.
But - what if we did an early pass rendering simple proxies (one-layered discs, rectangles) as a mask in front of the scenery? If these discs are put into the scenery at cloud creation time such that they cover most of the opaque bits of a cloud, they would result in a depth mask against which we could decide what scenery and what other clouds to render. I lack the ability to actually pull this off, but I think it might just work. It galls me to spend an enormous amount of performance in broken cloud cover to render all the scenery with all bells and whistles, then burn an equally large amount rendering all the clouds which obscure 80% of the terrain, and then again a large amount to obscure 60% of the scene with the instrument panel. It seems to me we should be getting by with less than half the work.
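The proxy idea can be sketched on the CPU as a small depth mask. On the GPU this would correspond to a depth-only pre-pass; the code below is only an illustration of the logic, with hypothetical names and discs given directly in mask pixel coordinates.

```python
def build_depth_mask(width, height, discs):
    """Rasterize opaque proxy discs into a depth mask.

    discs: list of (cx, cy, radius, depth) in mask pixels, where a smaller
    depth is nearer to the viewer. Uncovered pixels stay at infinity.
    """
    mask = [[float("inf")] * width for _ in range(height)]
    for cx, cy, radius, depth in discs:
        for y in range(max(0, int(cy - radius)), min(height, int(cy + radius) + 1)):
            for x in range(max(0, int(cx - radius)), min(width, int(cx + radius) + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    mask[y][x] = min(mask[y][x], depth)
    return mask


def is_hidden(mask, x, y, depth):
    # Anything farther away than the proxy at this pixel can be skipped.
    return depth > mask[y][x]


# One cloud proxy at 1000 m in the middle of an 8x8 mask.
mask = build_depth_mask(8, 8, [(4, 4, 2, 1000.0)])
```

Terrain at 5000 m behind the disc would be skipped, while anything nearer than the proxy, or outside its footprint, would still be rendered.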
Anyway, those are the ideas I've been thinking about of late - maybe they're helpful.