Re: [Algorithms] portal engines in outdoor environments
From: jason w. <jas...@po...> - 2000-08-20 21:33:09
> That's not the performance hit. You are submitting tris like this:
>
> BeginChunk
> Draw tester tris
> EndChunk
> if ( previous chunk rendered )
> {
>     Draw real tris
> }

Nope, think more like:

begin(indexed_array);
setboundshint(my_boundingvolume);
draw(my_iarray);
end(indexed_array);

It's handed off as a single, separable transaction.. the hint merely allows the hardware to quickly reject the entire array *if* it's obvious it's hidden.. I would think this happens fairly often, like when a character model is behind a wall, for example.

So what you're not getting is that the *if* is _not_ a blocking *if*. It's just a hint.. the hardware can deal with the hint in many ways.. it's true that it would work best in a hierarchical z pipeline, but it should still work in the typical one. How that z information gets relayed back to the rejection block is an open question.. but I can think of several ways in a typical architecture.. it's caching scanlines anyhow, so it could do something like relay the maximal value for every 4 z's in the scanline being unloaded from cache back to the rejection block, where the rejection block has its own low res local cache. The details of how this works could take many different forms.. the point being that you only need delayed z info, and that having the hint processed on chip means that you can do it inside a single frame instead of relying on a previous frame.

> The problem is that the if() is being done at the very start of the pipeline
> (i.e. the AGP bus - any later and you lose most of the gain),

Nope.. you gain fill rate by reduced depth complexity. You could gain effective polygon bandwidth as well.

> but it needs
> to know the results of the rasterisation & Z-test of all the pixels in the
> tester tris.

No.. it just quickly needs to know if any z in the bounded region is further than the maximal value of the bounding volume.. or some similar conservative heuristic.

> looooong pipe between the AGP bus and pixel rasterisation).
> So your pipeline
> is going to be completely empty between those two points. That is a huge

Nope.. the pipeline is going to be full with the previous array, unless it was rejected and the chip didn't have the start of the next stream prefetched... depending on how long it takes to get the start of the next stream, that could trigger a stall, but I'm sure the bus access could be designed in a way to cut that down to a few cycles...

>
> You may be able to improve things by doing:
>
> for i = 0 to nObjects
> {
>     BeginChunk(i)
>     Draw tester tris
>     EndChunk
> }
> for i = 0 to nObjects
> {
>     If ( chunk(i) rendered )
>     {
>         Draw object i
>     }
> }
>

Right, this is closer.. the key being that the if needs to be just a hint instead, and that it's treated as an input to a conservative estimation that uses a delayed and lower detail copy of the z.

Another approach would be a paging/tiling architecture like the bitboys vaporware (apparently) is. In this case, triangles are rasterized in a checker pattern instead of a scanline pattern, where each page is some reasonable size, like 32x32 or so. It's a no-brainer to maintain range values for each page, enabling very fast rejection of a page as a triangle is scan converted. Better yet, with the rejection in front of some higher level surface tessellator or the T&L, it can quickly read a few range values and discard an entire array or patch.

> But that is a lot of extra hardware to store all that chunk information,
> retrieve it and so on. Lots of complexity. There are three very nice things
> about the frame-to-frame coherency scheme:
>
> (1) No extra fillrate hit. If the object is invisible, it's the same
> fillrate as your scheme. If the object is visible, then you still only draw
> it once, not twice as with your scheme.

You misunderstood.. I never said anything about drawing anything. Just a bounding volume hint, which is a very different thing.
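To make the hint idea concrete, here is a rough software sketch of a per-page range table and a conservative test against it. Everything in it is an assumption made for illustration: the names `PageZRanges`, `update_page` and `hint_is_hidden` are hypothetical, the hint has already been reduced to a screen rectangle plus its nearest depth, and larger z means farther away. It is not a description of any real chip.

```python
PAGE = 32  # page edge in pixels, following the "32x32 or so" figure above


class PageZRanges:
    """Low-res table of the farthest depth stored in each screen page.

    Assumption: larger z means farther from the camera.
    """

    def __init__(self, width, height, far=1.0):
        self.cols = (width + PAGE - 1) // PAGE
        self.rows = (height + PAGE - 1) // PAGE
        # Every page starts at the far plane, so nothing can be rejected yet.
        self.max_z = [[far] * self.cols for _ in range(self.rows)]

    def update_page(self, col, row, farthest_z):
        """Record the farthest depth in one page.

        The value may lag behind the real z-buffer. With a less-than depth
        test, stored depths only get nearer during a frame, so a stale value
        is too far rather than too near and only costs missed rejections.
        """
        self.max_z[row][col] = farthest_z

    def hint_is_hidden(self, x0, y0, x1, y1, nearest_z):
        """Conservative test for a bounds hint given as an inclusive pixel rect.

        Returns True only if the hint's nearest depth lies behind the farthest
        depth stored in every page the rect touches.
        """
        c0, c1 = max(x0 // PAGE, 0), min(x1 // PAGE, self.cols - 1)
        r0, r1 = max(y0 // PAGE, 0), min(y1 // PAGE, self.rows - 1)
        for row in range(r0, r1 + 1):
            for col in range(c0, c1 + 1):
                if nearest_z <= self.max_z[row][col]:
                    return False  # something in this page might be visible
        return True
```

A "not hidden" answer just means the array gets drawn as usual, so nothing ever blocks on the test. That is the whole difference between a hint and an `if`.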
There's plenty of existing work for converting an OBB to exact screen regions *very* quickly without resorting to scan conversion/rasterization. We're only interested in conservative values as well, since it's common for a character model to be completely separate from a set of wall polygons.

> (2) No extra triangles needed. OK, the bounding box is a pretty small number
> of tris, but what if you wanted to do this scheme with lots of smallish
> object? Might get significant then.

Again, it's a hint, not a set of triangles.. no added triangles.

> (3) (and this is the biggie) It is already supported by tons of existing,
> normal, shipped, out there hardware. Not some mystical future device. Real,
> existing ones that you have probably used.

*very* true, very good point.

> > Maybe the gain is reduce, when you think about how the rejection
> > means there are skips in the flow from AGP RAM to the cards local
> > storage/instruction bus, but as I understand it, that's all controled
> > by DMA's from the card anyhow, so not a big deal.
>
> Huge deal if the delay is longer than a few tens of clock cycles. The AGP
> FIFOs are not very big, and bubbles of the sort of size you are talking
> about are not going to be absorbed by them. So for part of your frame, the
> AGP bus will be sitting idle. And if, as is happening, you are limited by
> AGP speed, that is going to hurt quite a lot.

As long as the gap in the pipeline, as it starts fetching the next stream after a rejection, is shorter than the number of cycles it would have taken to finish the rejected stream, you win. Considering that a cycle = 4 rasterized pixels or so, that triangles typically are 4-8x that, and that arrays are typically 10 tris or more, I think it's not too much of a worry. Unless it really does take 200 cycles of a ~150MHz part to set up/redirect the DMA.

> > It doesn't rely on frame2frame conherance (which I feel is often a bad
> > thing).
> > Perhaps it would maybe be best with a heirarchical z system.
>
> Doesn't help - you still need to rasterise your pixels, which is a long way
> down the pipe.
> What's wrong with frame-to-frame coherence? Remember, if there is a camera
> change or cut, the application can simply discard all the visibility info it
> has and just draw everything, until it has vis information for the new
> camera position.

A couple of things.. originally I didn't think this was a big deal, but I later changed my mind... I think making assumptions is bad, and I definitely think that a consistent framerate is more important than a high instantaneous one. Nothing's more annoying than jumping through a portal in a game and getting dropped frames for a few frames before it gets everything sorted out and cached and gets back up to 60fps (or whatever the target is). Making the granularity of rejection sub-frame should help avoid this... Also, when you're using an in-engine cinematic approach, it's really annoying when you get a dropped frame every time the camera cuts.

> No no no. There is no way you could get the _hardware_ to reject state
> change info and texture downloads because of some internally fed-back state.
> Drivers rely very heavily on persistent state, i.e. not having to send state
> that doesn't change. If the driver now doesn't know whether the state change
> info it sent actually made it to the chip's registers or not, that's just
> madness - the driver will go potty trying to figure out what it does and
> doesn't need to send over the bus. Ditto for downloading textures. Since the
> driver can't know whether the hardware is going to reject the tris or not,
> it always has to do the texture downloads anyway. And if the hardware is
> doing the downloads instead (e.g. AGP or cached-AGP textures), then either
> solution works - the fast-Z-rejection of pixels means those textures never
> get fetched.

Ouch.. hadn't thought much about the driver related issues.
However, *if* state was constant across a primitive, it's not a problem. That would be a big issue, but I don't think it's insurmountable.

So, maybe I'm foggy on some details.. but I still think early rejection in rasterization pipes is a *good thing*tm :).
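For reference, the break-even argument from earlier in this message (a rejection wins while the refetch gap is shorter than the work it skipped) can be put into numbers with a back-of-the-envelope sketch. The function names are hypothetical, and the figures are only the rough ones quoted above: about 4 pixels per cycle, triangles of 4-8x that, arrays of 10 tris.

```python
def cycles_saved(tris, pixels_per_tri, pixels_per_cycle=4):
    """Rasterization cycles the rejected array would have taken."""
    return tris * pixels_per_tri / pixels_per_cycle


def rejection_wins(gap_cycles, tris, pixels_per_tri, pixels_per_cycle=4):
    """True if the bubble after a rejection is shorter than the work skipped."""
    return gap_cycles < cycles_saved(tris, pixels_per_tri, pixels_per_cycle)


# A 10-tri array with 16 to 32 pixel triangles skips roughly 40 to 80 cycles.
small = cycles_saved(10, 16)
large = cycles_saved(10, 32)
```

With those figures, a gap of a few tens of cycles still comes out ahead, while a 200 cycle DMA redirect would not for an array that small.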