From: Keith W. <ke...@vm...> - 2010-03-12 13:41:01
On Fri, 2010-03-12 at 02:54 -0800, Corbin Simpson wrote:
> Module: Mesa
> Branch: master
> Commit: 50876ddaaff72a324ac45e255985e0f84e108594
> URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=50876ddaaff72a324ac45e255985e0f84e108594
>
> Author: Corbin Simpson <Mos...@gm...>
> Date:   Fri Mar 12 02:51:40 2010 -0800
>
> st/mesa: Always recalculate invalid index bounds.
>
> These should always be sanitized before heading towards the pipe driver,
> and if the calling function explicitly marked them as invalid, we need
> to regenerate them.
>
> Allows r300g to properly pass a bit more of Wine's d3d9 testing without
> dropping stuff on the floor.
>
> ---
>
>  src/mesa/state_tracker/st_draw.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
> index 8b01272..d81b361 100644
> --- a/src/mesa/state_tracker/st_draw.c
> +++ b/src/mesa/state_tracker/st_draw.c
> @@ -542,9 +542,9 @@ st_draw_vbo(GLcontext *ctx,
>     assert(ctx->NewState == 0x0);
>
>     /* Gallium probably doesn't want this in some cases. */
> -   if (!index_bounds_valid)
> -      if (!vbo_all_varyings_in_vbos(arrays))
> -         vbo_get_minmax_index(ctx, prims, ib, &min_index, &max_index);
> +   if (index_bounds_valid != GL_TRUE) {
> +      vbo_get_minmax_index(ctx, prims, ib, &min_index, &max_index);
> +   }

Corbin,

This isn't really desirable. Ideally this range checking should be
pushed into the pipe driver, because it's an expensive operation that is
not necessary on a lot of hardware.

Specifically, vertex fetch hardware can often be set up with the min/max
permissible index bounds to avoid accessing vertex data outside of the
bound VBs. This can be achieved by examining the VBs, with min_index == 0,
max_index = vb.size / vb.stride.

The case where we need to calculate them internally is if some data is
not in a VB, meaning we can't guess what the legal min/max values are.
Also note that we need that min/max information to be able to upload the
data to a VB.

So, I think the code was probably correct in the original version -
defer the minmax scan to the hardware driver, which may or may not need
it.

But maybe there is a better way to let the driver know that we are not
providing this information.

Keith
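To make the split Keith describes concrete, here is a minimal sketch
(hypothetical code, not taken from Mesa; the function name, the
all_arrays_in_vbos flag and the other parameters are invented for
illustration) of a draw path that derives the bounds from the vertex
buffer when everything lives in real VBOs and only scans the index list
when user arrays force it to:

   /* Hypothetical sketch, not actual Mesa code.  vb_size is the number of
    * bytes available after buffer_offset; indices are assumed 32-bit here,
    * a real helper would handle 8/16-bit index buffers as well. */
   static void
   resolve_index_bounds(int index_bounds_valid, int all_arrays_in_vbos,
                        unsigned vb_size, unsigned vb_stride,
                        const unsigned *indices, unsigned num_indices,
                        unsigned *min_index, unsigned *max_index)
   {
      unsigned i;

      if (index_bounds_valid)
         return;                    /* caller already supplied real bounds */

      if (all_arrays_in_vbos && vb_stride) {
         /* Everything is in real buffer objects: the permissible range can
          * be taken from the buffer itself, no index scan required. */
         *min_index = 0;
         *max_index = vb_size / vb_stride;
         return;
      }

      /* Some arrays are plain user memory: only a scan of the index list
       * can tell us which vertices are referenced (and need uploading). */
      *min_index = ~0u;
      *max_index = 0;
      for (i = 0; i < num_indices; i++) {
         if (indices[i] < *min_index) *min_index = indices[i];
         if (indices[i] > *max_index) *max_index = indices[i];
      }
   }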
From: Corbin S. <mos...@gm...> - 2010-03-13 02:20:16
On Fri, Mar 12, 2010 at 5:20 PM, Jose Fonseca <jfo...@vm...> wrote:
> Because if you have a huge vertex buffer and you only draw a few indices
> you may choose to upload to VRAM only the vertices actually referenced.
> Applications do this. And for certain hardware uploads are very slow, so
> it is a worthwhile optimization.

Sure, not saying that it's not.

> Efficiency is just as important a goal as principles like "the state
> tracker should sanitize", or more so. There is hardware that can handle
> buffers with out-of-bounds indices without crashing. The APIs we expose
> also make the same promise. In such a situation the shortcomings of one
> piece of hardware should not be a tax on all the others.

Oh, it's not a hardware limitation. If the vert and elt indices aren't
clamped, then an attacker could theoretically retrieve bits of VRAM/GTT he
shouldn't be allowed to peek at. In order to prevent this, the radeon
kernel module requires that I set those clamps, and that they be within
the bounds of the various buffers being used for rendering. I wouldn't
call it a shortcoming so much as a security consideration.

I've pushed a revert of the original patch, and an r300g patch that, while
not perfect, covers the common case that Wine hits.

~ C.

--
Only fools are easily impressed by what is only barely beyond their
reach. ~ Unknown

Corbin Simpson
<Mos...@gm...>
From: Roland S. <sr...@vm...> - 2010-03-13 02:53:21
On 13.03.2010 03:20, Corbin Simpson wrote:
> I've pushed a revert of the original patch, and an r300g patch that,
> while not perfect, covers the common case that Wine hits.

I think I don't really understand that (and the fix you did for r300g).
Why can't you simply clamp the maxIndex to the minimum of the submitted
maxIndex and the vertex buffer max index?

Now you have this:
maxIndex = MIN3(maxIndex, r300->vertex_buffer_max_index, count - minIndex);

This is then used to set the hardware max index clamp. However, for
example count could be 3, min index 0, but the actual vertices fetched
0, 15, 30 - as long as the vertex buffers are large enough this is
perfectly legal, but as far as I can tell your patch would force the
hardware to just fetch the 0th vertex (3 times). Count really tells you
nothing at all about the index range (it would also be legal to have a
huge count but a very small valid index range if you fetch the same
vertices repeatedly).

Roland
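A worked version of Roland's example (hypothetical code, not the actual
r300g change; MIN2/MIN3 are written out so the snippet stands alone) makes
the problem with folding count into the clamp explicit:

   #define MIN2(a, b)    ((a) < (b) ? (a) : (b))
   #define MIN3(a, b, c) MIN2(MIN2(a, b), (c))

   /* Roland's scenario: count = 3, minIndex = 0, the indices actually
    * drawn are {0, 15, 30}, and the vertex buffers hold plenty of
    * vertices (vertex_buffer_max_index = 1000).  Illustrative only. */

   static unsigned
   clamp_with_count(unsigned maxIndex, unsigned vb_max_index,
                    unsigned count, unsigned minIndex)
   {
      /* The clamp in the patch under discussion: count enters the formula,
       * so MIN3(30, 1000, 3 - 0) == 3.  The hardware is told index 3 is
       * the largest legal value, so the fetches of indices 15 and 30 fall
       * outside the clamp and a perfectly valid draw misrenders. */
      return MIN3(maxIndex, vb_max_index, count - minIndex);
   }

   static unsigned
   clamp_without_count(unsigned maxIndex, unsigned vb_max_index)
   {
      /* Only buffer-derived limits belong in the clamp:
       * MIN2(30, 1000) == 30 and the draw is unaffected. */
      return MIN2(maxIndex, vb_max_index);
   }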
From: Corbin S. <mos...@gm...> - 2010-03-13 12:25:35
On Fri, Mar 12, 2010 at 6:53 PM, Roland Scheidegger <sr...@vm...> wrote:
> On 13.03.2010 03:20, Corbin Simpson wrote:
>> I've pushed a revert of the original patch, and an r300g patch that,
>> while not perfect, covers the common case that Wine hits.
>
> I think I don't really understand that (and the fix you did for r300g).
> Why can't you simply clamp the maxIndex to the minimum of the submitted
> maxIndex and the vertex buffer max index?
>
> Now you have this:
> maxIndex = MIN3(maxIndex, r300->vertex_buffer_max_index, count - minIndex);
>
> This is then used to set the hardware max index clamp. However, for
> example count could be 3, min index 0, but the actual vertices fetched
> 0, 15, 30 - as long as the vertex buffers are large enough this is
> perfectly legal, but as far as I can tell your patch would force the
> hardware to just fetch the 0th vertex (3 times). Count really tells you
> nothing at all about the index range (it would also be legal to have a
> huge count but a very small valid index range if you fetch the same
> vertices repeatedly).

That's why I said that it's not perfect. :3

r300->vertex_buffer_max_index is, on this particular rendering path,
*also* bogus, which is too bad because it should be used instead of
maxIndex/minIndex here. But, as I said before, if it's too big then it
can't be used, and we need to err on the side of caution if we err at
all. One misrendered draw call is better than dropping the entire packet
of commands on the floor.

--
Only fools are easily impressed by what is only barely beyond their
reach. ~ Unknown

Corbin Simpson
<Mos...@gm...>
From: Luca B. <luc...@gm...> - 2010-03-13 04:06:52
Isn't it possible to compute the maximum legal index by just taking the
minimum of:

   (vb->buffer->size - vb->buffer_offset - ve->src_offset) / vb->stride

over all vertex buffers/elements?

Isn't the kernel checker doing something like this?
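As a sketch of the computation Luca is proposing (hypothetical helper,
not an existing Gallium function; the per-element sizes and offsets are
passed in as plain arrays rather than read from the pipe structs):

   /* Smallest buffer-derived bound over all vertex elements: element i
    * reads from a buffer with buf_size[i] bytes, starting at
    * buf_offset[i] + src_offset[i], advancing stride[i] bytes per vertex.
    * Illustrative only. */
   static unsigned
   max_legal_index(const unsigned *buf_size, const unsigned *buf_offset,
                   const unsigned *src_offset, const unsigned *stride,
                   unsigned num_elements)
   {
      unsigned max_index = ~0u;
      unsigned i, count;

      for (i = 0; i < num_elements; i++) {
         if (!stride[i])
            continue;               /* stride 0 imposes no bound */

         count = (buf_size[i] - buf_offset[i] - src_offset[i]) / stride[i];
         if (count == 0)
            return 0;               /* nothing fetchable at all */
         if (count - 1 < max_index)
            max_index = count - 1;
      }
      return max_index;
   }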
From: Luca B. <luc...@gm...> - 2010-03-13 04:33:44
Actually, why is the state tracker doing the min/max computation at all?

If the driver does the index lookup itself, as opposed to using a
hardware index buffer (e.g. the nouveau drivers do this in some cases),
this is unnecessary and slow.

Would completely removing the call to vbo_get_minmax_index break anything?

Also, how about removing the max_index field in pipe_vertex_buffer?
This seems to be set to the same value for all vertex buffers, and the
value is then passed to draw_range_elements too.
Isn't the value passed to draw_range_elements sufficient?
From: Keith W. <kei...@go...> - 2010-03-13 09:22:25
On Sat, Mar 13, 2010 at 4:33 AM, Luca Barbieri <luc...@gm...> wrote:
> Actually, why is the state tracker doing the min/max computation at all?
>
> If the driver does the index lookup itself, as opposed to using a
> hardware index buffer (e.g. the nouveau drivers do this in some cases),
> this is unnecessary and slow.
>
> Would completely removing the call to vbo_get_minmax_index break anything?
>
> Also, how about removing the max_index field in pipe_vertex_buffer?
> This seems to be set to the same value for all vertex buffers, and the
> value is then passed to draw_range_elements too.
> Isn't the value passed to draw_range_elements sufficient?

It's really only needed for the special (but common) case where the
vertex data isn't in a GL VBO at all, but just sitting in regular memory.

The state tracker currently has the task of wrapping those regions of
memory in user_buffers, and needs this computation to figure out what
memory to wrap in that way.

User buffers are a useful concept for avoiding a data copy, and can be
implemented in one of three ways in the driver:

a) for drivers doing swtnl, just access the wrapped memory directly

b) for drivers with hwtnl but unsophisticated memory managers, upload the
   userbuffer contents to a real buffer, then use that.

c) for drivers with sophisticated memory managers, instruct the kernel to
   allow the hardware to access these pages directly and do whatever
   memory management tricks are necessary to avoid the contents being
   clobbered by subsequent CPU accesses.

It's a difficult trick to avoid extra data copies in a layered stack like
gallium, and user_buffers are an attempt to avoid one source of them.

But for any such technique, the mesa state tracker will need to figure
out what memory is being referred to by those non-VBO vertex buffers, and
to do that requires knowing the index min/max values.

Keith
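A small illustration of why the wrapping step needs max_index
(hypothetical helper, not st/mesa code; the array is described by a
stride and an element size as in the gl*Pointer calls):

   /* Bytes of user memory the state tracker has to wrap (or upload) for
    * one non-VBO array: vertices 0..max_index are addressable, so the
    * last one starts at max_index * stride and occupies element_size
    * bytes.  Without max_index the extent of the array is unknown. */
   static unsigned
   user_array_bytes_needed(unsigned max_index, unsigned stride,
                           unsigned element_size)
   {
      return max_index * stride + element_size;
   }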
From: Luca B. <luc...@gm...> - 2010-03-13 11:41:07
> But for any such technique, the mesa state tracker will need to figure
> out what memory is being referred to by those non-VBO vertex buffers,
> and to do that requires knowing the index min/max values.

Isn't the min/max value only required to compute a sensible value for
the maximum user buffer length? (The base pointer is passed to
gl*Pointer.)

The fact is that we don't need to know how large the user buffer is
if the CPU is accessing it (or if we have a very advanced driver that
faults memory in the GPU VM on demand, and/or a mechanism to let the
GPU share the process address space).

As you said, this happens for instance with swtnl, but also with
drivers that scan the index buffer and copy the referenced vertex for
each index onto the GPU FIFO themselves (e.g. nv50 and experimental
versions of nv30/nv40).

So couldn't we pass ~0 or similar as the user buffer length, and have
the driver use an auxiliary module on draw calls to determine the real
length, if necessary?

Of course, drivers that upload user buffers on creation (if any exist)
would need to be changed to only do that on draw calls.
From: Keith W. <kei...@go...> - 2010-03-13 12:23:52
On Sat, Mar 13, 2010 at 11:40 AM, Luca Barbieri <luc...@gm...> wrote:
>> But for any such technique, the mesa state tracker will need to figure
>> out what memory is being referred to by those non-VBO vertex buffers,
>> and to do that requires knowing the index min/max values.
>
> Isn't the min/max value only required to compute a sensible value for
> the maximum user buffer length? (The base pointer is passed to
> gl*Pointer.)

Yes, I think that's what I was trying to say.

> The fact is that we don't need to know how large the user buffer is
> if the CPU is accessing it (or if we have a very advanced driver that
> faults memory in the GPU VM on demand, and/or a mechanism to let the
> GPU share the process address space).

Even for software t&l, it's pretty important, see below.

> As you said, this happens for instance with swtnl, but also with
> drivers that scan the index buffer and copy the referenced vertex for
> each index onto the GPU FIFO themselves (e.g. nv50 and experimental
> versions of nv30/nv40).

Having user-buffers with undefined size establishes a connection inside
the driver between two things which could previously be fully understood
separately - the vertex buffer is now no longer fully defined without
reference to an index buffer. Effectively the user buffers become just
unqualified pointers and we are back to the GL world pre-VBOs.

In your examples, scanning and uploading (or transforming) per-index is
only something which is sensible in special cases - eg where there are
very few indices or sparse access to a large vertex buffer that hasn't
already been uploaded/transformed. But you can't even know if the vertex
buffer is sparse until you know how big it is, i.e. what max_index is...

Typical usage is the opposite - vertices are referenced more than once in
a mesh and the efficient thing to do is:

- for software tnl, transform all the vertices in the vertex buffer
  (requires knowing max_index) and then apply the indices to the
  transformed vertices

- for hardware tnl, upload the vertex buffer in its entirety (or better
  still, the referenced subrange).

> So couldn't we pass ~0 or similar as the user buffer length, and have
> the driver use an auxiliary module on draw calls to determine the real
> length, if necessary?
> Of course, drivers that upload user buffers on creation (if any exist)
> would need to be changed to only do that on draw calls.

My feeling is that user buffers are a pretty significant concession to
legacy GL vertex arrays in the interface. If there were any change to
them, it would be more along the lines of getting rid of them entirely
and forcing the state trackers to use proper buffers for everything.
They are a nice way to accommodate old-style GL arrays, but they don't
have many other uses and are already on shaky ground semantically.

In summary, I'm not 100% comfortable with the current userbuffer concept,
and I'd be *really* uncomfortable with the idea of buffers whose size we
don't know or which could change size when a new index buffer is bound...

Keith
From: Luca B. <luc...@gm...> - 2010-03-13 13:32:55
> Having user-buffers with undefined size establishes a connection
> inside the driver between two things which could previously be fully
> understood separately - the vertex buffer is now no longer fully
> defined without reference to an index buffer. Effectively the user
> buffers become just unqualified pointers and we are back to the GL
> world pre-VBOs.

Yes, indeed.

> In your examples, scanning and uploading (or transforming) per-index
> is only something which is sensible in special cases - eg where there
> are very few indices or sparse access to a large vertex buffer that
> hasn't already been uploaded/transformed. But you can't even know if
> the vertex buffer is sparse until you know how big it is, i.e. what
> max_index is...

Reconsidering it, it does indeed seem to be sensible to always scan the
index buffer for min/max if any elements are in user buffers, since if
we discover it is dense, we should upload, and otherwise do the index
lookup in software (or perhaps remap the indices, but this isn't
necessarily a good idea).

And if we are uploading buffers, we must always do the scan to avoid
generating segfaults due to out-of-bound reads.
Hardware without draw_elements support could do without the scan, but
I think all Gallium-capable hardware supports it.

So it seems the only case where we don't necessarily need it is for
swtnl, and here the performance loss due to scanning is probably
insignificant compared to the cost of actually transforming vertices.

So, yes, I think you are right and the current solution is the best.

However, I still have doubts about the semantics of max_index in
pipe_vertex_buffer.
Isn't it better to _always_ set it to a valid value, even if it is
just (vb->buffer->size - vb->buffer_offset) / vb->stride ?

It seems this would solve Corbin's problem and make a better interface,
following your principle of having well-defined vertex buffers.
The only cost is doing up to 8/16 divisions, but the driver may need
to do them anyway.

Perhaps we could amortize this by only creating/setting the
pipe_vertex_buffers/elements on VBO/array change instead of on every
draw call, as we seem to be doing now, in cases where it is possible.

An alternative option could be to remove it, and have the driver do
the computation itself instead (and use draw_range_elements to pass a
user-specified or scanned max value).
From: Keith W. <kei...@go...> - 2010-03-13 14:30:42
On Sat, Mar 13, 2010 at 1:30 PM, Luca Barbieri <luc...@gm...> wrote:
> However, I still have doubts about the semantics of max_index in
> pipe_vertex_buffer.
> Isn't it better to _always_ set it to a valid value, even if it is
> just (vb->buffer->size - vb->buffer_offset) / vb->stride ?

Yes, indeed. Sorry, I must have missed that point in the earlier
emails. I would have thought that was what it was *always* set to
(and thus perhaps redundant).

> It seems this would solve Corbin's problem and make a better interface,
> following your principle of having well-defined vertex buffers.
> The only cost is doing up to 8/16 divisions, but the driver may need
> to do them anyway.
>
> Perhaps we could amortize this by only creating/setting the
> pipe_vertex_buffers/elements on VBO/array change instead of on every
> draw call, as we seem to be doing now, in cases where it is possible.

Indeed, that would be an improvement on the current situation.

> An alternative option could be to remove it, and have the driver do
> the computation itself instead (and use draw_range_elements to pass a
> user-specified or scanned max value).

I think I prefer the first approach, to try and reduce rework everywhere.

Keith
From: Luca B. <luc...@gm...> - 2010-03-13 14:47:13
>> However, I still have doubts about the semantics of max_index in
>> pipe_vertex_buffer.
>> Isn't it better to _always_ set it to a valid value, even if it is
>> just (vb->buffer->size - vb->buffer_offset) / vb->stride ?
>
> Yes, indeed. Sorry, I must have missed that point in the earlier
> emails. I would have thought that was what it was *always* set to
> (and thus perhaps redundant).

This doesn't seem to be the case, and if I understand correctly, it is
the source of the issues Corbin is facing.

Quoting Corbin:
> - I can't just use the vertex buffer sizes, because they're set to ~0

I assume he was referring to max_index.
From: Keith W. <ke...@vm...> - 2010-03-12 13:51:07
Resending to the right address.

Keith
From: Corbin S. <mos...@gm...> - 2010-03-13 00:06:15
This seems to be at odds with the idea that the pipe driver receives
pre-sanitized inputs.

If this patch is reverted:
- I can't trust the min and max elts, because they're set to ~0
- I can't just use the vertex buffer sizes, because they're set to ~0

So I have to do the maths myself, guessing just like st/mesa guesses.
Maybe this is specific to Radeons, but I *always* need those values.

I don't mind this, but IMO it *must* be doc'd: "The minimum and maximum
index limits passed to draw_elements and draw_range_elements may be
invalid." And really, at that point, why are we even passing them? Seems
absurd to me. :3

Posting from a mobile, pardon my terseness. ~ C.
From: Brian P. <bri...@gm...> - 2010-03-13 00:17:44
The overriding goal is to avoid the expense of computing this in software
when the hardware can do the bounds check.

I guess we could add a PIPE_CAP query to ask the driver if it can do
bounds checking and, if it can't, do it in the state tracker. But I think
Keith's interested in minimizing queries like this. Hence, try to do it
in the driver. I think it should be easy to write a re-usable utility
function to do the checks.

If you're seeing ~0, that means the Mesa VBO module has not computed the
bounds, and/or the user didn't provide them. I'd have to check the code.
If vertex data is coming from regular memory and not a VBO we have no
idea what the legal bounds are.

-Brian
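A sketch of the kind of re-usable utility Brian mentions (hypothetical
code, not an existing util_* helper; a real version would map the index
buffer before reading it):

   /* Scan an index array once and report the smallest and largest index
    * it references.  index_size is in bytes (1, 2 or 4). */
   static void
   scan_index_bounds(const void *indices, unsigned count,
                     unsigned index_size,
                     unsigned *min_index, unsigned *max_index)
   {
      unsigned lo = ~0u, hi = 0, i, idx;

      for (i = 0; i < count; i++) {
         switch (index_size) {
         case 1:  idx = ((const unsigned char *)indices)[i];  break;
         case 2:  idx = ((const unsigned short *)indices)[i]; break;
         default: idx = ((const unsigned int *)indices)[i];   break;
         }
         if (idx < lo) lo = idx;
         if (idx > hi) hi = idx;
      }
      *min_index = lo;
      *max_index = hi;
   }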
From: Jose F. <jfo...@vm...> - 2010-03-13 01:20:17
What's the point of min/max ranges? Because if you have a huge vertex
buffer and you only draw a few indices you may choose to upload to VRAM
only the vertices actually referenced. Applications do this. And for
certain hardware uploads are very slow, so it is a worthwhile
optimization.

Efficiency is just as important a goal as principles like "the state
tracker should sanitize", or more so. There is hardware that can handle
buffers with out-of-bounds indices without crashing. The APIs we expose
also make the same promise. In such a situation the shortcomings of one
piece of hardware should not be a tax on all the others.

Furthermore, if you need to scan/clamp the indices in an index buffer
then it might be worthwhile to do it in cached memory, and copy to
uncached memory in the process.

Often it is necessary to do software vertex processing fallbacks, so even
for hardware that cannot do it there might be draw calls where it's not
necessary for the state tracker to do it.

IMO, this entails too much hardware knowledge to be done both generally
and efficiently by the state tracker. The "state tracker should sanitize"
principle derives from our desire to share code as much as possible. But
there is more than one way to share code. In this case it should be a
pipe driver auxiliary library.

Jose
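A small sketch of the scan-while-copying idea Jose raises (hypothetical
code, not taken from any driver; 16-bit indices only for brevity):

   /* Read indices from ordinary cacheable memory, clamp any that exceed
    * max_index, and write the sanitized copy into the (typically
    * uncached, write-combined) buffer the hardware will actually read,
    * so the scan and the copy cost a single pass. */
   static void
   copy_and_clamp_indices_u16(unsigned short *dst_uncached,
                              const unsigned short *src_cached,
                              unsigned count, unsigned short max_index)
   {
      unsigned i;

      for (i = 0; i < count; i++) {
         unsigned short idx = src_cached[i];
         dst_uncached[i] = (idx > max_index) ? max_index : idx;
      }
   }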