Re: tests/blendminmax fails on r200

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

Ian Romanick wrote:
>> Here's a patch. That << 8 is a bit ugly, though. Should I add the 
>> unshifted blend func values to r200_reg.h, together with some 
>> proper defined shifts? I like the blend func value selection better
>>  than before (where there was a huge function where the second half
>>  is (almost) identical to the first). Fixes blendminmax indeed...
> 
> 
> The shift is a bit ugly, but other than that, the patch looks like 
> what I would have written. :)
The patch cannot deny its copy-merge heritage ;-)

> My suggestion would be to add 
> SRC_BLEND_FACT() and DST_BLEND_FACT() macros like i830 has.
ok, I'll rip out the old blend blend values in the r200_reg.h file, add
unshifted ones and use BLEND_FACT macros instead (though, I HATE macros).

> Also, if the assertion that the RGB and alpha blend equations are the
>  same is left in, we should probably add assertions that the RGB and 
> alpha blend functions are the same as well.
Yes saw that too, but since I wanted to add support for the separate
functions I didn't bother to change it (and also, I think mesa
guarantees it anyway, so the assertion could go away).

> This would also be a good time to discuss adding support for 
> EXT_blend_func_separarte and EXT_blend_equation_separate.  If 
> separate blend equations or functions are used on R200, the 
> R200_RB3D_ABLENDCNTL and R200_RB3D_CBLENDCNTL are used instead of 
> R200_RB3D_BLENDCNTL.
Yes, I wanted to do that. In fact, I have written the code for it some
months ago, the only reason it's not finished is because the odd blend
behaviour, I didn't want to add additional blend stuff unless the old
(simpler to debug) code wasn't fixed (and I also couldn't test it).

> There are two issues.  How is the extra blend information 
> communicated to the DRM?  How do we handle the fact that 
> R200_RB3D_BLENDCNTL lives in a state atom with a lot of other stuff 
> and may, accidentally, overwrite the separate state?
> 
> For the first issue, I see two possible solutions.  Both require 
> changes to the Radeon DRM.  One way is to add a new "blend" state 
> atom.  This atom would include both R200_RB3D_ABLENDCNTL and 
> R200_RB3D_CBLENDCNTL. It would only be emitted if 
> R200_RB3D_ABLENDCNTL != R200_RB3D_CBLENDCNTL.
Not only these two, but also R200_RB3D_BLENDCOLOR (not yet in
r200_reg.h, but Keith has revealed the location of it - it's directly
before these two registers, and I have tested that it works). I'm not
sure though if it's a good idea to only emit it if the blend registers
are identical, it could simply always be emitted (I think there's no
performance difference for the blend operation if it's separate or not).

> The problem is that every time the ctx atom is emitted, the blend
> atom also must be emitted.  This could also cause problems with
> context switching.  I supposed this could be as easy as just
> modifying r200EmitState to always emit blend after anytime ctx is
> emitted.  That seems a bit ugly, though.
Yes, especially since ctx is emitted really a lot. Nonetheless, this is 
what I've used initially.

> The other way is to just extend the ctx atom with the 
> R200_RB3D_ABLENDCNTL register.  If R200_RB3D_ABLENDCNTL != 
> R200_RB3D_CBLENDCNTL the DRM would write to those registers, 
> otherwise it would use the R200_RB3D_BLENDCNTL.  This should be easy.
>  The cmd_size would change depending on the DRM version, so dealing 
> with binary compatibility in that direction is easy.  I'm not so sure
>  about the other direction.  In looking at how the cmdpacket
> mechanism works, I'm not sure this exact scheme would work, but some
> variation of it should.
> 
> Thoughts?
Ah, that's an interesting idea. I've first thought this is not possible
due to the macro-ish nature of the state atoms, but it actually seems 
easy to do (just don't use the fixed CTX_STATE_SIZE, doh!). Backward 
compatiblity in this direction might not be absolutely required though, 
but the other direction (new kernel module, old dri) is a must, 
otherwise Linus will not be happy...
I'll look into it.

> It could be that it is supposed to work the way would expect it, but 
> there are hardware bugs.  Who knows.  The only bad part of leaving 
> the blend enabled (but set to a no-op mode) is that the 
> read-modify-write access to the framebuffer may be slower.  It would 
> be interesting to test.  In fact a test like that would be useful for
>  other cards too.  If there is a performance difference, it may be 
> worth it to detect the no-op blend modes and disable blending. Hmm...
I've written a patch which just sets back the blend equation to func_add 
(and also set back the blend function, just to be sure) whenever 
blending is disabled in addition to disable blending, and also checks 
when the blend_func/equation is changed if blending is currently enabled 
(otherwise do nothing). Blend equation/function thus is only set if 
blending gets enabled, or if the equation/function is changed while it's 
enabled. Seems to work, if there are no objections I'll commit it (after 
some cleanups) tomorrow.

Then I can finally work on the blend_separate extensions ;-).

Roland