From: Luca B. <lu...@lu...> - 2010-09-14 08:38:53
|
Currently, there are several functions where static dispatch has been disabled, but that are exported by the ATI lib, the nVidia lib, or both. To prevent compatibility issues, it seems a good idea to export those too, at least. What do you think? Should we export all functions exported by both nVidia and ATI, those exported by either of them, or even just export all functions?

static_dispatch="false" but exported by both ATI and nVidia:
glBlendEquationSeparateEXT
glBlitFramebufferEXT
glGetQueryObjecti64vEXT
glGetQueryObjectui64vEXT
glProgramEnvParameters4fvEXT
glProgramLocalParameters4fvEXT

static_dispatch="false" but exported by ATI, not by nVidia:
glGetHistogramEXT
glGetHistogramParameterfvEXT
glGetHistogramParameterivEXT
glGetMinmaxEXT
glGetMinmaxParameterfvEXT
glGetMinmaxParameterivEXT
glGetTexParameterPointervAPPLE
glHistogramEXT
glMinmaxEXT
glResetHistogramEXT
glResetMinmaxEXT
glStencilFuncSeparateATI
glStencilOpSeparateATI
glTextureRangeAPPLE

static_dispatch="false" but exported by nVidia, not by ATI:
glActiveStencilFaceEXT
glColorSubTableEXT
glDeleteFencesNV
glDepthBoundsEXT
glFinishFenceNV
glGenFencesNV
glGetFenceivNV
glIsFenceNV
glSetFenceNV
glTestFenceNV |
From: José F. <jfo...@vm...> - 2010-09-07 08:01:08
|
On Mon, 2010-09-06 at 16:31 -0700, Marek Olšák wrote: > On Mon, Sep 6, 2010 at 9:57 PM, José Fonseca <jfo...@vm...> > wrote: > > On Mon, 2010-09-06 at 10:22 -0700, Marek Olšák wrote: > > On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca > <jfo...@vm...> > > wrote: > > I'd like to know if there's any objection to change > the > > resource_copy_region semantics to allow copies > between > > different yet > > compatible formats, where the definition of > compatible formats > > is: > > > > "formats for which copying the bytes from the > source resource > > unmodified to the destination resource will achieve > the same > > effect of a > > textured quad blitter" > > > > There is an helper function > util_is_format_compatible() to > > help making > > this decision, and these are the non-trivial > conversions that > > this > > function currently recognizes, (which was produced > by > > u_format_compatible_test.c): > > > > b8g8r8a8_unorm -> b8g8r8x8_unorm > > > > This specific case (and others) might not work, because > there are no > > 0/1 swizzles when blending pixels with the framebuffer, e.g. > see this > > sequence of operations: > > - Blit from b8g8r8a8 to b8g8r8x8. > > - x8 now contains a8. > > - Bind b8g8r8x8 as a colorbuffer. > > - Use blending with the destination alpha channel. > > - The original a8 is read instead of 1 (x8) because of lack > of > > swizzles. > > > This is not correct. Or at least not my interpretation. > > The x in b8g8r8x8 means padding (potentially with with > unitialized > data). There is no implicit guarantee that it will contain > 0xff or > anything. > > When blending to b8g8r8x8, destination alpha is by definition > 1.0. It is > an implicit swizzle (see e.g., u_format.csv). > > If the hardware's fixed function blending doesn't understand > bgrx > formats natively, then the pipe driver should internally > replace the > destination alpha factor factor with one. It's really simple. 
> See for > > The dst blending parameter is just a factor the real dst value is > multiplied by (except for min/max). There is no way to multiply an > arbitrary value by a constant and get 1.0. But you can force 0, of > course. I don't think there is hardware which supports such flexible > swizzling in the blender. Let's assume your hardware doesn't understand bgrx render target formats natively, and you program it with the bgra format instead. If so, then you must do these replacements in rgb_src_factor and rgb_dst_factor:
PIPE_BLENDFACTOR_DST_ALPHA -> PIPE_BLENDFACTOR_ONE
PIPE_BLENDFACTOR_INV_DST_ALPHA -> PIPE_BLENDFACTOR_ZERO
PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE -> PIPE_BLENDFACTOR_ZERO
This will ensure that what's written in the red, green, and blue components is consistent with a bgrx format (that is, destination alpha is always one -- incoming values are discarded). In this scenario, how you program alpha_src_factor/alpha_dst_factor is irrelevant, because they will only affect what's written in the padding bits, which is just padding -- it can and should be treated as gibberish. > If x8 is just padding as you say, the value of it should be undefined > and every operation using the padding bits should be undefined too > except for texture sampling. It's not like I have any other choice. IMO, there is no such thing as an "operation using the padding bits". It is more like the contents of the padding are undefined before/after any operation, and no operation should rely on them having any particular value, by definition. Alpha blending with a bgrx format should not (and need not) incorporate the padding bits in any computation. It may, however, write anything it feels like to the padding bits as a side effect. Now we could certainly impose the restriction in gallium that dst alpha blendfactors will produce undefined results for bgrx (and perhaps this is what you're arguing for). Then the burden of doing the replacements above shifts to the state tracker. 
I think Keith favors that stance. At any rate, going back to the original topic, I see no reason not to allow bgra -> bgrx resource_copy_region copies. Also, for the record, the moment arbitrary swizzles appeared in the texture sampler, bgrx formats became almost redundant. And I say almost because knowing that there is no alpha in the color buffer allows for certain optimizations (e.g., llvmpipe's swizzled layout separates the red, green, blue, and alpha channels into different 128-bit words, and will not read/write alpha or any channel that can't be represented in the final color buffer). Jose |
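The blend-factor replacement José describes can be sketched in a few lines of C. This is a minimal illustration, not Mesa code: the `FACTOR_*` constants and `fixup_rgb_factor_for_bgrx()` are local stand-ins for Gallium's `PIPE_BLENDFACTOR_*` values and whatever helper a driver would actually use.

```c
/* Local stand-ins for Gallium's PIPE_BLENDFACTOR_* values. */
enum blendfactor {
    FACTOR_ONE,
    FACTOR_ZERO,
    FACTOR_SRC_ALPHA,
    FACTOR_DST_ALPHA,
    FACTOR_INV_DST_ALPHA,
    FACTOR_SRC_ALPHA_SATURATE
};

/* For a bgrx surface programmed with a bgra hardware format,
 * destination alpha is 1.0 by definition, so every RGB factor that
 * reads dst alpha collapses to a constant:
 *   DST_ALPHA           -> 1
 *   INV_DST_ALPHA       -> 1 - 1 = 0
 *   SRC_ALPHA_SATURATE  =  min(src.a, 1 - dst.a) = min(src.a, 0) = 0
 */
static enum blendfactor
fixup_rgb_factor_for_bgrx(enum blendfactor f)
{
    switch (f) {
    case FACTOR_DST_ALPHA:          return FACTOR_ONE;
    case FACTOR_INV_DST_ALPHA:      return FACTOR_ZERO;
    case FACTOR_SRC_ALPHA_SATURATE: return FACTOR_ZERO;
    default:                        return f; /* others don't read dst alpha */
    }
}
```

The alpha factors need no fixup in this sketch, since (as noted above) they only affect what lands in the padding bits.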
From: Luca B. <lu...@lu...> - 2010-09-06 23:49:36
|
> The dst blending parameter is just a factor the real dst value is multiplied > by (except for min/max). There is no way to multiply an arbitrary value by a > constant and get 1.0. But you can force 0, of course. I don't think there is > hardware which supports such flexible swizzling in the blender. If x8 is > just padding as you say, the value of it should be undefined and every > operation using the padding bits should be undefined too except for texture > sampling. It's not like I have any other choice. As far as I can tell, the only problem with blending against an X8 channel that holds random garbage but has a "read value" of 1 arises when one of the blending factors is DST_ALPHA or INV_DST_ALPHA (or DST_COLOR as an alpha factor), in which case you can solve the issue by replacing the offending factor with ONE or ZERO, as long as you have support for separate RGB/A blend functions (which Gallium currently assumes, afaik). You can also disable the alpha channel in the writemask to avoid unnecessary work. On nv30/nv40, there is an actual render target format that instructs the card to read dst alpha as 1 (you can also choose whether to write 0 or 1). Of course, one could argue that mesa/st should do the transformation instead of the Gallium drivers where hardware lacks such support. I suppose just not advertising X8 formats as render target formats could also work. |
From: Marek O. <ma...@gm...> - 2010-09-06 23:32:31
|
On Mon, Sep 6, 2010 at 9:57 PM, José Fonseca <jfo...@vm...> wrote: > On Mon, 2010-09-06 at 10:22 -0700, Marek Olšák wrote: > > On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca <jfo...@vm...> > > wrote: > > I'd like to know if there's any objection to change the > > resource_copy_region semantics to allow copies between > > different yet > > compatible formats, where the definition of compatible formats > > is: > > > > "formats for which copying the bytes from the source resource > > unmodified to the destination resource will achieve the same > > effect of a > > textured quad blitter" > > > > There is an helper function util_is_format_compatible() to > > help making > > this decision, and these are the non-trivial conversions that > > this > > function currently recognizes, (which was produced by > > u_format_compatible_test.c): > > > > b8g8r8a8_unorm -> b8g8r8x8_unorm > > > > This specific case (and others) might not work, because there are no > > 0/1 swizzles when blending pixels with the framebuffer, e.g. see this > > sequence of operations: > > - Blit from b8g8r8a8 to b8g8r8x8. > > - x8 now contains a8. > > - Bind b8g8r8x8 as a colorbuffer. > > - Use blending with the destination alpha channel. > > - The original a8 is read instead of 1 (x8) because of lack of > > swizzles. > > This is not correct. Or at least not my interpretation. > > The x in b8g8r8x8 means padding (potentially with with unitialized > data). There is no implicit guarantee that it will contain 0xff or > anything. > > When blending to b8g8r8x8, destination alpha is by definition 1.0. It is > an implicit swizzle (see e.g., u_format.csv). > > If the hardware's fixed function blending doesn't understand bgrx > formats natively, then the pipe driver should internally replace the > destination alpha factor factor with one. It's really simple. See for > The dst blending parameter is just a factor the real dst value is multiplied by (except for min/max). 
There is no way to multiply an arbitrary value by a constant and get 1.0. But you can force 0, of course. I don't think there is hardware which supports such flexible swizzling in the blender. If x8 is just padding as you say, the value of it should be undefined and every operation using the padding bits should be undefined too except for texture sampling. It's not like I have any other choice. Marek |
From: Luca B. <lu...@lu...> - 2010-09-06 21:49:30
|
> When I said it won't work with decent hardware, I really meant it won't > work due to compression. Now, it's quite possible this can be disabled > on any chip, but you don't know that before hence you need to jump > through hoops to get an uncompressed version of your compressed buffer > later. Well, you can render to a compressed depth buffer and then bind it as a depth texture (routinely done for shadows), so there needs to be a way to get compressed data to the sampler, either directly or via the driver automagically converting it with a blit beforehand. Of course, this may not actually work for stencil too, or might not let you interpret depth as 8-bit color components, or perhaps not allow direct use as a render target, but it seems possible, especially on modern flexible hardware and on older, dumber hardware that lacks/doesn't force compression. I haven't checked any hardware docs though, beyond the fact that nvfx currently doesn't support any compression and thus can "just do it". |
From: Roland S. <sr...@vm...> - 2010-09-06 21:31:34
|
On 06.09.2010 22:03, Luca Barbieri wrote: >>> This way you could copy z24s8 to r8g8b8a8, for instance. > >> I am not sure this makes a lot of sense. There's no guarantee the bit >> layout of these is even remotely similar (and it likely won't be on any >> decent hardware). I think the dx10 restriction makes sense here. > > Yes, it depends on the flexibility of the hardware and the driver. > Due to depth textures, I think it is actually likely that you can > easily treat depth as color. > > The worst issue right now is that stencil cannot be accessed in a > sensible way at all, which makes implementing glBlitFramebuffer of > STENCIL_BIT with NEAREST and different rect sizes impossible. > Some cards (r600+ at least) can write stencil in shaders, but on some > you must reinterpret the surface. > And resource_copy_region does not support stretching, so it can't be used. > > Since not all cards can write stencil in shaders, one either needs to > be able to bind depth/stencil as a color buffer, or extend > resource_copy_region to support stretching with nearest filtering, or > both (possibly in addition to having the option of using stencil > export in shaders). Yes, accessing stencil is a problem - other APIs just disallow that... There are other problems with accessing stencil, like for instance WritePixels with a multisampled depth/stencil buffer (which you can't really map, hence CPU fallbacks don't even work). Plus you really don't want any CPU fallbacks anyway. Using stencil export (ARB_shader_stencil_export) seems like a clean solution, but as you said not all cards support it. Plus you can't actually get the stencil values with texture sampling either, so this doesn't help that much (well, you can't get them with GL, though hardware may support it, I guess). When I said it won't work with decent hardware, I really meant it won't work due to compression. Now, it's quite possible this can be disabled on any chip, but you don't know that beforehand, hence you need to jump through hoops to get an uncompressed version of your compressed buffer later. Do applications actually ever use BlitFramebuffer with the stencil bit (with different sizes, since otherwise resource_copy_region could be used)? It just seems to me that casts to completely different formats (well, still with the same total bit width, but still) are very unclean, but I don't have any good solution for this - if no one ever uses this in practice a CPU fallback is just fine, but as said it won't work for multisampled buffers, for instance, either. > > Other things would likely benefit, such as GL_NV_copy_depth_to_color. Roland |
From: Luca B. <lu...@lu...> - 2010-09-06 21:18:40
|
Yes, if x8 is interpreted as "writes can write arbitrary data, reads must return 1" (as you said), then this is not necessary in resource_copy_region even if A8 -> X8 becomes supported. You are right that format conversions would probably be better added as a separate function (if at all), in addition to the "reinterpret_cast" mechanism you proposed to add. |
From: José F. <jfo...@vm...> - 2010-09-06 21:09:34
|
On Mon, 2010-09-06 at 10:41 -0700, Luca Barbieri wrote: > How about dropping the idea that "resource_copy_region must be just a > memcpy" and have the driver instruct the hardware 2D blitter to write > 1s in the alpha channel if supported by hw or have u_blitter do this > in the shader? It's really different functionality. You're asking for a cast, as in b = (type)a; as in (int)1.0f = 1. Another thing is b = *(type *)&a; as in *(int *)&1.0f = 0x3f800000. This is my understanding of resource_copy_region (previously known as surface_copy). And Roland provided a compelling argument for that. Both of these functionalities are exposed by APIs, and neither is a superset of the other. Jose |
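José's two casts can be shown concretely. A minimal sketch, with hypothetical helper names; `memcpy` is used for the reinterpretation to stay clear of strict-aliasing issues that the `*(int *)&` spelling would raise:

```c
#include <stdint.h>
#include <string.h>

/* Value conversion -- what a format-converting (textured-quad) blit
 * does: (int)1.0f yields the integer 1. */
static int32_t convert_float_to_int(float f)
{
    return (int32_t)f;
}

/* Bit reinterpretation -- what resource_copy_region as a raw memcpy
 * does: the 32 bits of 1.0f read back as 0x3f800000. */
static uint32_t reinterpret_float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

The same input float gives two different answers, which is exactly why neither operation can substitute for the other.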
From: Luca B. <lu...@lu...> - 2010-09-06 20:03:16
|
>> This way you could copy z24s8 to r8g8b8a8, for instance. > I am not sure this makes a lot of sense. There's no guarantee the bit > layout of these is even remotely similar (and it likely won't be on any > decent hardware). I think the dx10 restriction makes sense here. Yes, it depends on the flexibility of the hardware and the driver. Due to depth textures, I think it is actually likely that you can easily treat depth as color. The worst issue right now is that stencil cannot be accessed in a sensible way at all, which makes implementing glBlitFramebuffer of STENCIL_BIT with NEAREST and different rect sizes impossible. Some cards (r600+ at least) can write stencil in shaders, but on some you must reinterpret the surface. And resource_copy_region does not support stretching, so it can't be used. Since not all cards can write stencil in shaders, one either needs to be able to bind depth/stencil as a color buffer, or extend resource_copy_region to support stretching with nearest filtering, or both (possibly in addition to having the option of using stencil export in shaders). Other things would likely benefit, such as GL_NV_copy_depth_to_color. |
From: José F. <jfo...@vm...> - 2010-09-06 20:00:24
|
On Mon, 2010-09-06 at 10:22 -0700, Marek Olšák wrote: > On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca <jfo...@vm...> > wrote: > I'd like to know if there's any objection to change the > resource_copy_region semantics to allow copies between > different yet > compatible formats, where the definition of compatible formats > is: > > "formats for which copying the bytes from the source resource > unmodified to the destination resource will achieve the same > effect of a > textured quad blitter" > > There is an helper function util_is_format_compatible() to > help making > this decision, and these are the non-trivial conversions that > this > function currently recognizes, (which was produced by > u_format_compatible_test.c): > > b8g8r8a8_unorm -> b8g8r8x8_unorm > > This specific case (and others) might not work, because there are no > 0/1 swizzles when blending pixels with the framebuffer, e.g. see this > sequence of operations: > - Blit from b8g8r8a8 to b8g8r8x8. > - x8 now contains a8. > - Bind b8g8r8x8 as a colorbuffer. > - Use blending with the destination alpha channel. > - The original a8 is read instead of 1 (x8) because of lack of > swizzles. This is not correct. Or at least not my interpretation. The x in b8g8r8x8 means padding (potentially with uninitialized data). There is no implicit guarantee that it will contain 0xff or anything. When blending to b8g8r8x8, destination alpha is by definition 1.0. It is an implicit swizzle (see, e.g., u_format.csv). If the hardware's fixed function blending doesn't understand bgrx formats natively, then the pipe driver should internally replace the destination alpha factor with one. It's really simple. See for example llvmpipe (which needs to do that because the swizzled tile format is always bgra, so it needs to ignore destination alpha when a bgrx surface is bound). I'm not sure what OpenGL defines, but DirectX/DCT definitely prescribes/enforces this behavior. 
> The blitter and other util functions just need to be extended to > explicitly write 1 instead of copying the alpha channel. Something > like this is already done in st/mesa, see the function > compatible_src_dst_formats. There is no alpha channel in b8g8r8x8 for anybody to write. The problem here is not what's written in the padding bits -- it is instead in making sure the padding bits are not interpreted as alpha. If the hardware *really* works better with 0xff in the padding bits, then that needs to be enforced not only in surface copy, but in transfers (i.e., when a transfer is unmapped, the pipe driver would need to fill the padding bits with 0xff for every pixel). Jose |
From: Roland S. <sr...@vm...> - 2010-09-06 18:34:28
|
On 06.09.2010 17:16, Luca Barbieri wrote: > On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca <jfo...@vm...> wrote: >> I'd like to know if there's any objection to change the >> resource_copy_region semantics to allow copies between different yet >> compatible formats, where the definition of compatible formats is: > > I was about to propose something like this. > > How about a much more powerful change though, that would make any pair > of non-blocked format of the same bit depth compatible? > This way you could copy z24s8 to r8g8b8a8, for instance. I am not sure this makes a lot of sense. There's no guarantee the bit layout of these is even remotely similar (and it likely won't be on any decent hardware). I think the dx10 restriction makes sense here. > > In addition to this, how about explicitly allowing sampler views to > use a compatible format, and add the ability for surfaces to use a > compatible format too? (with a new parameter to get_tex_surface) Note that get_tex_surface is dead (in gallium-array-textures - not merged yet but it will happen eventually). Its replacement (for render targets or depth stencil) create_surface(), already can be supplied with a format parameter. Compatible formats though should ultimately end up to something similar to dx10. > > This would allow for instance to implement glBlitFramebuffer on > stencil buffers by reinterpreting the buffer as r8g8b8a8, and allow > the blitter module to copy depth/stencil buffers by simply treating > them as color buffers. > > The only issue is that some drivers might hold depth/stencil surfaces > in compressed formats that cannot be interpreted as a color format, > and not have any mechanism for keeping temporaries or doing > conversions internally. I think that's a pretty big if. I could be wrong but I think operations like blitting stencil buffers are pretty rare anyway (afaik other apis don't allow things like that). > > DirectX seems to have something like this with the _TYPELESS formats. 
Yes, and it precisely won't allow you to interpret s24_z8 as r8g8b8a8 or other wonky stuff -- only formats where all components have the same number of bits. Roland |
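The d3d10 "typeless group" rule Roland describes reduces to a per-component bit-width comparison. The layouts below are illustrative stand-ins, not Gallium's real format descriptions:

```c
/* Per-channel bit counts for a format; 0 marks an absent channel.
 * These layouts are hand-written for illustration only. */
struct fmt_layout {
    int bits[4];
};

static const struct fmt_layout fmt_r8g8b8a8 = { { 8,  8, 8, 8 } };
static const struct fmt_layout fmt_b8g8r8a8 = { { 8,  8, 8, 8 } };
static const struct fmt_layout fmt_z24s8    = { { 24, 8, 0, 0 } };

/* Two formats share a "typeless group" only when every component
 * occupies the same number of bits in the same position -- so
 * r8g8b8a8 and b8g8r8a8 may be reinterpreted into one another, while
 * z24s8 -> r8g8b8a8 is rejected despite the equal 32-bit total. */
static int same_typeless_group(const struct fmt_layout *a,
                               const struct fmt_layout *b)
{
    for (int i = 0; i < 4; i++)
        if (a->bits[i] != b->bits[i])
            return 0;
    return 1;
}
```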
From: Luca B. <lu...@lu...> - 2010-09-06 17:42:11
|
How about dropping the idea that "resource_copy_region must be just a memcpy" and have the driver instruct the hardware 2D blitter to write 1s in the alpha channel if supported by hw or have u_blitter do this in the shader? nv30/nv40 and apparently nv50 can do this in the 2D blitter, and all Radeons seem to use the 3D engine, which obviously can do it in the shader. We may also want to allow actual conversion between arbitrary formats, since again u_blitter can do it trivially, and so can most/all hardware 2D engines. |
From: Marek O. <ma...@gm...> - 2010-09-06 17:22:54
|
On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca <jfo...@vm...> wrote: > I'd like to know if there's any objection to change the > resource_copy_region semantics to allow copies between different yet > compatible formats, where the definition of compatible formats is: > > "formats for which copying the bytes from the source resource > unmodified to the destination resource will achieve the same effect of a > textured quad blitter" > > There is an helper function util_is_format_compatible() to help making > this decision, and these are the non-trivial conversions that this > function currently recognizes, (which was produced by > u_format_compatible_test.c): > > b8g8r8a8_unorm -> b8g8r8x8_unorm > This specific case (and others) might not work, because there are no 0/1 swizzles when blending pixels with the framebuffer, e.g. see this sequence of operations: - Blit from b8g8r8a8 to b8g8r8x8. - x8 now contains a8. - Bind b8g8r8x8 as a colorbuffer. - Use blending with the destination alpha channel. - The original a8 is read instead of 1 (x8) because of lack of swizzles. The blitter and other util functions just need to be extended to explicitly write 1 instead of copying the alpha channel. Something likes this is already done in st/mesa, see the function compatible_src_dst_formats. Marek a8r8g8b8_unorm -> x8r8g8b8_unorm > b5g5r5a1_unorm -> b5g5r5x1_unorm > b4g4r4a4_unorm -> b4g4r4x4_unorm > l8_unorm -> r8_unorm > i8_unorm -> l8_unorm > i8_unorm -> a8_unorm > i8_unorm -> r8_unorm > l16_unorm -> r16_unorm > z24_unorm_s8_uscaled -> z24x8_unorm > s8_uscaled_z24_unorm -> x8z24_unorm > r8g8b8a8_unorm -> r8g8b8x8_unorm > a8b8g8r8_srgb -> x8b8g8r8_srgb > b8g8r8a8_srgb -> b8g8r8x8_srgb > a8r8g8b8_srgb -> x8r8g8b8_srgb > a8b8g8r8_unorm -> x8b8g8r8_unorm > r10g10b10a2_uscaled -> r10g10b10x2_uscaled > r10sg10sb10sa2u_norm -> r10g10b10x2_snorm > > Note that format compatibility is not commutative. 
> > For software drivers this means that memcpy/util_copy_rect() will > achieve the correct result. > > For hardware drivers this means that a VRAM->VRAM 2D blit engine will > also achieve the correct result. > > So I'd expect no implementation change of resource_copy_region() for any > driver AFAICT. But I'd like to be sure. > > Jose |
From: José F. <jfo...@vm...> - 2010-09-06 15:47:16
|
On Mon, 2010-09-06 at 08:11 -0700, Roland Scheidegger wrote: > On 06.09.2010 15:57, José Fonseca wrote: > > I'd like to know if there's any objection to change the > > resource_copy_region semantics to allow copies between different yet > > compatible formats, where the definition of compatible formats is: > > > > "formats for which copying the bytes from the source resource > > unmodified to the destination resource will achieve the same effect of a > > textured quad blitter" > > > > There is an helper function util_is_format_compatible() to help making > > this decision, and these are the non-trivial conversions that this > > function currently recognizes, (which was produced by > > u_format_compatible_test.c): > > > > b8g8r8a8_unorm -> b8g8r8x8_unorm > > a8r8g8b8_unorm -> x8r8g8b8_unorm > > b5g5r5a1_unorm -> b5g5r5x1_unorm > > b4g4r4a4_unorm -> b4g4r4x4_unorm > > l8_unorm -> r8_unorm > > i8_unorm -> l8_unorm > > i8_unorm -> a8_unorm > > i8_unorm -> r8_unorm > > l16_unorm -> r16_unorm > > z24_unorm_s8_uscaled -> z24x8_unorm > > s8_uscaled_z24_unorm -> x8z24_unorm > > r8g8b8a8_unorm -> r8g8b8x8_unorm > > a8b8g8r8_srgb -> x8b8g8r8_srgb > > b8g8r8a8_srgb -> b8g8r8x8_srgb > > a8r8g8b8_srgb -> x8r8g8b8_srgb > > a8b8g8r8_unorm -> x8b8g8r8_unorm > > r10g10b10a2_uscaled -> r10g10b10x2_uscaled > > r10sg10sb10sa2u_norm -> r10g10b10x2_snorm > > > > Note that format compatibility is not commutative. > > > > For software drivers this means that memcpy/util_copy_rect() will > > achieve the correct result. > > > > For hardware drivers this means that a VRAM->VRAM 2D blit engine will > > also achieve the correct result. > > > > So I'd expect no implementation change of resource_copy_region() for any > > driver AFAICT. But I'd like to be sure. > > > > Jose > > José, > > this looks good to me. 
Note that the analogous function in d3d10, > ResourceCopyRegion, only requires formats to be in the same typeless > group (hence same number of bits for all components), which is certainly > a broader set of compatible formats to what util_is_format_compatible() > is outputting. As far as I can tell, no conversion is happening at all > in d3d10, this is just like memcpy. I think we might want to support > that in the future as well, but for now extending this to the formats > you listed certainly sounds ok. Yes, that makes sense. Thanks for the feedback, Roland. Jose |
From: Luca B. <lu...@lu...> - 2010-09-06 15:16:18
|
On Mon, Sep 6, 2010 at 3:57 PM, José Fonseca <jfo...@vm...> wrote: > I'd like to know if there's any objection to change the > resource_copy_region semantics to allow copies between different yet > compatible formats, where the definition of compatible formats is: I was about to propose something like this. How about a much more powerful change though, that would make any pair of non-blocked format of the same bit depth compatible? This way you could copy z24s8 to r8g8b8a8, for instance. In addition to this, how about explicitly allowing sampler views to use a compatible format, and add the ability for surfaces to use a compatible format too? (with a new parameter to get_tex_surface) This would allow for instance to implement glBlitFramebuffer on stencil buffers by reinterpreting the buffer as r8g8b8a8, and allow the blitter module to copy depth/stencil buffers by simply treating them as color buffers. The only issue is that some drivers might hold depth/stencil surfaces in compressed formats that cannot be interpreted as a color format, and not have any mechanism for keeping temporaries or doing conversions internally. DirectX seems to have something like this with the _TYPELESS formats. |
From: Roland S. <sr...@vm...> - 2010-09-06 15:12:05
|
On 06.09.2010 15:57, José Fonseca wrote: > I'd like to know if there's any objection to change the > resource_copy_region semantics to allow copies between different yet > compatible formats, where the definition of compatible formats is: > > "formats for which copying the bytes from the source resource > unmodified to the destination resource will achieve the same effect of a > textured quad blitter" > > There is an helper function util_is_format_compatible() to help making > this decision, and these are the non-trivial conversions that this > function currently recognizes, (which was produced by > u_format_compatible_test.c): > > b8g8r8a8_unorm -> b8g8r8x8_unorm > a8r8g8b8_unorm -> x8r8g8b8_unorm > b5g5r5a1_unorm -> b5g5r5x1_unorm > b4g4r4a4_unorm -> b4g4r4x4_unorm > l8_unorm -> r8_unorm > i8_unorm -> l8_unorm > i8_unorm -> a8_unorm > i8_unorm -> r8_unorm > l16_unorm -> r16_unorm > z24_unorm_s8_uscaled -> z24x8_unorm > s8_uscaled_z24_unorm -> x8z24_unorm > r8g8b8a8_unorm -> r8g8b8x8_unorm > a8b8g8r8_srgb -> x8b8g8r8_srgb > b8g8r8a8_srgb -> b8g8r8x8_srgb > a8r8g8b8_srgb -> x8r8g8b8_srgb > a8b8g8r8_unorm -> x8b8g8r8_unorm > r10g10b10a2_uscaled -> r10g10b10x2_uscaled > r10sg10sb10sa2u_norm -> r10g10b10x2_snorm > > Note that format compatibility is not commutative. > > For software drivers this means that memcpy/util_copy_rect() will > achieve the correct result. > > For hardware drivers this means that a VRAM->VRAM 2D blit engine will > also achieve the correct result. > > So I'd expect no implementation change of resource_copy_region() for any > driver AFAICT. But I'd like to be sure. > > Jose José, this looks good to me. Note that the analogous function in d3d10, ResourceCopyRegion, only requires formats to be in the same typeless group (hence same number of bits for all components), which is certainly a broader set of compatible formats to what util_is_format_compatible() is outputting. 
As far as I can tell, no conversion is happening at all in d3d10; this is just like a memcpy. I think we might want to support that in the future as well, but for now extending this to the formats you listed certainly sounds ok. Roland |
From: José F. <jfo...@vm...> - 2010-09-06 13:57:26
|
I'd like to know if there's any objection to changing the resource_copy_region semantics to allow copies between different yet compatible formats, where the definition of compatible formats is: "formats for which copying the bytes from the source resource unmodified to the destination resource will achieve the same effect as a textured quad blitter" There is a helper function util_is_format_compatible() to help make this decision, and these are the non-trivial conversions that this function currently recognizes (the list was produced by u_format_compatible_test.c):

b8g8r8a8_unorm -> b8g8r8x8_unorm
a8r8g8b8_unorm -> x8r8g8b8_unorm
b5g5r5a1_unorm -> b5g5r5x1_unorm
b4g4r4a4_unorm -> b4g4r4x4_unorm
l8_unorm -> r8_unorm
i8_unorm -> l8_unorm
i8_unorm -> a8_unorm
i8_unorm -> r8_unorm
l16_unorm -> r16_unorm
z24_unorm_s8_uscaled -> z24x8_unorm
s8_uscaled_z24_unorm -> x8z24_unorm
r8g8b8a8_unorm -> r8g8b8x8_unorm
a8b8g8r8_srgb -> x8b8g8r8_srgb
b8g8r8a8_srgb -> b8g8r8x8_srgb
a8r8g8b8_srgb -> x8r8g8b8_srgb
a8b8g8r8_unorm -> x8b8g8r8_unorm
r10g10b10a2_uscaled -> r10g10b10x2_uscaled
r10sg10sb10sa2u_norm -> r10g10b10x2_snorm

Note that format compatibility is not commutative. For software drivers this means that memcpy/util_copy_rect() will achieve the correct result. For hardware drivers this means that a VRAM->VRAM 2D blit engine will also achieve the correct result. So I'd expect no implementation change of resource_copy_region() for any driver AFAICT. But I'd like to be sure. Jose |
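The one-directional relation José lists can be sketched as a lookup over a hard-coded table. The real util_is_format_compatible() derives compatibility from the format descriptions rather than a list, so this is only an illustration of the semantics, with a hypothetical helper name:

```c
#include <string.h>

/* A few of the one-directional pairs from the list above. */
struct compat_pair { const char *src, *dst; };

static const struct compat_pair compat_table[] = {
    { "b8g8r8a8_unorm", "b8g8r8x8_unorm" },
    { "a8r8g8b8_unorm", "x8r8g8b8_unorm" },
    { "i8_unorm",       "l8_unorm"       },
    { "l16_unorm",      "r16_unorm"      },
};

/* Compatibility is not commutative: dropping alpha into padding is
 * safe, but reading padding back as alpha is not, so only the
 * src -> dst direction is checked. */
static int is_format_compatible(const char *src, const char *dst)
{
    if (strcmp(src, dst) == 0)
        return 1; /* identical formats are trivially compatible */
    for (size_t i = 0; i < sizeof compat_table / sizeof compat_table[0]; i++)
        if (strcmp(compat_table[i].src, src) == 0 &&
            strcmp(compat_table[i].dst, dst) == 0)
            return 1;
    return 0;
}
```

Querying the table in the reverse direction (e.g. b8g8r8x8 -> b8g8r8a8) returns 0, matching the non-commutativity note above.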
From: Dan N. <dbn...@gm...> - 2010-08-04 19:08:04

2010/8/4 Tomáš Chvátal <sca...@ge...>:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 2.12.2009 03:33, Dan Nicholson wrote:
>> 2009/12/1 Tomáš Chvátal <sca...@ge...>:
>>> Hi,
>>> I am cleaning up bugs open in Gentoo for Mesa and I found one with a
>>> patch attached to it, which is applicable upstream.
>>>
>>> The patch just allows us to disable writable relocations in GL. It is
>>> a "must" for users using PaX and others. For a more detailed
>>> rationale please look at the bug in our bugzilla [1].
>>>
>>> Jeremy Huddleston created the patch I attach here and it really fixes
>>> the problem.
>>>
>>> So my question is, could you consider applying this for 7.6, 7.7 and
>>> trunk?
>>
>> Yeah, that works, although I don't know what the else part is for
>> (enable_glx_rts would already be no).
>>
>> Signed-off-by: Dan Nicholson <dbn...@gm...>
>>
>> --
>> Dan
>
> So guys, sorry for restoring this thread, but is this going to happen
> or not?
>
> I know they should consider not using DRI at all if they want their
> security, but the patch will please them for now and it is just a few
> lines in configure.ac that will default to off :]

Forgot about this. I don't have any problem with it, and the configure
option only makes the already existing macro visible. If no one pipes up
with any objections, I'll commit it in a couple of days.

--
Dan
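For illustration, the "few lines in configure.ac that will default to off" could take roughly the following shape. This is a hypothetical sketch, not Jeremy Huddleston's actual patch from the Gentoo bug; the option name mirrors the enable_glx_rts variable mentioned above, and the GLX_RTS define is invented here:

```autoconf
dnl Hypothetical sketch: an opt-in switch, default off, that merely
dnl makes an already existing build macro visible to configure users.
AC_ARG_ENABLE([glx-rts],
    [AS_HELP_STRING([--enable-glx-rts],
        [build GLX with writable relocations (default: disabled)])],
    [enable_glx_rts="$enableval"],
    [enable_glx_rts=no])
AS_IF([test "x$enable_glx_rts" = xyes],
    [DEFINES="$DEFINES -DGLX_RTS"])
```

Because the default branch sets enable_glx_rts=no, users who say nothing get the hardened (relocations-disabled) behavior the PaX users asked for.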
From: Tomáš C. <sca...@ge...> - 2010-08-04 16:05:19

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2.12.2009 03:33, Dan Nicholson wrote:
> 2009/12/1 Tomáš Chvátal <sca...@ge...>:
>> Hi,
>> I am cleaning up bugs open in Gentoo for Mesa and I found one with a
>> patch attached to it, which is applicable upstream.
>>
>> The patch just allows us to disable writable relocations in GL. It is
>> a "must" for users using PaX and others. For a more detailed rationale
>> please look at the bug in our bugzilla [1].
>>
>> Jeremy Huddleston created the patch I attach here and it really fixes
>> the problem.
>>
>> So my question is, could you consider applying this for 7.6, 7.7 and
>> trunk?
>
> Yeah, that works, although I don't know what the else part is for
> (enable_glx_rts would already be no).
>
> Signed-off-by: Dan Nicholson <dbn...@gm...>
>
> --
> Dan

So guys, sorry for restoring this thread, but is this going to happen or
not?

I know they should consider not using DRI at all if they want their
security, but the patch will please them for now and it is just a few
lines in configure.ac that will default to off :]

Cheers

- --------
Tomáš Chvátal
Gentoo Linux Developer [Clustering/Council/KDE/QA/Sci/X11]
E-Mail : sca...@ge...
GnuPG FP : 94A4 5CCD 85D3 DE24 FE99 F924 1C1E 9CDE 0341 4587
GnuPG ID : 03414587
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkxZj6wACgkQHB6c3gNBRYc1bwCgyUc8z5s59jslXn/Ul+S0Km/z
fOIAn3dP5b6VLQ2/E9g1H1VEFc9dtEMF
=aWWG
-----END PGP SIGNATURE-----
From: Mario K. <mar...@tu...> - 2010-08-02 19:10:05

On Aug 2, 2010, at 3:55 PM, Kristian Høgsberg wrote:
>
> I changed the code to just drop the lock while we create and
> initialize the glx display. Once we're ready to add it to the list,
> we take the lock again. After making sure nobody beat us to
> initializing glx on the display, we add it to the global list. I
> think that should work.
>

Looks good to me and works well for my test cases/toolkit. Thanks.

In some of the error exits in __glXInitialize() there are some
superfluous leftover XUnlockMutex() calls for the no longer held mutex.
Not harmful, I guess, but they could be deleted.

best,
-mario
From: Mario K. <mar...@tu...> - 2010-08-02 19:05:35

On Aug 2, 2010, at 3:41 PM, Jerome Glisse wrote:
> I will push these patches once i deal with the bugs they
> introduce.
>
> Jerome

I see it has "landed". Works. Thanks.

-mario
From: Jerome G. <gl...@fr...> - 2010-08-02 14:06:37

On 08/01/2010 11:05 PM, Mario Kleiner wrote:
> Hello,
>
> could you please review & apply the following patch:
>
> When a DRI2 swap buffer is pending we need to make sure we
> have the flush extension so radeon doesn't resume rendering to
> or reading from the not yet blitted front buffer.
>
> This fixes:
>
> https://bugs.freedesktop.org/show_bug.cgi?id=28341
> https://bugs.freedesktop.org/show_bug.cgi?id=28410
>
> Signed-off-by: Jerome Glisse <jg...@re...>
> Signed-off-by: Mario Kleiner <mar...@tu...>
>
> thanks,
> -mario

I will push these patches once I deal with the bugs they introduce.

Jerome
From: Kristian H. <kr...@bi...> - 2010-08-02 13:55:54

On Sun, Aug 1, 2010 at 11:19 PM, Mario Kleiner <mar...@tu...> wrote:
> Think I sent it to the wrong mailing list. Here we go again...
>
> Begin forwarded message:
>
>> From: Mario Kleiner <mar...@tu...>
>> Date: July 27, 2010 2:56:04 AM GMT+02:00
>> To: Kristian Hogsberg <kr...@bi...>, xor...@li...
>> Subject: Bug: Deadlock for multi-threaded glx apps inside
>> __glXInitialize()
>>
>> Hi Kristian,
>>
>> Testing with current Mesa master, my toolkit deadlocks on the first
>> call to a glX function (glXChooseVisual()).
>>
>> The deadlock was probably introduced by your recent commit:
>>
>> "glx: Use _Xglobal_lock for protecting extension display list"
>>
>> ab434f6b7641a64d30725a9ac24929240362d466
>>
>> The problem is that the _Xglobal_lock is locked twice inside the
>> __glXInitialize() function of mesa/src/glx/glxext.c, once inside
>> __glXInitialize(), and then as part of dri2CreateDisplay() -> ... ->
>> XextFindDisplay(). The second locking call on the already held lock
>> deadlocks.
>>
>> Attached is a backtrace with the problem and a patch/hack that "fixes"
>> it for me, but introduces a race condition itself, so this is
>> obviously not the correct solution. The race condition would trigger
>> if two threads did their first call to a glX function for the same
>> Display* dpy handle simultaneously, which is rather unlikely to happen
>> in practice? If so, then the patch might be an acceptable fix until a
>> better solution is found?

I changed the code to just drop the lock while we create and initialize
the glx display. Once we're ready to add it to the list, we take the
lock again. After making sure nobody beat us to initializing glx on the
display, we add it to the global list. I think that should work.

>> Other applications (glxgears, games etc.) work correctly. The
>> difference is that my toolkit calls XInitThreads() at startup to use
>> Xlib multi-threaded, and AFAIK the locking calls only get enabled if
>> Xlib is switched to thread-safe mode, otherwise they are no-ops? If I
>> omit XInitThreads(), everything works again.

Yes, that makes sense, that's why I didn't see it.

thanks,
Kristian
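The fix Kristian describes is a standard optimistic-initialization pattern: look up under the lock, initialize with the lock dropped, then retake the lock and re-check before publishing. A hedged sketch follows; every name in it is invented for illustration (the real code lives in __glXInitialize() in mesa/src/glx/glxext.c and uses _Xglobal_lock, not a private pthread mutex):

```c
#include <pthread.h>
#include <stdlib.h>

/* Minimal stand-in for the per-display GLX state kept on a global list. */
struct glx_display { int dpy_id; struct glx_display *next; };

static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static struct glx_display *displays;

/* Caller must hold global_lock. */
static struct glx_display *find_display_locked(int dpy_id)
{
    for (struct glx_display *d = displays; d; d = d->next)
        if (d->dpy_id == dpy_id)
            return d;
    return NULL;
}

struct glx_display *glx_initialize(int dpy_id)
{
    pthread_mutex_lock(&global_lock);
    struct glx_display *d = find_display_locked(dpy_id);
    pthread_mutex_unlock(&global_lock);
    if (d)
        return d;

    /* Slow path: initialize WITHOUT the lock held, so code that itself
     * takes the global lock (the XextFindDisplay() path in the bug
     * report) cannot deadlock against us. */
    d = calloc(1, sizeof(*d));
    if (!d)
        return NULL;
    d->dpy_id = dpy_id;

    /* Retake the lock and re-check: another thread may have finished
     * initializing the same display while we were unlocked. */
    pthread_mutex_lock(&global_lock);
    struct glx_display *winner = find_display_locked(dpy_id);
    if (winner) {
        pthread_mutex_unlock(&global_lock);
        free(d);            /* discard our copy, use the winner's */
        return winner;
    }
    d->next = displays;     /* publish under the lock */
    displays = d;
    pthread_mutex_unlock(&global_lock);
    return d;
}
```

The cost of the race is bounded: at worst two threads both build a display and one copy is thrown away, which is harmless, unlike the original double-lock deadlock or the unguarded hack's duplicated list entries.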
From: Mario K. <mar...@tu...> - 2010-08-02 03:19:35

Think I sent it to the wrong mailing list. Here we go again...

Begin forwarded message:

> From: Mario Kleiner <mar...@tu...>
> Date: July 27, 2010 2:56:04 AM GMT+02:00
> To: Kristian Hogsberg <kr...@bi...>, xor...@li...
> Subject: Bug: Deadlock for multi-threaded glx apps inside
> __glXInitialize()
>
> Hi Kristian,
>
> Testing with current Mesa master, my toolkit deadlocks on the first
> call to a glX function (glXChooseVisual()).
>
> The deadlock was probably introduced by your recent commit:
>
> "glx: Use _Xglobal_lock for protecting extension display list"
>
> ab434f6b7641a64d30725a9ac24929240362d466
>
> The problem is that the _Xglobal_lock is locked twice inside the
> __glXInitialize() function of mesa/src/glx/glxext.c, once inside
> __glXInitialize(), and then as part of dri2CreateDisplay() -> ... ->
> XextFindDisplay(). The second locking call on the already held lock
> deadlocks.
>
> Attached is a backtrace with the problem and a patch/hack that "fixes"
> it for me, but introduces a race condition itself, so this is
> obviously not the correct solution. The race condition would trigger
> if two threads did their first call to a glX function for the same
> Display* dpy handle simultaneously, which is rather unlikely to happen
> in practice? If so, then the patch might be an acceptable fix until a
> better solution is found?
>
> Other applications (glxgears, games etc.) work correctly. The
> difference is that my toolkit calls XInitThreads() at startup to use
> Xlib multi-threaded, and AFAIK the locking calls only get enabled if
> Xlib is switched to thread-safe mode, otherwise they are no-ops? If I
> omit XInitThreads(), everything works again.
>
> thanks,
> -mario
From: Mario K. <mar...@tu...> - 2010-08-02 03:05:42

Hello,

could you please review & apply the following patch:

When a DRI2 swap buffer is pending we need to make sure we have the
flush extension so radeon doesn't resume rendering to or reading from
the not yet blitted front buffer.

This fixes:

https://bugs.freedesktop.org/show_bug.cgi?id=28341
https://bugs.freedesktop.org/show_bug.cgi?id=28410

Signed-off-by: Jerome Glisse <jg...@re...>
Signed-off-by: Mario Kleiner <mar...@tu...>

thanks,
-mario