From: Luca B. <lu...@lu...> - 2010-04-01 22:29:26
|
The half float conversion code initially used a global C++ constructor to initialize the tables. That fails to work since ld does not get the symbol from the shared library, so I changed to use register a global constructor from C, using GCC- or MSVC-specific code. This is not necessarily the best option, but clearly putting a check in the functions as Corbin did is a bad idea performance-wise. So, how should this be done? Options are: 1. Revert Corbin Simpsons's commit and if anyone complains about an unsupported compiler, implement UTIL_INIT for that compiler too 2a. Write the init module in C++ and use portable global constructor syntax (but with other C++-related problems) 2b. Write an auxiliary file in C++ with a global constructor and reference it from the C init file so the static linker pulls it from the .a 3. Have all modules that need half float conversion directly or indirectly call the init functions in their init code 4. Make the build pregenerate the tables and ship them in the executables Option 1: Pros: just works, other auxiliary modules can use the same system Cons: need to write UTIL_INIT for each compiler, only GCC and MSVC (and compatible ones) are currently supported Option 2a: Pros: no compiler-specific UTIL_INIT Cons: significant code written in C++ instead of C and you must link all targets with a C++ compiler or use compiler-specific options to prevent stuff like the G++ personality causing the link to fail Option 2b: Like option 2a, but Pro: less code written in C++ Con: need an extra C++ file for every module with data to be initialized Option 3: Pros: none of the cons of the other options Cons: cumbersome to do, must not forget to call the init function or you get silent corruption. Init calls creep through the whole codebase. Option 4: Pros: no need to do anything at runtime, pages can be shared between OpenGL apps Cons: need to write a special table generator, all DRI drivers get 10KB larger, solution does not apply to all similar problems I would lean for either option 1 or option 4, perhaps option 4 considering there seems to be disagreement over option 1. Option 4 however is likely not universally applicable (not everything can necessarily be pregenerated), so I'd keep the UTIL_INIT code anyway. Which one do we pick? |
From: Brian P. <bri...@gm...> - 2010-04-01 22:39:00
|
On Thu, Apr 1, 2010 at 4:29 PM, Luca Barbieri <lu...@lu...> wrote: > The half float conversion code initially used a global C++ constructor > to initialize the tables. > > That fails to work since ld does not get the symbol from the shared > library, so I changed to use register a global constructor from C, > using GCC- or MSVC-specific code. > > This is not necessarily the best option, but clearly putting a check > in the functions as Corbin did is a bad idea performance-wise. > > So, how should this be done? > > Options are: > 1. Revert Corbin Simpsons's commit and if anyone complains about an > unsupported compiler, implement UTIL_INIT for that compiler too > 2a. Write the init module in C++ and use portable global constructor > syntax (but with other C++-related problems) > 2b. Write an auxiliary file in C++ with a global constructor and > reference it from the C init file so the static linker pulls it from > the .a > 3. Have all modules that need half float conversion directly or > indirectly call the init functions in their init code > 4. Make the build pregenerate the tables and ship them in the executables > > Option 1: > Pros: just works, other auxiliary modules can use the same system > Cons: need to write UTIL_INIT for each compiler, only GCC and MSVC > (and compatible ones) are currently supported > > Option 2a: > Pros: no compiler-specific UTIL_INIT > Cons: significant code written in C++ instead of C and you must link > all targets with a C++ compiler or use compiler-specific options to > prevent stuff like the G++ personality causing the link to fail > > Option 2b: > Like option 2a, but > Pro: less code written in C++ > Con: need an extra C++ file for every module with data to be initialized > > Option 3: > Pros: none of the cons of the other options > Cons: cumbersome to do, must not forget to call the init function or > you get silent corruption. Init calls creep through the whole > codebase. > > Option 4: > Pros: no need to do anything at runtime, pages can be shared between OpenGL apps > Cons: need to write a special table generator, all DRI drivers get > 10KB larger, solution does not apply to all similar problems > > I would lean for either option 1 or option 4, perhaps option 4 > considering there seems to be disagreement over option 1. > Option 4 however is likely not universally applicable (not everything > can necessarily be pregenerated), so I'd keep the UTIL_INIT code > anyway. > > Which one do we pick? #3. Yeah, in this day and age we should have a nice way to do global constructors but we don't. This solution is simple. A debug-only assert could be used to check that the table's initialized. It's not that big of deal. -Brian |
From: Luca B. <lu...@lu...> - 2010-04-01 23:00:22
|
OK. I'd like to add that u_atomic.h already requires either GCC, MSVC or Solaris, and GCC and MSVC are already supported by this system. Thus we do indeed now have a simple way to do global constructors, that only removes support for non-GCC Solaris until someone figures out how to do that. And it's relatively simple, you just have to figure out the section name of the global constructor table, and how to instruct the specific compiler to put a variable in a specific section. GCC even has __attribute__((constructor)) which does it all for you. At any rate, util_format_s3tc_init has similar issues, and is currently called from a few places. I think the best thing to do to implement your suggestion is adding an util_format_init that calls both init functions and leave the UTIL_INIT code in place (since it seems we now got it right): it is easy to remove by deleting u_init.h if desired. |
From: Luca B. <lu...@lu...> - 2010-04-02 01:17:16
|
Are you sure about this? I've tried doing it, and it turns out that basically every Gallium module needs this initialized. For instance: st/mesa due to glReadPixels vg/mesa due to vgReadPixels st/python due to mesa sampling several programs in rbug to dump textures to disk softpipe due to texture sampling nv50 due to static attrbutes Also, if translate did not needlessly duplicate the generic format support, it would also need it, and draw would too. Basically everything in Gallium will end up having util_format initialized, and it seems there are at least 10 different places in the code where such a call would need to be added (including strange places like rbug with call pipe*tile* which calls util_format_read*). I added it for nv50 before realizing the extent of the changes needed, but now think it is not really a feasible solution. In other words, I think this should be revisited as it results in cluttering the codebase and creating a somewhat unreliable system. I believe that we should either use the global constructor-like technique I introduced, or do the following: 1. Pregenerate half float tables 2. Initialize the S3TC function pointers to stubs that dlopen the library, initialize the function pointers to the real functions and then delegate to the real function corresponding to the stub More specifically, I think this two-step approach is superior to the global constructor, but that the global constructor technique may be useful in other cases where it is not possible to either pregenerate or have a "free initialization check" due to the S3TC dynamic loading. |
From: Corbin S. <mos...@gm...> - 2010-04-02 02:24:06
|
On Thu, Apr 1, 2010 at 6:17 PM, Luca Barbieri <lu...@lu...> wrote: > Are you sure about this? > > I've tried doing it, and it turns out that basically every Gallium > module needs this initialized. > > For instance: > st/mesa due to glReadPixels > vg/mesa due to vgReadPixels > st/python due to mesa sampling > several programs in rbug to dump textures to disk > softpipe due to texture sampling > nv50 due to static attrbutes > > Also, if translate did not needlessly duplicate the generic format > support, it would also need it, and draw would too. > > Basically everything in Gallium will end up having util_format > initialized, and it seems there are at least 10 different places in > the code where such a call would need to be added (including strange > places like rbug with call pipe*tile* which calls util_format_read*). > > I added it for nv50 before realizing the extent of the changes needed, > but now think it is not really a feasible solution. > > In other words, I think this should be revisited as it results in > cluttering the codebase and creating a somewhat unreliable system. > > I believe that we should either use the global constructor-like > technique I introduced, or do the following: > 1. Pregenerate half float tables > 2. Initialize the S3TC function pointers to stubs that dlopen the > library, initialize the function pointers to the real functions and > then delegate to the real function corresponding to the stub > > More specifically, I think this two-step approach is superior to the > global constructor, but that the global constructor technique may be > useful in other cases where it is not possible to either pregenerate > or have a "free initialization check" due to the S3TC dynamic loading. I still think you might be overthinking this, but whatever. :3 ~ C. -- When the facts change, I change my mind. What do you do, sir? ~ Keynes Corbin Simpson <Mos...@gm...> |
From: Jose F. <jfo...@vm...> - 2010-04-02 07:51:10
|
Reliability goes both ways: using a global constructor simplifies the problem substantially, but it is only as reliable as the global constructor mechanism itself -- which it's not much given the compiler/linker magic necessary to work. I also think that we should code generate most constat lookup tables: it avoids the initialization problem, and it saves memory when the SO is loaded in several processes as pages can be shared. For S3TC it is not a real problem: any code that exposes S3TC support needs to check if S3TC (de)compresion support is available before advertising (xxx_screen.c). Checking the support and initialization should be done by a single function. That is, if the libtxc_dxtn.so is not available the s3tc functions should *never* be called. Jose ________________________________________ From: Luca Barbieri [lu...@lu...] Sent: Friday, April 02, 2010 2:17 To: Brian Paul Cc: mes...@li... Subject: Re: [Mesa3d-dev] How do we init half float tables? Are you sure about this? I've tried doing it, and it turns out that basically every Gallium module needs this initialized. For instance: st/mesa due to glReadPixels vg/mesa due to vgReadPixels st/python due to mesa sampling several programs in rbug to dump textures to disk softpipe due to texture sampling nv50 due to static attrbutes Also, if translate did not needlessly duplicate the generic format support, it would also need it, and draw would too. Basically everything in Gallium will end up having util_format initialized, and it seems there are at least 10 different places in the code where such a call would need to be added (including strange places like rbug with call pipe*tile* which calls util_format_read*). I added it for nv50 before realizing the extent of the changes needed, but now think it is not really a feasible solution. In other words, I think this should be revisited as it results in cluttering the codebase and creating a somewhat unreliable system. I believe that we should either use the global constructor-like technique I introduced, or do the following: 1. Pregenerate half float tables 2. Initialize the S3TC function pointers to stubs that dlopen the library, initialize the function pointers to the real functions and then delegate to the real function corresponding to the stub More specifically, I think this two-step approach is superior to the global constructor, but that the global constructor technique may be useful in other cases where it is not possible to either pregenerate or have a "free initialization check" due to the S3TC dynamic loading. ------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ Mesa3d-dev mailing list Mes...@li... https://lists.sourceforge.net/lists/listinfo/mesa3d-dev |
From: Luca B. <lu...@lu...> - 2010-04-02 15:09:32
|
On Fri, Apr 2, 2010 at 9:51 AM, Jose Fonseca <jfo...@vm...> wrote: > Reliability goes both ways: using a global constructor simplifies the problem substantially, but it is only as reliable as the global constructor mechanism itself -- which it's not much given the compiler/linker magic necessary to work. Once it is written and works, it should just work indefinitely, since the C++ ABI relies on it. > I also think that we should code generate most constat lookup tables: it avoids the initialization problem, and it saves memory when the SO is loaded in several processes as pages can be shared. Yes, I have done exactly this, for the reasons you describe. The half float tables are now pre-generated by a Python script. > For S3TC it is not a real problem: any code that exposes S3TC support needs to check if S3TC (de)compresion support is available before advertising (xxx_screen.c). Checking the support and initialization should be done by a single function. That is, if the libtxc_dxtn.so is not available the s3tc functions should *never* be called. I changed the format code to add a new "util_format_is_supported" that allows users to query if the util_format code can pack/unpack a given format. That function, if called on an s3tc format, will automatically call the S3TC init function. This way, users of the format helpers no longer need to know about S3TC at all, and it will just work. Even if the util_format_is_supported function is not called before reading/writing an S3TC format, the fetch/pack function pointers are initialized to versions that will autoload the S3TC library. Additionally, the S3TC library may now support only a subset of the formats. This may be even more useful as further compressed formats are added. There are two areas where I'm not totally happy with the approach I did though: 1. Currently upon a successful load of S3TC, we hack the format descriptions to set the is_supported bit to 1. Not sure if it is the best way to do it. 2. If S3TC is not available, the functions just do nothing: this is consistent with other unsupported formats. We may however want to read black pixels, throw an error, or read a special "error" image instead (I'd suggest having "unsupported" unpack/pack functions and pointing all unsupported formats to them) Also, I think we should remove all the Mesa internal format code and make it use util_format instead, unless there is really a good reason to duplicate it all. Same thing for util_half. |
From: Brian P. <br...@vm...> - 2010-04-02 16:07:31
|
Luca Barbieri wrote: > On Fri, Apr 2, 2010 at 9:51 AM, Jose Fonseca <jfo...@vm...> wrote: >> Reliability goes both ways: using a global constructor simplifies the problem substantially, but it is only as reliable as the global constructor mechanism itself -- which it's not much given the compiler/linker magic necessary to work. > > Once it is written and works, it should just work indefinitely, since > the C++ ABI relies on it. > >> I also think that we should code generate most constat lookup tables: it avoids the initialization problem, and it saves memory when the SO is loaded in several processes as pages can be shared. > > Yes, I have done exactly this, for the reasons you describe. > The half float tables are now pre-generated by a Python script. > >> For S3TC it is not a real problem: any code that exposes S3TC support needs to check if S3TC (de)compresion support is available before advertising (xxx_screen.c). Checking the support and initialization should be done by a single function. That is, if the libtxc_dxtn.so is not available the s3tc functions should *never* be called. > > I changed the format code to add a new "util_format_is_supported" that > allows users to query if the util_format code can pack/unpack a given > format. > > That function, if called on an s3tc format, will automatically call > the S3TC init function. > This way, users of the format helpers no longer need to know about > S3TC at all, and it will just work. > > Even if the util_format_is_supported function is not called before > reading/writing an S3TC format, the fetch/pack function pointers are > initialized to versions that will autoload the S3TC library. > > Additionally, the S3TC library may now support only a subset of the > formats. This may be even more useful as further compressed formats > are added. > > There are two areas where I'm not totally happy with the approach I did though: > 1. Currently upon a successful load of S3TC, we hack the format > descriptions to set the is_supported bit to 1. Not sure if it is the > best way to do it. > 2. If S3TC is not available, the functions just do nothing: this is > consistent with other unsupported formats. We may however want to read > black pixels, throw an error, or read a special "error" image instead > (I'd suggest having "unsupported" unpack/pack functions and pointing > all unsupported formats to them) > > Also, I think we should remove all the Mesa internal format code and > make it use util_format instead, unless there is really a good reason > to duplicate it all. > Same thing for util_half. So far, there are no dependencies on Gallium in core Mesa. We've talked about refactoring some of the Gallium code into a separate module that'd be sharable between Gallium and Mesa. The format code would probably fit into that. -Brian |
From: Luca B. <lu...@lu...> - 2010-04-02 18:46:14
|
> So far, there are no dependencies on Gallium in core Mesa. > > We've talked about refactoring some of the Gallium code into a separate > module that'd be sharable between Gallium and Mesa. The format code would > probably fit into that. Can't we just unconditionally pull gallium/auxiliary in Mesa? (unused stuff will be ignored by the linker due to .a behavior) |
From: Roland S. <sr...@vm...> - 2010-04-02 16:27:38
|
On 02.04.2010 17:09, Luca Barbieri wrote: > Additionally, the S3TC library may now support only a subset of the > formats. This may be even more useful as further compressed formats > are added. FWIW, I don't see any new s3tc formats. rgtc will not be handled by s3tc library since it isn't patent encumbered. util_format_is_s3tc will not include rgtc formats. (Though I guess that external decoding per-pixel is really rather lame, should do it per-block...) Roland |
From: Luca B. <lu...@lu...> - 2010-04-02 18:45:11
|
> FWIW, I don't see any new s3tc formats. rgtc will not be handled by s3tc > library since it isn't patent encumbered. util_format_is_s3tc will not > include rgtc formats. > (Though I guess that external decoding per-pixel is really rather lame, > should do it per-block...) Yes the other formats (rgtc and bptc) have no patent claims listed. |
From: Jose F. <jfo...@vm...> - 2010-04-03 00:23:08
|
Both ways are useful: single pixel decompression for texture sampling, whole block for whole image conversions. Jose ________________________________________ From: Roland Scheidegger [sr...@vm...] Sent: Friday, April 02, 2010 17:27 To: Luca Barbieri Cc: Jose Fonseca; mes...@li... Subject: Re: [Mesa3d-dev] How do we init half float tables? On 02.04.2010 17:09, Luca Barbieri wrote: > Additionally, the S3TC library may now support only a subset of the > formats. This may be even more useful as further compressed formats > are added. FWIW, I don't see any new s3tc formats. rgtc will not be handled by s3tc library since it isn't patent encumbered. util_format_is_s3tc will not include rgtc formats. (Though I guess that external decoding per-pixel is really rather lame, should do it per-block...) Roland |
From: Jose F. <jfo...@vm...> - 2010-04-02 23:36:47
|
S3TC was working on my machine and now it is not! How that goes for reliability? I was unsure but now you sure convinced me. I'm removing this constructor magic madness. Please leave the much more reliable system of explicitly calling initializers. Jose ________________________________________ From: luc...@gm... [luc...@gm...] On Behalf Of Luca Barbieri [lu...@lu...] Sent: Friday, April 02, 2010 16:09 To: Jose Fonseca Cc: Brian Paul; mes...@li... Subject: Re: [Mesa3d-dev] How do we init half float tables? On Fri, Apr 2, 2010 at 9:51 AM, Jose Fonseca <jfo...@vm...> wrote: > Reliability goes both ways: using a global constructor simplifies the problem substantially, but it is only as reliable as the global constructor mechanism itself -- which it's not much given the compiler/linker magic necessary to work. Once it is written and works, it should just work indefinitely, since the C++ ABI relies on it. > I also think that we should code generate most constat lookup tables: it avoids the initialization problem, and it saves memory when the SO is loaded in several processes as pages can be shared. Yes, I have done exactly this, for the reasons you describe. The half float tables are now pre-generated by a Python script. > For S3TC it is not a real problem: any code that exposes S3TC support needs to check if S3TC (de)compresion support is available before advertising (xxx_screen.c). Checking the support and initialization should be done by a single function. That is, if the libtxc_dxtn.so is not available the s3tc functions should *never* be called. I changed the format code to add a new "util_format_is_supported" that allows users to query if the util_format code can pack/unpack a given format. That function, if called on an s3tc format, will automatically call the S3TC init function. This way, users of the format helpers no longer need to know about S3TC at all, and it will just work. Even if the util_format_is_supported function is not called before reading/writing an S3TC format, the fetch/pack function pointers are initialized to versions that will autoload the S3TC library. Additionally, the S3TC library may now support only a subset of the formats. This may be even more useful as further compressed formats are added. There are two areas where I'm not totally happy with the approach I did though: 1. Currently upon a successful load of S3TC, we hack the format descriptions to set the is_supported bit to 1. Not sure if it is the best way to do it. 2. If S3TC is not available, the functions just do nothing: this is consistent with other unsupported formats. We may however want to read black pixels, throw an error, or read a special "error" image instead (I'd suggest having "unsupported" unpack/pack functions and pointing all unsupported formats to them) Also, I think we should remove all the Mesa internal format code and make it use util_format instead, unless there is really a good reason to duplicate it all. Same thing for util_half. |
From: Luca B. <lu...@lu...> - 2010-04-02 23:50:14
|
What are you seeing a regression on? texcompress and texcompsub seemed to work for me: I'll try to test something else and recheck the code. |
From: Luca B. <lu...@lu...> - 2010-04-03 00:08:47
|
Sorry for the regression. This whole thing was done to fix the u_gctors.cpp issue, originally done by me, sent out without full testing since I saw duplicate work being done, and then merged by Roland if I recall correctly. I probably should not have fixed s3tc/util_format like it was done for u_half and instead put it in a branch and sent it to the ML first. Note that everything that reads pixels and does not call util_format_s3tc_init (e.g. I think rbug tools) needs something like this, or an explicit call which is likely to be forgotten (even finding out everything that ends up calling util_format is nontrivial). Anyway, this patch fixes a couple of bugs that may have caused the regression. How can I reproduce it locally? The DXTn unit tests do fail, but the values have usually a difference of 1, so I assume it's an approximation error. commit 80214ef6265d406496dc4fd3c76d8ac782cd012b Author: Luca Barbieri <lu...@lu...> Date: Sat Apr 3 01:55:27 2010 +0200 gallium/util: fix inverted if is_nop logic in s3tc diff --git a/src/gallium/auxiliary/util/u_format_s3tc.c b/src/gallium/auxiliary/util/u_format_s3tc.c index d48551f..7808210 100644 --- a/src/gallium/auxiliary/util/u_format_s3tc.c +++ b/src/gallium/auxiliary/util/u_format_s3tc.c @@ -303,7 +303,7 @@ util_format_dxt3_rgba_unpack_8unorm(uint8_t *dst_row, unsigned dst_stride, const void util_format_dxt5_rgba_unpack_8unorm(uint8_t *dst_row, unsigned dst_stride, const uint8_t *src_row, unsigned src_stride, unsigned width, unsigned height) { - if (is_nop(util_format_dxt5_rgba_fetch)) { + if (!is_nop(util_format_dxt5_rgba_fetch)) { unsigned x, y, i, j; for(y = 0; y < height; y += 4) { const uint8_t *src = src_row; @@ -324,7 +324,7 @@ util_format_dxt5_rgba_unpack_8unorm(uint8_t *dst_row, unsigned dst_stride, const void util_format_dxt1_rgb_unpack_float(float *dst_row, unsigned dst_stride, const uint8_t *src_row, unsigned src_stride, unsigned width, unsigned height) { - if (is_nop(util_format_dxt1_rgb_fetch)) { + if (!is_nop(util_format_dxt1_rgb_fetch)) { unsigned x, y, i, j; for(y = 0; y < height; y += 4) { const uint8_t *src = src_row; |
From: Jose F. <jfo...@vm...> - 2010-04-03 00:26:11
|
Probably the problems are just as you describe. But I'll be offline soon so I'll only review this and all your other changes carefully another day. Jose ________________________________________ From: luc...@gm... [luc...@gm...] On Behalf Of Luca Barbieri [lu...@lu...] Sent: Saturday, April 03, 2010 1:08 To: Jose Fonseca Cc: Brian Paul; mes...@li... Subject: Re: [Mesa3d-dev] How do we init half float tables? Sorry for the regression. This whole thing was done to fix the u_gctors.cpp issue, originally done by me, sent out without full testing since I saw duplicate work being done, and then merged by Roland if I recall correctly. I probably should not have fixed s3tc/util_format like it was done for u_half and instead put it in a branch and sent it to the ML first. Note that everything that reads pixels and does not call util_format_s3tc_init (e.g. I think rbug tools) needs something like this, or an explicit call which is likely to be forgotten (even finding out everything that ends up calling util_format is nontrivial). Anyway, this patch fixes a couple of bugs that may have caused the regression. How can I reproduce it locally? The DXTn unit tests do fail, but the values have usually a difference of 1, so I assume it's an approximation error. commit 80214ef6265d406496dc4fd3c76d8ac782cd012b Author: Luca Barbieri <lu...@lu...> Date: Sat Apr 3 01:55:27 2010 +0200 gallium/util: fix inverted if is_nop logic in s3tc diff --git a/src/gallium/auxiliary/util/u_format_s3tc.c b/src/gallium/auxiliary/util/u_format_s3tc.c index d48551f..7808210 100644 --- a/src/gallium/auxiliary/util/u_format_s3tc.c +++ b/src/gallium/auxiliary/util/u_format_s3tc.c @@ -303,7 +303,7 @@ util_format_dxt3_rgba_unpack_8unorm(uint8_t *dst_row, unsigned dst_stride, const void util_format_dxt5_rgba_unpack_8unorm(uint8_t *dst_row, unsigned dst_stride, const uint8_t *src_row, unsigned src_stride, unsigned width, unsigned height) { - if (is_nop(util_format_dxt5_rgba_fetch)) { + if (!is_nop(util_format_dxt5_rgba_fetch)) { unsigned x, y, i, j; for(y = 0; y < height; y += 4) { const uint8_t *src = src_row; @@ -324,7 +324,7 @@ util_format_dxt5_rgba_unpack_8unorm(uint8_t *dst_row, unsigned dst_stride, const void util_format_dxt1_rgb_unpack_float(float *dst_row, unsigned dst_stride, const uint8_t *src_row, unsigned src_stride, unsigned width, unsigned height) { - if (is_nop(util_format_dxt1_rgb_fetch)) { + if (!is_nop(util_format_dxt1_rgb_fetch)) { unsigned x, y, i, j; for(y = 0; y < height; y += 4) { const uint8_t *src = src_row; |
From: Jose F. <jfo...@vm...> - 2010-04-03 00:12:43
|
u_format_test started failing and it was not one day ago. Vinson reported some texture compression tests that just got working with my recent changes started to failing again. I'm not sure if it's the constructor mechanism, my platform (64bit), or some bug in the code. I just reverted all your recent util format changes. Not all look bad but I just don't have the time to separate the baby from the water. Sorry. I'll cherry pick some of them after I have more time to review and test them. One more thing: I'm maintaining the u_format* modules. I'm not speaking the just in the long term, but in the sense I'm actually working on this as we speak. Please do not make this kind of deep reaching changes to the u_format stuff in master without clearing them first with me. Either: - send me an email and buy in my support before implementing - send a patch of the implementation changes so that I can review - implement in a feature branch - or, if you think I'm unreasonable, just make a fork of the whole thing and do whatever you like without breaking the existing code that relies on it. master branch should be broken as little as possible as there is a lot of automated/manual testing going on that depends upon it. And going over and modifying code I just commited hinders my progress. Jose ________________________________________ From: luc...@gm... [luc...@gm...] On Behalf Of Luca Barbieri [lu...@lu...] Sent: Saturday, April 03, 2010 0:50 To: Jose Fonseca Cc: Brian Paul; mes...@li... Subject: Re: [Mesa3d-dev] How do we init half float tables? What are you seeing a regression on? texcompress and texcompsub seemed to work for me: I'll try to test something else and recheck the code. |
From: Luca B. <lu...@lu...> - 2010-04-03 00:23:31
|
> One more thing: I'm maintaining the u_format* modules. I'm not speaking the just in the long term, but in the sense I'm actually working on this as we speak. Please do not make this kind of deep reaching changes to the u_format stuff in master without clearing them first with me. Yes sorry, it was an attempt to fix breakage originally caused by code of mine that was sent out in a non-fully-mergeable state (to prevent duplicate work on half float conversion) and got merged anyway. Since master was already broken (due to u_gctors.cpp not being picked up by ld), it seemed a good idea to try to fix it. Unfortunately what seemed to be an easy fix gradually became something much more invasive than originally envisioned. After realizing the util_format_init thing wouldn't work out, I should have made these call util_format_s3tc_init again (was changed so they would init util_half as well) and then sent the util_foramt changes for review. I added a gallium-util-format-is-supported branch to hold the work and the fix I just sent. Sorry for not doing that in the first place. |
From: Jose F. <jfo...@vm...> - 2010-04-03 00:28:42
|
OK, I can relate with your reasoning. It's no biggie. Jose ________________________________________ From: luc...@gm... [luc...@gm...] On Behalf Of Luca Barbieri [lu...@lu...] Sent: Saturday, April 03, 2010 1:23 To: Jose Fonseca Cc: mes...@li... Subject: Re: [Mesa3d-dev] How do we init half float tables? > One more thing: I'm maintaining the u_format* modules. I'm not speaking the just in the long term, but in the sense I'm actually working on this as we speak. Please do not make this kind of deep reaching changes to the u_format stuff in master without clearing them first with me. Yes sorry, it was an attempt to fix breakage originally caused by code of mine that was sent out in a non-fully-mergeable state (to prevent duplicate work on half float conversion) and got merged anyway. Since master was already broken (due to u_gctors.cpp not being picked up by ld), it seemed a good idea to try to fix it. Unfortunately what seemed to be an easy fix gradually became something much more invasive than originally envisioned. After realizing the util_format_init thing wouldn't work out, I should have made these call util_format_s3tc_init again (was changed so they would init util_half as well) and then sent the util_foramt changes for review. I added a gallium-util-format-is-supported branch to hold the work and the fix I just sent. Sorry for not doing that in the first place. |
From: Luca B. <lu...@lu...> - 2010-04-03 00:48:15
|
The s3tc-teximage test seems fixed by the two line change I put in gallium-util-format-is-supported. s3tc-texsubimage prints: Mesa: User error: GL_INVALID_VALUE in glTexSubImage2D(xoffset+width) Probe at (285,12) Expected: 1.000000 0.000000 0.000000 Observed: 0.000000 0.000000 0.000000 which seems to be due to a Mesa or testcase bug. As for u_format_test.c, it looks like it simply fails to account for DXTn being lossy. |
From: Jose F. <jfo...@vm...> - 2010-04-03 10:15:48
|
Thanks Luca. Concerning u_format_test.c, I'm not sure the problem is lossiness or ambivalence in the format or a bug in the compressor, but there was logic in u_format_test.c to skip the DXT1_RGBA packing -- all other tests were passing. Lossiness by itself doesn't explain the test failure because we're feeding to the compressor the RGBA data that resulted from decompressing. Given that DXT compression works by interpolating colors in a line segment of the RGB color space, when re-feeding the decompressed output to the compressor it should quickly find the line as all colors points lie exactly on it. "Exactly" is a too string word, as there is rounding, which could be in the root of the differences. Jose ________________________________________ From: luc...@gm... [luc...@gm...] On Behalf Of Luca Barbieri [lu...@lu...] Sent: Saturday, April 03, 2010 1:48 To: Jose Fonseca Cc: mes...@li... Subject: Re: [Mesa3d-dev] How do we init half float tables? The s3tc-teximage test seems fixed by the two line change I put in gallium-util-format-is-supported. s3tc-texsubimage prints: Mesa: User error: GL_INVALID_VALUE in glTexSubImage2D(xoffset+width) Probe at (285,12) Expected: 1.000000 0.000000 0.000000 Observed: 0.000000 0.000000 0.000000 which seems to be due to a Mesa or testcase bug. As for u_format_test.c, it looks like it simply fails to account for DXTn being lossy. |
From: Luca B. <lu...@lu...> - 2010-04-03 15:21:06
|
They are not passing for me with current master and a 32-bit system: Here are the failures: Testing util_format_dxt1_rgb_pack_8unorm ... FAILED: f2 d7 90 20 ae 2c 6f 97 obtained f2 d7 b0 20 ae 2c 6f 97 expected Testing util_format_dxt5_rgba_pack_8unorm ... FAILED: f7 10 c5 0c 9a 73 b4 9c f6 8f ab 32 2a 9a 95 5a obtained f8 11 c5 0c 9a 73 b4 9c f6 8f ab 32 2a 9a 95 5a expected Testing util_format_dxt1_rgb_unpack_8unorm ... FAILED: {0x99, 0xb0, 0x8e, 0xff}, {0x5d, 0x62, 0x89, 0xff}, {0x99, 0xb0, 0x8e, 0xff}, {0x99, 0xb0, 0x8e, 0xff}, {0xd6, 0xff, 0x94, 0xff}, {0x5d, 0x62, 0x89, 0xff}, {0x99, 0xb0, 0x8e, 0xff}, {0xd6, 0xff, 0x94, 0xff}, {0x5d, 0x62, 0x89, 0xff}, {0x5d, 0x62, 0x89, 0xff}, {0x99, 0xb0, 0x8e, 0xff}, {0x21, 0x14, 0x84, 0xff}, {0x5d, 0x62, 0x89, 0xff}, {0x21, 0x14, 0x84, 0xff}, {0x21, 0x14, 0x84, 0xff}, {0x99, 0xb0, 0x8e, 0xff} obtained {0x98, 0xaf, 0x8e, 0xff}, {0x5c, 0x62, 0x88, 0xff}, {0x98, 0xaf, 0x8e, 0xff}, {0x98, 0xaf, 0x8e, 0xff}, {0xd6, 0xff, 0x94, 0xff}, {0x5c, 0x62, 0x88, 0xff}, {0x98, 0xaf, 0x8e, 0xff}, {0xd6, 0xff, 0x94, 0xff}, {0x5c, 0x62, 0x88, 0xff}, {0x5c, 0x62, 0x88, 0xff}, {0x98, 0xaf, 0x8e, 0xff}, {0x21, 0x13, 0x84, 0xff}, {0x5c, 0x62, 0x88, 0xff}, {0x21, 0x13, 0x84, 0xff}, {0x21, 0x13, 0x84, 0xff}, {0x98, 0xaf, 0x8e, 0xff} expected Testing util_format_dxt1_rgba_unpack_8unorm ... FAILED: {0x00, 0x00, 0x00, 0x00}, {0x4e, 0xaa, 0x90, 0xff}, {0x4e, 0xaa, 0x90, 0xff}, {0x00, 0x00, 0x00, 0x00}, {0x4e, 0xaa, 0x90, 0xff}, {0x29, 0xff, 0xff, 0xff}, {0x00, 0x00, 0x00, 0x00}, {0x4e, 0xaa, 0x90, 0xff}, {0x73, 0x55, 0x21, 0xff}, {0x00, 0x00, 0x00, 0x00}, {0x00, 0x00, 0x00, 0x00}, {0x4e, 0xaa, 0x90, 0xff}, {0x4e, 0xaa, 0x90, 0xff}, {0x00, 0x00, 0x00, 0x00}, {0x00, 0x00, 0x00, 0x00}, {0x4e, 0xaa, 0x90, 0xff} obtained {0x00, 0x00, 0x00, 0x00}, {0x4e, 0xa9, 0x8f, 0xff}, {0x4e, 0xa9, 0x8f, 0xff}, {0x00, 0x00, 0x00, 0x00}, {0x4e, 0xa9, 0x8f, 0xff}, {0x29, 0xff, 0xff, 0xff}, {0x00, 0x00, 0x00, 0x00}, {0x4e, 0xa9, 0x8f, 0xff}, {0x73, 0x54, 0x21, 0xff}, {0x00, 0x00, 0x00, 0x00}, {0x00, 0x00, 0x00, 0x00}, {0x4e, 0xa9, 0x8f, 0xff}, {0x4e, 0xa9, 0x8f, 0xff}, {0x00, 0x00, 0x00, 0x00}, {0x00, 0x00, 0x00, 0x00}, {0x4e, 0xa9, 0x8f, 0xff} expected Testing util_format_dxt3_rgba_unpack_8unorm ... FAILED: {0x6d, 0xc6, 0x96, 0x77}, {0x6d, 0xc6, 0x96, 0xee}, {0x6d, 0xc6, 0x96, 0xaa}, {0x8c, 0xff, 0xb5, 0x44}, {0x6d, 0xc6, 0x96, 0xff}, {0x6d, 0xc6, 0x96, 0x88}, {0x31, 0x55, 0x5a, 0x66}, {0x6d, 0xc6, 0x96, 0x99}, {0x31, 0x55, 0x5a, 0xbb}, {0x31, 0x55, 0x5a, 0x55}, {0x31, 0x55, 0x5a, 0x11}, {0x6d, 0xc6, 0x96, 0xcc}, {0x6d, 0xc6, 0x96, 0xcc}, {0x6d, 0xc6, 0x96, 0x11}, {0x31, 0x55, 0x5a, 0x44}, {0x31, 0x55, 0x5a, 0x88} obtained {0x6c, 0xc6, 0x96, 0x77}, {0x6c, 0xc6, 0x96, 0xee}, {0x6c, 0xc6, 0x96, 0xa9}, {0x8c, 0xff, 0xb5, 0x43}, {0x6c, 0xc6, 0x96, 0xff}, {0x6c, 0xc6, 0x96, 0x87}, {0x31, 0x54, 0x5a, 0x66}, {0x6c, 0xc6, 0x96, 0x98}, {0x31, 0x54, 0x5a, 0xba}, {0x31, 0x54, 0x5a, 0x54}, {0x31, 0x54, 0x5a, 0x10}, {0x6c, 0xc6, 0x96, 0xcc}, {0x6c, 0xc6, 0x96, 0xcc}, {0x6c, 0xc6, 0x96, 0x10}, {0x31, 0x54, 0x5a, 0x43}, {0x31, 0x54, 0x5a, 0x87} expected Testing util_format_dxt5_rgba_unpack_8unorm ... FAILED: {0x6d, 0xc6, 0x96, 0x74}, {0x6d, 0xc6, 0x96, 0xf8}, {0x6d, 0xc6, 0x96, 0xb6}, {0x8c, 0xff, 0xb5, 0x53}, {0x6d, 0xc6, 0x96, 0xf8}, {0x6d, 0xc6, 0x96, 0x95}, {0x31, 0x55, 0x5a, 0x53}, {0x6d, 0xc6, 0x96, 0x95}, {0x31, 0x55, 0x5a, 0xb6}, {0x31, 0x55, 0x5a, 0x53}, {0x31, 0x55, 0x5a, 0x11}, {0x6d, 0xc6, 0x96, 0xd7}, {0x6d, 0xc6, 0x96, 0xb6}, {0x6d, 0xc6, 0x96, 0x11}, {0x31, 0x55, 0x5a, 0x32}, {0x31, 0x55, 0x5a, 0x95} obtained {0x6c, 0xc6, 0x96, 0x73}, {0x6c, 0xc6, 0x96, 0xf7}, {0x6c, 0xc6, 0x96, 0xb6}, {0x8c, 0xff, 0xb5, 0x53}, {0x6c, 0xc6, 0x96, 0xf7}, {0x6c, 0xc6, 0x96, 0x95}, {0x31, 0x54, 0x5a, 0x53}, {0x6c, 0xc6, 0x96, 0x95}, {0x31, 0x54, 0x5a, 0xb6}, {0x31, 0x54, 0x5a, 0x53}, {0x31, 0x54, 0x5a, 0x10}, {0x6c, 0xc6, 0x96, 0xd7}, {0x6c, 0xc6, 0x96, 0xb6}, {0x6c, 0xc6, 0x96, 0x10}, {0x31, 0x54, 0x5a, 0x31}, {0x31, 0x54, 0x5a, 0x95} expected Compiling libtxc_dxtn with -O0 or with -march=core2 -msse2 -mfpmath=sse did not make them work. As you can see the tests seem mostly off-by-one, which makes me think of an approximation problem. libtxc_dxtn seems to take 8-bit input instead of floating point input, so and it seems to be inherently hard to get it to roundtrip sensibly. Since only integer-coordinate points can be used, they are unlikely to be exactly on a line unless specifically crafted to be so. Thus, a possible solution could be to actually pick a starting color, pick an increment, and generate an exact line by adding multiples of that increment to the starting color. |
From: Luca B. <lu...@lu...> - 2010-04-03 15:31:53
|
For instance, the DXT1 test is wrong. The red values used are: 33 93 153 214 99 - 33 = 60 153 - 93 = 60 214 - 153 = 61 213 should be used instead (i.e. 0xd5 instead 0xd6) |