lcms-user Mailing List for Little cms color engine (Page 13)
An ICC-based CMM for color management
From: Robin W. <rob...@ar...> - 2017-08-01 17:06:32

On 01/08/2017 16:47, Bob Friesenhahn wrote:
> The package name and version used in configure.ac is identical to lcms2 2.8,
> and the shared library name would be the same (header file names have
> changed). It appears that if someone were to install LCMS2.art and lcms2
> simultaneously there would be a conflict. It would be wise and very helpful
> if it is possible to install stock lcms2 and LCMS2.art into the same
> installation tree (e.g. different base shared library names and proper
> independent library versioning) so that applications can use either one
> without conflict.

Oops, thanks. We don't use the lcms2 autoconf in MuPDF, so I'd missed that.
How about what's on the branch now?

Thanks,

Robin
From: Bob F. <bfr...@si...> - 2017-08-01 15:47:38

The package name and version used in configure.ac is identical to lcms2 2.8,
and the shared library name would be the same (header file names have
changed). It appears that if someone were to install LCMS2.art and lcms2
simultaneously there would be a conflict. It would be wise and very helpful
if it is possible to install stock lcms2 and LCMS2.art into the same
installation tree (e.g. different base shared library names and proper
independent library versioning) so that applications can use either one
without conflict.

Bob
--
Bob Friesenhahn
bfr...@si..., http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
From: Noel C. <NCa...@Pr...> - 2017-08-01 14:10:08

Hi Tobias,

For what it's worth, since I'm already set up to do testing, I just altered
testcms2.c to use the following definition for a Float XYZA->XYZA test step,
modeled after the Float XYZ->XYZ test step, including the cmsFLAGS_COPY_ALPHA
flag to cause the Alpha value to be passed through, and it passed with flying
colors.

    // XYZA to XYZA with preserve extra channels set
    input = IdentityMatrixProfile(cmsSigXYZData);

    #define TYPE_XYZA_FLT (FLOAT_SH(1)|COLORSPACE_SH(PT_XYZ)|EXTRA_SH(1)|CHANNELS_SH(3)|BYTES_SH(4))

    xform = cmsCreateTransform(input, TYPE_XYZA_FLT, xyzProfile, TYPE_XYZA_FLT,
                               INTENT_RELATIVE_COLORIMETRIC, cmsFLAGS_COPY_ALPHA);
    cmsCloseProfile(input);

    cmsDoTransform(xform, in, out, 1);
    cmsDeleteTransform(xform);

    if (!IsGoodVal("Float XYZA->XYZA", in[0], out[0], FLOAT_PRECISSION) ||
        !IsGoodVal("Float XYZA->XYZA", in[1], out[1], FLOAT_PRECISSION) ||
        !IsGoodVal("Float XYZA->XYZA", in[2], out[2], FLOAT_PRECISSION) ||
        !IsGoodVal("Float XYZA->XYZA", in[3], out[3], FLOAT_PRECISSION))
        return 0;

This suggests that what you need works just fine. However, I don't know why
Marti might have removed that definition from the list. Perhaps back then,
when there was no ability to copy alpha values through (cmsFLAGS_COPY_ALPHA is
a recent addition), it was deemed unlikely to be needed. Nothing keeps you
from defining it yourself, as I did above.

-Noel

-----Original Message-----
From: Tobias Ellinghaus [mailto:me...@ho...]
Sent: Tue, August 1, 2017 8:55 AM
To: lcm...@li...
Subject: Re: [Lcms-user] Is TYPE_XYZA_FLT safe to use?

Sorry to bump this, but since the list came back to life recently I hope that
someone is able to answer this.

On Tuesday, 30 May 2017 at 12:16:48 CEST, Tobias Ellinghaus wrote:
> Hi,
>
> just a quick question: I want to use TYPE_XYZA_FLT in my code, which is
> still mentioned in the recent API documentation. However, it seems that it
> was removed from lcms2 in 2.4 rc2 (d00163e17de9399a77138d035104ff1786d89d1d).
>
> Is there a reason not to use it? For the time being I can just #define it
> myself the way it used to be done before, but maybe I am missing something?
>
> Tobias
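
A note for readers following the thread: below is a minimal, self-contained
sketch of how an application could define TYPE_XYZA_FLT locally (as Noel
suggests) and pass alpha through with cmsFLAGS_COPY_ALPHA. It is illustrative
only, not code from the testbed or from lcms2 itself; the XYZ identity
profiles come from the stock cmsCreateXYZProfile() call, the pixel values are
arbitrary, and error handling is omitted.

    /* Hedged sketch: define TYPE_XYZA_FLT yourself (the way it used to be
       defined) and run a float XYZ+alpha pixel through an identity transform,
       copying the alpha channel with cmsFLAGS_COPY_ALPHA (lcms2 >= 2.8). */
    #include <stdio.h>
    #include "lcms2.h"

    #ifndef TYPE_XYZA_FLT
    #define TYPE_XYZA_FLT (FLOAT_SH(1)|COLORSPACE_SH(PT_XYZ)|EXTRA_SH(1)|CHANNELS_SH(3)|BYTES_SH(4))
    #endif

    int main(void)
    {
        cmsHPROFILE in  = cmsCreateXYZProfile();
        cmsHPROFILE out = cmsCreateXYZProfile();
        cmsHTRANSFORM xform = cmsCreateTransform(in, TYPE_XYZA_FLT,
                                                 out, TYPE_XYZA_FLT,
                                                 INTENT_RELATIVE_COLORIMETRIC,
                                                 cmsFLAGS_COPY_ALPHA);
        float pixel[4]  = { 0.9642f, 1.0f, 0.8249f, 0.5f };  /* X, Y, Z, alpha */
        float result[4] = { 0 };

        cmsCloseProfile(in);
        cmsCloseProfile(out);

        cmsDoTransform(xform, pixel, result, 1);              /* one pixel */
        printf("%f %f %f alpha=%f\n", result[0], result[1], result[2], result[3]);

        cmsDeleteTransform(xform);
        return 0;
    }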
From: Tobias E. <me...@ho...> - 2017-08-01 13:21:38

Sorry to bump this, but since the list came back to life recently I hope that
someone is able to answer this.

On Tuesday, 30 May 2017 at 12:16:48 CEST, Tobias Ellinghaus wrote:
> Hi,
>
> just a quick question: I want to use TYPE_XYZA_FLT in my code, which is
> still mentioned in the recent API documentation. However, it seems that it
> was removed from lcms2 in 2.4 rc2 (d00163e17de9399a77138d035104ff1786d89d1d).
>
> Is there a reason not to use it? For the time being I can just #define it
> myself the way it used to be done before, but maybe I am missing something?
>
> Tobias
From: Noel C. <NCa...@Pr...> - 2017-08-01 12:02:44

As you may have guessed from my recent list eMails, I've been looking at
Little CMS performance over the past few days. I have improved Little CMS
performance somewhat overall through changes to the C source code.

This expands on the warning reduction changes I worked up before, so Marti,
if you haven't reviewed that code yet, you can put it aside and just check
the changes with this code base against your Git repository:

http://Noel.ProDigitalSoftware.com/temp/LittleCMSOptimizations20170801.zip

Just one file has changed since my last submission: cmsintrp.c

Measured performance, via the testbed, as built with Visual Studio 2017's C++
compiler:

The code built from the above zip file, 64 bit:

16 bits on CLUT profiles : 38.7409 MPixel/sec.
8 bits on CLUT profiles : 32.9897 MPixel/sec.
8 bits on Matrix-Shaper profiles : 66.6667 MPixel/sec.
8 bits on SAME Matrix-Shaper profiles : 121.212 MPixel/sec.
8 bits on Matrix-Shaper profiles (AbsCol) : 66.6667 MPixel/sec.
16 bits on Matrix-Shaper profiles : 38.7409 MPixel/sec.
16 bits on SAME Matrix-Shaper profiles : 137.931 MPixel/sec.
16 bits on Matrix-Shaper profiles (AbsCol) : 38.7409 MPixel/sec.
8 bits on curves : 89.3855 MPixel/sec.
16 bits on curves : 94.1176 MPixel/sec.
8 bits on CMYK profiles : 14.5455 MPixel/sec.
16 bits on CMYK profiles : 14.5985 MPixel/sec.
8 bits on gray-to gray : 124.031 MPixel/sec.
8 bits on gray-to-lab gray : 125 MPixel/sec.
8 bits on SAME gray-to-gray : 124.031 MPixel/sec.

Git trunk as of 7/28, 64 bit:

16 bits on CLUT profiles : 34.4086 MPixel/sec.
8 bits on CLUT profiles : 32.2581 MPixel/sec.
8 bits on Matrix-Shaper profiles : 66.6667 MPixel/sec.
8 bits on SAME Matrix-Shaper profiles : 117.647 MPixel/sec.
8 bits on Matrix-Shaper profiles (AbsCol) : 66.1157 MPixel/sec.
16 bits on Matrix-Shaper profiles : 34.3348 MPixel/sec.
16 bits on SAME Matrix-Shaper profiles : 137.931 MPixel/sec.
16 bits on Matrix-Shaper profiles (AbsCol) : 34.4086 MPixel/sec.
8 bits on curves : 88.8889 MPixel/sec.
16 bits on curves : 94.1176 MPixel/sec.
8 bits on CMYK profiles : 11.9048 MPixel/sec.
16 bits on CMYK profiles : 11.9581 MPixel/sec.
8 bits on gray-to gray : 103.896 MPixel/sec.
8 bits on gray-to-lab gray : 105.263 MPixel/sec.
8 bits on SAME gray-to-gray : 104.575 MPixel/sec.

Release 2.8, 64 bit:

16 bits on CLUT profiles : 34.4828 MPixel/sec.
8 bits on CLUT profiles : 32.3887 MPixel/sec.
8 bits on Matrix-Shaper profiles : 66.6667 MPixel/sec.
8 bits on SAME Matrix-Shaper profiles : 121.212 MPixel/sec.
8 bits on Matrix-Shaper profiles (AbsCol) : 66.6667 MPixel/sec.
16 bits on Matrix-Shaper profiles : 34.4828 MPixel/sec.
16 bits on SAME Matrix-Shaper profiles : 126.984 MPixel/sec.
16 bits on Matrix-Shaper profiles (AbsCol) : 34.4828 MPixel/sec.
8 bits on curves : 88.8889 MPixel/sec.
16 bits on curves : 94.1176 MPixel/sec.
8 bits on CMYK profiles : 11.9671 MPixel/sec.
16 bits on CMYK profiles : 12.012 MPixel/sec.
8 bits on gray-to gray : 105.263 MPixel/sec.
8 bits on gray-to-lab gray : 105.263 MPixel/sec.
8 bits on SAME gray-to-gray : 104.575 MPixel/sec.

The code built from the above zip file, 32 bit:

16 bits on CLUT profiles : 28.2686 MPixel/sec.
8 bits on CLUT profiles : 24.6914 MPixel/sec.
8 bits on Matrix-Shaper profiles : 51.7799 MPixel/sec.
8 bits on SAME Matrix-Shaper profiles : 78.4314 MPixel/sec.
8 bits on Matrix-Shaper profiles (AbsCol) : 51.6129 MPixel/sec.
16 bits on Matrix-Shaper profiles : 28.2686 MPixel/sec.
16 bits on SAME Matrix-Shaper profiles : 94.6746 MPixel/sec.
16 bits on Matrix-Shaper profiles (AbsCol) : 28.2686 MPixel/sec.
8 bits on curves : 63.745 MPixel/sec.
16 bits on curves : 71.7489 MPixel/sec.
8 bits on CMYK profiles : 11.3395 MPixel/sec.
16 bits on CMYK profiles : 11.4204 MPixel/sec.
8 bits on gray-to gray : 83.7696 MPixel/sec.
8 bits on gray-to-lab gray : 84.2105 MPixel/sec.
8 bits on SAME gray-to-gray : 84.2105 MPixel/sec.

Git trunk as of 7/28, 32 bit:

16 bits on CLUT profiles : 27.1186 MPixel/sec.
8 bits on CLUT profiles : 24.2424 MPixel/sec.
8 bits on Matrix-Shaper profiles : 51.7799 MPixel/sec.
8 bits on SAME Matrix-Shaper profiles : 78.4314 MPixel/sec.
8 bits on Matrix-Shaper profiles (AbsCol) : 51.7799 MPixel/sec.
16 bits on Matrix-Shaper profiles : 27.1186 MPixel/sec.
16 bits on SAME Matrix-Shaper profiles : 94.6746 MPixel/sec.
16 bits on Matrix-Shaper profiles (AbsCol) : 27.1186 MPixel/sec.
8 bits on curves : 64 MPixel/sec.
16 bits on curves : 71.7489 MPixel/sec.
8 bits on CMYK profiles : 10.7889 MPixel/sec.
16 bits on CMYK profiles : 10.7817 MPixel/sec.
8 bits on gray-to gray : 79.2079 MPixel/sec.
8 bits on gray-to-lab gray : 79.2079 MPixel/sec.
8 bits on SAME gray-to-gray : 73.0594 MPixel/sec.

Release 2.8, 32 bit:

16 bits on CLUT profiles : 27.1186 MPixel/sec.
8 bits on CLUT profiles : 24.5776 MPixel/sec.
8 bits on Matrix-Shaper profiles : 51.7799 MPixel/sec.
8 bits on SAME Matrix-Shaper profiles : 75.4717 MPixel/sec.
8 bits on Matrix-Shaper profiles (AbsCol) : 51.7799 MPixel/sec.
16 bits on Matrix-Shaper profiles : 27.1647 MPixel/sec.
16 bits on SAME Matrix-Shaper profiles : 94.6746 MPixel/sec.
16 bits on Matrix-Shaper profiles (AbsCol) : 27.1647 MPixel/sec.
8 bits on curves : 56.5371 MPixel/sec.
16 bits on curves : 63.4921 MPixel/sec.
8 bits on CMYK profiles : 10.7239 MPixel/sec.
16 bits on CMYK profiles : 10.8181 MPixel/sec.
8 bits on gray-to gray : 83.3333 MPixel/sec.
8 bits on gray-to-lab gray : 82.9016 MPixel/sec.
8 bits on SAME gray-to-gray : 82.9016 MPixel/sec.
From: Robin W. <rob...@ar...> - 2017-08-01 11:26:07

Gents,

Having watched the conversations that have gone on here in the past couple of
days about LCMS2 and possible things that people would like to see happen
with it, it seems a sensible time to break cover with an announcement.

Artifex Software are about to fork LCMS2. I'll go into the reasons for this
now, and say a bit about why you might care, and why you might not.

Artifex Software has used LCMS2 in Ghostscript for a while. It's served us
very well, and we are very happy with it. Marti worked with us to make some
changes when we incorporated it to ensure it could work with Ghostscript's
way of handling multiple threads. This work has been in mainline LCMS2 for
several releases.

Accordingly, when we were looking for a color management engine to use in
MuPDF (another of our projects), we wanted to use LCMS2 too. Unfortunately,
the way LCMS2 is currently set up to handle multi-threading is not quite
general enough. (Technically, Ghostscript creates profiles/links etc. from
multiple different threads, but only uses a given profile/link in the single
thread in which it is created. Objects are never shared between threads.
MuPDF has a more general approach to such things, and objects are frequently
created in one thread and used in multiple ones, often at the same time.)
There is no way to fix this in LCMS2 without breaking the API.

We went ahead and changed the API in our own copy, and found that what we
ended up with was:

 * a very small conceptual change (basically, all the functions are the same
   as the 'cms...THR' functions were before, in that every function that
   might allocate memory takes a ContextID);
 * a smaller API overall (we no longer have the cms...THR variants of the API
   functions);
 * a cleaner implementation (IMHO) (we no longer ever store a ContextID in an
   LCMS2 structure);
 * one extra change: in lcms2, the number of 'extra' channels available to a
   transform is limited to 7. We've tweaked the order of the format specifier
   bits to lift that to at least 63 (with room to increase it further in
   future).

We offered these changes back to Marti, but he is (understandably, and
correctly) reluctant to break the API. One possibility would be to have a new
major LCMS release (LCMS3?) in which the new API became standard, but Marti
has plans of his own, and it's not clear how this fits in.

In the meantime, the release date for the new color-management capable
version of MuPDF is looming. Because MuPDF is open source, this means we
effectively need to do a release of our modified LCMS2. In order to do this
in the least intrusive way possible, we've decided to call our release
LCMS2.art. The plan is that every time Marti does a release of LCMS2, we'll
do a parallel release with our changes in. We've renamed the include files to
be lcms2art.h, so you can't inadvertently mismatch the two versions. We've
tweaked the version define and the check for it, so that if you try to link
an lcms2art with a program that expects lcms2 (or vice versa), you'll get an
error. Our intent is to keep as close to mainline as we can, so we can share
fixes both ways, and hope that we can rejoin the two forks at some point in
future (on the next major release, hopefully).

So, why should anyone other than us care? Well, if you're wanting to use
LCMS2 in a threaded environment, then it might be worth considering using our
fork rather than mainline LCMS2. Otherwise, you probably want to stick with
mainline. It shouldn't take more than an hour or so to move from mainline
LCMS2 to our tweaked API (or vice versa), as they really are very close.

The latest version of our code can be found at:

  git.ghostscript.com/thirdparty-lcms2.git

in the lcms2-art branch. Or via the web at:

  http://git.ghostscript.com/?p=thirdparty-lcms2.git;h=lcms2-art

Robin
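
For readers who have not used the threaded variants Robin mentions: in
mainline lcms2 the per-context path goes through the cms...THR functions,
which take a cmsContext created with cmsCreateContext(). The sketch below
shows that stock lcms2 pattern only; it is not LCMS2.art code, and the
profile and format choices are purely illustrative.

    /* Minimal sketch of the stock lcms2 'THR' pattern: every allocating call
       is given an explicit cmsContext, so nothing relies on global state.
       (In the LCMS2.art fork described above, the plain functions take the
       ContextID directly and the ...THR variants go away.) */
    #include "lcms2.h"

    void transform_with_context(const cmsUInt8Number *in, cmsUInt8Number *out,
                                cmsUInt32Number npixels)
    {
        cmsContext ctx = cmsCreateContext(NULL, NULL);  /* no plug-ins, no user data */

        cmsHPROFILE srgb = cmsCreate_sRGBProfileTHR(ctx);
        cmsHPROFILE lab  = cmsCreateLab4ProfileTHR(ctx, NULL);   /* D50 Lab */

        cmsHTRANSFORM xform = cmsCreateTransformTHR(ctx,
                                                    srgb, TYPE_RGB_8,
                                                    lab,  TYPE_Lab_8,
                                                    INTENT_PERCEPTUAL, 0);
        cmsCloseProfile(srgb);
        cmsCloseProfile(lab);

        cmsDoTransform(xform, in, out, npixels);

        cmsDeleteTransform(xform);
        cmsDeleteContext(ctx);
    }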
From: Graeme G. <gr...@ar...> - 2017-07-31 23:13:24

Noel Carboni wrote:
> By the way, as an exercise to reinforce the above, I re-coded the
> LittleCMS floating point trilinear interpolation algorithm using SSE2
> intrinsics. It ended up delivering the same performance as the C-coded
> version. Why not better? Because the table-based design of the Little
> CMS library doesn't suit parallel calculations so there were only limited
> things I could do.

Simplex interpolation is generally faster since it touches fewer node points -
something that increases in importance with higher input dimensions - but
simplex isn't terribly parallelizable, since it involves a sort. Once the
weighting of each node is known using simplex or multi-linear, parallelizing
the output dimension calculations is a good speedup though.

[ How much of a win vector CPU instructions would be is not something I've
ever had time to explore in my color engine, and I've been content to stick
to portable C code, while wringing what I can out of it. Exploiting GPU
texture lookup hardware seems far simpler to code for, for maximum overall
speed. ]

Cheers,

Graeme Gill.
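
To make Graeme's point concrete, here is a small illustrative sketch (not
lcms2 code) of tetrahedral (simplex) interpolation for one output channel of
a 3-D LUT: the comparison of the fractional coordinates is the "sort", and
each case reads only four of the eight surrounding grid nodes.

    /* Illustrative tetrahedral interpolation for a single output channel.
       fx, fy, fz are the fractional positions inside the grid cell (0..1);
       c000..c111 are the node values at the cell's corners. The ordering
       comparisons select one of six tetrahedra; each branch touches only
       4 nodes, versus 8 for trilinear interpolation. */
    static float tetra_interp(float fx, float fy, float fz,
                              float c000, float c100, float c010, float c001,
                              float c110, float c101, float c011, float c111)
    {
        if (fx >= fy) {
            if (fy >= fz)        /* fx >= fy >= fz */
                return c000 + fx*(c100-c000) + fy*(c110-c100) + fz*(c111-c110);
            else if (fx >= fz)   /* fx >= fz >  fy */
                return c000 + fx*(c100-c000) + fz*(c101-c100) + fy*(c111-c101);
            else                 /* fz >  fx >= fy */
                return c000 + fz*(c001-c000) + fx*(c101-c001) + fy*(c111-c101);
        } else {
            if (fz >= fy)        /* fz >= fy >  fx */
                return c000 + fz*(c001-c000) + fy*(c011-c001) + fx*(c111-c011);
            else if (fz >= fx)   /* fy >  fz >= fx */
                return c000 + fy*(c010-c000) + fz*(c011-c010) + fx*(c111-c011);
            else                 /* fy >  fx >  fz */
                return c000 + fy*(c010-c000) + fx*(c110-c010) + fz*(c111-c110);
        }
    }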
From: Noel C. <NCa...@Pr...> - 2017-07-31 21:13:18

> The ideal situation is if the C code is written in such a way that
> modern optimizing compilers do the right thing by default and produce
> good code for any CPU. This should mean that the compilers automatically
> produce SSE code where they should if it is enabled.

Yes, a good thought. Unfortunately the compilers are not NEARLY there yet
with regard to using SSE instructions in the best way possible. And I'm not
sure they're really going to get there... It would be difficult for C/C++,
where the natural value is a single int, without very specific design
considerations in the source code, to take good advantage of the things SSE
has to offer, which is primarily to carry multiple data items per register
and to parallelize calculations on that data.

The challenge is to design an application to take advantage of being able to
do multiple calculations at once, and to carry related chunks of data in a
big (e.g., __m128) register. Pixel manipulations CAN map well into this sort
of thing... My Photoshop plug-ins, for example, now use SSE2 throughout and
everything's stored chunky and in floating point (e.g., we put one RGBA pixel
in an __m128). The floating point gives advantages in terms of
overflow/underflow/loss-of-precision protection, and the parallel processing
offsets the disadvantages of the increased memory bandwidth utilization for
the longer values. But of course the source code is less maintainable and
less portable because of the SSE usage - it's less like C and more like
embedded assembly (we use the Intel intrinsics such as _mm_mul_ps). We went
into this with our eyes open and I'm glad we made the decisions we did.

The interesting thing is that while we were refactoring our plug-ins, the use
of SSE really didn't pay off in performance until we had the code embracing
the concepts throughout. Everything got faster all at once near the end of
the project. I can say from that experience that the one thing you absolutely
DON'T want is to have HALF an SSE implementation... Getting things into and
out of XMM registers (i.e., during conversions) is inefficient. The
application has to "think overall" in parallel and use a format (e.g.,
floating point) throughout that matches the SSE capabilities to work well.

By the way, as an exercise to reinforce the above, I re-coded the LittleCMS
floating point trilinear interpolation algorithm using SSE2 intrinsics. It
ended up delivering the same performance as the C-coded version. Why not
better? Because the table-based design of the Little CMS library doesn't suit
parallel calculations, so there were only limited things I could do.

Let me be clear, I'm not suggesting redesigning Little CMS soup to nuts. Just
throwing out a few thoughts and ideas. :)

Regarding Marti's comment:

> I think optimizations have to be done by arranging C code to help
> compiler

I agree, and I'm finding by doing so and testing the results that there is
still some additional performance to be had from rearranging the code to do
things like put less broad requirements on the compiler to keep intermediate
data for a long sequence of instructions (e.g., to reduce register
starvation).

By the way, the performance appeared to have dropped a fair bit between
release 2.8 and what I downloaded from Git just the other day. I think I've
got it all back and a little more at this point.

-Noel
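
As an illustration of the "one RGBA pixel per __m128" approach Noel
describes, here is a small hedged sketch using the standard SSE2 intrinsics
(_mm_loadu_ps, _mm_mul_ps, _mm_storeu_ps). It is not code from Noel's
plug-ins or from Little CMS; the per-channel gain operation is just a
stand-in for a real color operation.

    /* Sketch: scale interleaved float RGBA pixels by per-channel gains,
       processing one whole pixel per 128-bit register. Assumes 4 floats per
       pixel; alignment is not required (unaligned loads/stores). */
    #include <emmintrin.h>   /* SSE2 */

    static void scale_rgba_sse2(const float *in, float *out,
                                int npixels, const float gain[4])
    {
        const __m128 g = _mm_loadu_ps(gain);      /* R, G, B, A gains */
        for (int i = 0; i < npixels; i++) {
            __m128 px = _mm_loadu_ps(in + 4*i);   /* one RGBA pixel */
            _mm_storeu_ps(out + 4*i, _mm_mul_ps(px, g));
        }
    }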
From: Marti M. <mar...@li...> - 2017-07-31 20:02:13

A long time ago, the code had some inline assembly. I removed every trace of
assembly about 15 years ago and adopted the requirement of pure C99 code
forever. It was a good idea: it worked great and survived aging. I think
optimizations have to be done by arranging C code to help the compiler;
assembly is all but helpful in those cases.

Regards,

Marti

On Jul 31, 2017 21:45, Bob Friesenhahn <bfr...@si...> wrote:
> On Sun, 30 Jul 2017, Lorenzo Ridolfi wrote:
>
> > There's a lot of Vector Libraries that wrap the usage of SSE
> > instructions. I used one of those libraries in a project a long time
> > ago and I could not find it. But the performance improvement was
> > great.
>
> As a lcms user, I would definitely prefer if lcms has no external
> dependencies.
>
> The ideal situation is if the C code is written in such a way that
> modern optimizing compilers do the right thing by default and produce
> good code for any CPU. This should mean that the compilers
> automatically produce SSE code where they should if it is enabled.
>
> Bob
> --
> Bob Friesenhahn
> bfr...@si..., http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
From: Lorenzo R. <lo...@ma...> - 2017-07-31 19:56:02

Hi Bob,

I totally agree with you. Any architecture-dependent code should always be
optional. The vector library I used came in two versions, a generic one and
an SSE version. The usage of the SSE version, IMHO, must be explicitly
indicated by a flag during the build.

Best Regards,

Lorenzo

> On 31 Jul 2017, at 16:45, Bob Friesenhahn <bfr...@si...> wrote:
>
> On Sun, 30 Jul 2017, Lorenzo Ridolfi wrote:
>
>> There's a lot of Vector Libraries that wrap the usage of SSE
>> instructions. I used one of those libraries in a project a long time
>> ago and I could not find it. But the performance improvement was great.
>
> As a lcms user, I would definitely prefer if lcms has no external
> dependencies.
>
> The ideal situation is if the C code is written in such a way that modern
> optimizing compilers do the right thing by default and produce good code
> for any CPU. This should mean that the compilers automatically produce SSE
> code where they should if it is enabled.
>
> Bob
> --
> Bob Friesenhahn
> bfr...@si..., http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
From: Bob F. <bfr...@si...> - 2017-07-31 19:45:13

On Sun, 30 Jul 2017, Lorenzo Ridolfi wrote:

> There's a lot of Vector Libraries that wrap the usage of SSE
> instructions. I used one of those libraries in a project a long time
> ago and I could not find it. But the performance improvement was
> great.

As a lcms user, I would definitely prefer if lcms has no external
dependencies.

The ideal situation is if the C code is written in such a way that modern
optimizing compilers do the right thing by default and produce good code for
any CPU. This should mean that the compilers automatically produce SSE code
where they should if it is enabled.

Bob
--
Bob Friesenhahn
bfr...@si..., http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
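
A hedged illustration of what Bob is asking for: plain portable C arranged so
an optimizing compiler can vectorize it on its own, with no intrinsics and no
external library. The restrict qualifiers and the simple counted loop are the
hints; the function itself is only a stand-in, not lcms2 code.

    /* Stand-in example: written so GCC/Clang/MSVC can auto-vectorize it
       (e.g. gcc -O3, where -ftree-vectorize is enabled). 'restrict' promises
       the buffers do not alias, and the flat counted loop has no
       data-dependent branches. */
    #include <stddef.h>

    void apply_gain(float *restrict out, const float *restrict in,
                    size_t n, float gain)
    {
        for (size_t i = 0; i < n; i++)
            out[i] = in[i] * gain;     /* maps directly onto SSE/AVX lanes */
    }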
From: Martí M. <mar...@li...> - 2017-07-31 07:47:07

Thanks! Works great.

Regards,

Marti

On 7/26/2017 7:05 PM, Aaron Boxer wrote:
> Hello Marti,
>
> I noticed you don't have a .gitignore file in your repo.
> I've attached one that may be useful for the project:
> it filters out binaries and other artifacts produced while building.
>
> Cheers,
> Aaron
From: Martí M. <mar...@li...> - 2017-07-31 07:26:58

Thanks Noel, I'm reviewing the changes and will return to you in a few days.

Regards,

Marti

On 7/28/2017 11:13 PM, Noel Carboni wrote:
>
> Hi Marti,
>
> I've completed my changes to the LittleCMS sources. I've reviewed and
> tested them with the testbed.
>
> http://Noel.ProDigitalSoftware.com/temp/ProposedLittleCMSChanges.zip
>
> With this set of files, derived from the Git trunk as of July 26, you
> can now use the C++ compiler and higher warning levels.
>
> I followed this philosophy:
>
> 1. Don't alter the external interface (lcms2.h)
> 2. Make the use of signed and unsigned types consistent to reduce the
>    hundreds of warnings emitted when -Wall is used with the pickier C++
>    compiler.
> 3. Avoid casts if possible.
> 4. Review all changes and pass all tests in the testbed project.
> 5. Maintain or improve performance.
>
> If you choose to enable the Visual Studio 2017 C++ compiler and the
> highest warning level for your ongoing work, the following may be of
> interest:
>
> You may want to specifically disable several of the specific overly
> pedantic -Wall warnings by putting the following additional options on
> the VS 2017 C++ compiler command line:
>
>   /wd4711 /wd4820 /wd4061 /wd4774 /wd4710 /wd4668 /wd4738
>
> The above options specifically quiet the following warnings, which may
> not be helpful to you in ongoing development:
>
>   warning C4711: function 'xxxxxxxxxxxxxx' selected for automatic
>                  inline expansion
>   warning C4820: 'xxxxxxxx': 'n' bytes padding added after data member
>   warning C4061: enumerator 'xxxxxxxxxxxxx' in switch of enum
>                  'yyyyyyyyyyyyy' is not explicitly handled by a case label
>   warning C4774: '_snprintf' : format string expected in argument 3
>                  is not a string literal
>   warning C4710: 'int _snprintf(char *const ,const ::size_t,const
>                  char *const ,...)': function not inlined
>   warning C4668: '_M_HYBRID_X86_ARM64' is not defined as a
>                  preprocessor macro, replacing with '0' for '#if/#elif'
>   warning C4738: storing 32-bit float result in memory, possible loss
>                  of performance
>
> After suppressing the above, there is a remaining "unsafe conversion"
> warning emitted for cmsxform.c that may indicate a possible real
> problem, especially in light of compiling this library for different
> systems with different compilers. This is no different than in the
> code I started with.
>
> Besides the code compiling cleaner, the testbed project also shows
> performance changes for the better in my testing.
>
> -Noel
From: Lorenzo R. <lo...@ma...> - 2017-07-31 01:19:41

Hi Noel,

There's a lot of Vector Libraries that wrap the usage of SSE instructions. I
used one of those libraries in a project a long time ago and I could not find
it. But the performance improvement was great.

Here's an example I found:

http://fastcpp.blogspot.com.br/2011/12/simple-vector3-class-with-sse-support.html

Best Regards,

Lorenzo

> On 30 Jul 2017, at 12:05, Noel Carboni <NCa...@Pr...> wrote:
>
> Hi Lorenzo,
>
>> Did you try Intel C++ Compiler? (It's free for open source projects on
>> Linux). In some programs I got 2:1 performance improvements.
>
> No, I haven't run that one. At the moment we have really only one practical
> choice here: the Microsoft Visual Studio 2017 C++ compiler for Windows,
> though we may be looking into alternatives in the future. I have heard good
> things about Intel's compiler elsewhere as well. Thanks for the data point.
>
> For what it's worth, looking over Microsoft compiler-generated code, with
> some of the complex routines like multi-input/output interpolation the
> compiler is starved for registers and has to resort to storing intermediate
> compute products in RAM. Just small things like changing the way loops are
> managed to free up a register here and there make a noticeable difference
> in throughput, especially with e.g. the 32 bit floating point routines,
> where the channel data is 4 bytes each and the process is already quite
> RAM-bound when multi-threaded.
>
> It's possible using SSE instructions to facilitate things like doing 4
> calculations simultaneously could speed things up further. I'm looking into
> that now.
>
> -Noel
From: Greg T. <gd...@le...> - 2017-07-31 01:12:25

"Noel Carboni" <NCa...@Pr...> writes:

> I'm just curious: Given that PCs and Macs are based on Intel chipsets
> nowadays...
>
> Do we have a feel for how much Little CMS is being used on other
> processor architectures?

I don't know, but I would guess that at least ARM is an important target.

Also, if you're looking for code that generates warnings, I would also
recommend building with clang, in addition to gcc and whatever MS compiler
you have. It seems like each new compiler results in more warnings.
From: Boudewijn R. <bo...@va...> - 2017-07-30 16:05:36

On Sun, 30 Jul 2017, Noel Carboni wrote:

> Source rearrangement notwithstanding, if one were to create routines
> that would make use of the vector instructions virtually every Intel
> system already has (e.g., SSE2) the results could be markedly better
> still. I've been through converting all my own software to use vectors
> and the results were well worth the effort. We now run faster with 32
> bit floating point than we used to with integer formats.

We use the Vc library for vectorization, and it's pretty amazing. Of course,
it's C++, not C, but still...

https://github.com/VcDevel

--
Boudewijn Rempt | http://www.krita.org, http://www.valdyas.org
From: Noel C. <NCa...@Pr...> - 2017-07-30 15:06:02

Hi Lorenzo,

> Did you try Intel C++ Compiler? (It's free for open source projects on
> Linux). In some programs I got 2:1 performance improvements.

No, I haven't run that one. At the moment we have really only one practical
choice here: the Microsoft Visual Studio 2017 C++ compiler for Windows,
though we may be looking into alternatives in the future. I have heard good
things about Intel's compiler elsewhere as well. Thanks for the data point.

For what it's worth, looking over Microsoft compiler-generated code, with
some of the complex routines like multi-input/output interpolation the
compiler is starved for registers and has to resort to storing intermediate
compute products in RAM. Just small things like changing the way loops are
managed to free up a register here and there make a noticeable difference in
throughput, especially with e.g. the 32 bit floating point routines, where
the channel data is 4 bytes each and the process is already quite RAM-bound
when multi-threaded.

It's possible using SSE instructions to facilitate things like doing 4
calculations simultaneously could speed things up further. I'm looking into
that now.

-Noel
From: Lorenzo R. <lo...@ma...> - 2017-07-30 14:43:59

Noel,

Did you try Intel C++ Compiler? (It's free for open source projects on
Linux). In some programs I got 2:1 performance improvements.

Best Regards,

Lorenzo

> On 30 Jul 2017, at 01:20, Noel Carboni <NCa...@Pr...> wrote:
>
> Hi folks of the Little CMS mailing list,
>
> I'm just curious: Given that PCs and Macs are based on Intel chipsets
> nowadays...
>
> Do we have a feel for how much Little CMS is being used on other processor
> architectures?
>
> I ask because I'm considering submitting some optimizations I've made to
> the interpolation routines that speed things up for Intel-based systems.
> They're not Intel-specific, but are just optimizations of the source code
> that pick up 5% to 20% in speed in the x64 testbed tests.
>
> Git code as of yesterday, as measured on my dual-Xeon Westmere workstation:
>
> P E R F O R M A N C E   T E S T S
> =================================
>
> 16 bits on CLUT profiles : 34.4828 MPixel/sec.
> 8 bits on CLUT profiles : 32.3232 MPixel/sec.
> 8 bits on Matrix-Shaper profiles : 66.6667 MPixel/sec.
> 8 bits on SAME Matrix-Shaper profiles : 120.301 MPixel/sec.
> 8 bits on Matrix-Shaper profiles (AbsCol) : 66.6667 MPixel/sec.
> 16 bits on Matrix-Shaper profiles : 34.4828 MPixel/sec.
> 16 bits on SAME Matrix-Shaper profiles : 137.931 MPixel/sec.
> 16 bits on Matrix-Shaper profiles (AbsCol) : 34.4828 MPixel/sec.
> 8 bits on curves : 88.8889 MPixel/sec.
> 16 bits on curves : 91.4286 MPixel/sec.
> 8 bits on CMYK profiles : 11.9314 MPixel/sec.
> 16 bits on CMYK profiles : 11.976 MPixel/sec.
> 8 bits on gray-to gray : 104.575 MPixel/sec.
> 8 bits on gray-to-lab gray : 105.263 MPixel/sec.
> 8 bits on SAME gray-to-gray : 105.263 MPixel/sec.
>
> My current code:
>
> P E R F O R M A N C E   T E S T S
> =================================
>
> 16 bits on CLUT profiles : 38.5542 MPixel/sec.
> 8 bits on CLUT profiles : 33.0579 MPixel/sec.
> 8 bits on Matrix-Shaper profiles : 66.1157 MPixel/sec.
> 8 bits on SAME Matrix-Shaper profiles : 121.212 MPixel/sec.
> 8 bits on Matrix-Shaper profiles (AbsCol) : 66.9456 MPixel/sec.
> 16 bits on Matrix-Shaper profiles : 38.5542 MPixel/sec.
> 16 bits on SAME Matrix-Shaper profiles : 142.857 MPixel/sec.
> 16 bits on Matrix-Shaper profiles (AbsCol) : 38.5542 MPixel/sec.
> 8 bits on curves : 89.3855 MPixel/sec.
> 16 bits on curves : 94.1176 MPixel/sec.
> 8 bits on CMYK profiles : 14.4796 MPixel/sec.
> 16 bits on CMYK profiles : 14.5587 MPixel/sec.
> 8 bits on gray-to gray : 125 MPixel/sec.
> 8 bits on gray-to-lab gray : 124.031 MPixel/sec.
> 8 bits on SAME gray-to-gray : 124.031 MPixel/sec.
>
> These translate to real product gains... For example, with a 100 megapixel
> 32 bit grayscale image our heavily multi-threaded transform time dropped
> from 1485 milliseconds to 968 milliseconds.
>
> Source rearrangement notwithstanding, if one were to create routines that
> would make use of the vector instructions virtually every Intel system
> already has (e.g., SSE2) the results could be markedly better still. I've
> been through converting all my own software to use vectors and the results
> were well worth the effort. We now run faster with 32 bit floating point
> than we used to with integer formats.
>
> There is also the further possibility of extending the Little CMS
> algorithms into the GPU for huge gains. I suppose the trouble with that
> would be figuring out what subsystem to use (OpenCL programs... OpenGL
> shaders... Vulkan? Others?)
>
> -Noel
From: Noel C. <NCa...@Pr...> - 2017-07-30 05:35:41

Hi folks of the Little CMS mailing list,

I'm just curious: Given that PCs and Macs are based on Intel chipsets
nowadays...

Do we have a feel for how much Little CMS is being used on other processor
architectures?

I ask because I'm considering submitting some optimizations I've made to the
interpolation routines that speed things up for Intel-based systems. They're
not Intel-specific, but are just optimizations of the source code that pick
up 5% to 20% in speed in the x64 testbed tests.

Git code as of yesterday, as measured on my dual-Xeon Westmere workstation:

P E R F O R M A N C E   T E S T S
=================================

16 bits on CLUT profiles : 34.4828 MPixel/sec.
8 bits on CLUT profiles : 32.3232 MPixel/sec.
8 bits on Matrix-Shaper profiles : 66.6667 MPixel/sec.
8 bits on SAME Matrix-Shaper profiles : 120.301 MPixel/sec.
8 bits on Matrix-Shaper profiles (AbsCol) : 66.6667 MPixel/sec.
16 bits on Matrix-Shaper profiles : 34.4828 MPixel/sec.
16 bits on SAME Matrix-Shaper profiles : 137.931 MPixel/sec.
16 bits on Matrix-Shaper profiles (AbsCol) : 34.4828 MPixel/sec.
8 bits on curves : 88.8889 MPixel/sec.
16 bits on curves : 91.4286 MPixel/sec.
8 bits on CMYK profiles : 11.9314 MPixel/sec.
16 bits on CMYK profiles : 11.976 MPixel/sec.
8 bits on gray-to gray : 104.575 MPixel/sec.
8 bits on gray-to-lab gray : 105.263 MPixel/sec.
8 bits on SAME gray-to-gray : 105.263 MPixel/sec.

My current code:

P E R F O R M A N C E   T E S T S
=================================

16 bits on CLUT profiles : 38.5542 MPixel/sec.
8 bits on CLUT profiles : 33.0579 MPixel/sec.
8 bits on Matrix-Shaper profiles : 66.1157 MPixel/sec.
8 bits on SAME Matrix-Shaper profiles : 121.212 MPixel/sec.
8 bits on Matrix-Shaper profiles (AbsCol) : 66.9456 MPixel/sec.
16 bits on Matrix-Shaper profiles : 38.5542 MPixel/sec.
16 bits on SAME Matrix-Shaper profiles : 142.857 MPixel/sec.
16 bits on Matrix-Shaper profiles (AbsCol) : 38.5542 MPixel/sec.
8 bits on curves : 89.3855 MPixel/sec.
16 bits on curves : 94.1176 MPixel/sec.
8 bits on CMYK profiles : 14.4796 MPixel/sec.
16 bits on CMYK profiles : 14.5587 MPixel/sec.
8 bits on gray-to gray : 125 MPixel/sec.
8 bits on gray-to-lab gray : 124.031 MPixel/sec.
8 bits on SAME gray-to-gray : 124.031 MPixel/sec.

These translate to real product gains... For example, with a 100 megapixel
32 bit grayscale image our heavily multi-threaded transform time dropped from
1485 milliseconds to 968 milliseconds.

Source rearrangement notwithstanding, if one were to create routines that
would make use of the vector instructions virtually every Intel system
already has (e.g., SSE2) the results could be markedly better still. I've
been through converting all my own software to use vectors and the results
were well worth the effort. We now run faster with 32 bit floating point than
we used to with integer formats.

There is also the further possibility of extending the Little CMS algorithms
into the GPU for huge gains. I suppose the trouble with that would be
figuring out what subsystem to use (OpenCL programs... OpenGL shaders...
Vulkan? Others?)

-Noel
From: Noel C. <NCa...@Pr...> - 2017-07-28 21:14:08

Hi Marti,

I've completed my changes to the LittleCMS sources. I've reviewed and tested
them with the testbed.

http://Noel.ProDigitalSoftware.com/temp/ProposedLittleCMSChanges.zip

With this set of files, derived from the Git trunk as of July 26, you can now
use the C++ compiler and higher warning levels.

I followed this philosophy:

1. Don't alter the external interface (lcms2.h)
2. Make the use of signed and unsigned types consistent to reduce the
   hundreds of warnings emitted when -Wall is used with the pickier C++
   compiler.
3. Avoid casts if possible.
4. Review all changes and pass all tests in the testbed project.
5. Maintain or improve performance.

If you choose to enable the Visual Studio 2017 C++ compiler and the highest
warning level for your ongoing work, the following may be of interest:

You may want to specifically disable several of the specific overly pedantic
-Wall warnings by putting the following additional options on the VS 2017 C++
compiler command line:

  /wd4711 /wd4820 /wd4061 /wd4774 /wd4710 /wd4668 /wd4738

The above options specifically quiet the following warnings, which may not be
helpful to you in ongoing development:

  warning C4711: function 'xxxxxxxxxxxxxx' selected for automatic inline
                 expansion
  warning C4820: 'xxxxxxxx': 'n' bytes padding added after data member
  warning C4061: enumerator 'xxxxxxxxxxxxx' in switch of enum 'yyyyyyyyyyyyy'
                 is not explicitly handled by a case label
  warning C4774: '_snprintf' : format string expected in argument 3 is not a
                 string literal
  warning C4710: 'int _snprintf(char *const ,const ::size_t,const char *const
                 ,...)': function not inlined
  warning C4668: '_M_HYBRID_X86_ARM64' is not defined as a preprocessor
                 macro, replacing with '0' for '#if/#elif'
  warning C4738: storing 32-bit float result in memory, possible loss of
                 performance

After suppressing the above, there is a remaining "unsafe conversion" warning
emitted for cmsxform.c that may indicate a possible real problem, especially
in light of compiling this library for different systems with different
compilers. This is no different than in the code I started with.

Besides the code compiling cleaner, the testbed project also shows
performance changes for the better in my testing.

-Noel
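
If editing the compiler command line is inconvenient, the same suppressions
can usually be expressed in source with MSVC's #pragma warning, guarded so
other compilers never see it. This is a generic illustration only, not part
of Noel's patch set.

    /* Hedged sketch: suppress the same pedantic MSVC -Wall diagnostics from a
       common header instead of the compiler command line. The _MSC_VER guard
       keeps the pragma invisible to gcc/clang. */
    #ifdef _MSC_VER
    #pragma warning(disable: 4711 4820 4061 4774 4710 4668 4738)
    #endif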
From: Aaron B. <bo...@gm...> - 2017-07-26 17:05:57

Hello Marti,

I noticed you don't have a .gitignore file in your repo. I've attached one
that may be useful for the project: it filters out binaries and other
artifacts produced while building.

Cheers,

Aaron
From: Noel C. <NCa...@Pr...> - 2017-07-26 16:40:33

> please note that this is using C language, not C++.

It is possible the VS 2017 C++ compiler is pickier about matching data types.
There is nothing wrong with that, as it could uncover issues where data types
are not consistent.

LOL about your comment earlier about it taking months to merge the changes. I
finished it in about 90 minutes. May I recommend the tool Beyond Compare by
Scooter Software. :-)

I'm now setting about doing the testing with the test suite.

-Noel
From: Martí M. <mar...@li...> - 2017-07-26 15:17:48

That is what I get when cross-compiling to ARM with gcc 4.6.3:

  arm-elf-gcc.exe -c -DCMS_NO_PTHREADS -std=c99 --pedantic -Wall -I ../include *.c

  cmscgats.c: In function 'ParseFloatNumber':
  cmscgats.c:643:5: warning: array subscript has type 'char' [-Wchar-subscripts]
  cmsopt.c: In function 'OptimizeByComputingLinearization':
  cmsopt.c:1035:26: warning: variable 'lIsLinear' set but not used [-Wunused-but-set-variable]
  cmspcs.c: In function '_cmsLCMScolorSpace':
  cmspcs.c:872:5: warning: overflow in implicit constant conversion [-Woverflow]

I tried also a modern compiler. gcc 6.3.1 complains on indentation because of
-Wmisleading-indentation, so turning it off:

  arm-none-eabi-gcc.exe -c -DCMS_NO_PTHREADS -std=c99 --pedantic -Wall -Wno-misleading-indentation -I ../include *.c

  cmsopt.c: In function 'OptimizeByComputingLinearization':
  cmsopt.c:1035:26: warning: variable 'lIsLinear' set but not used [-Wunused-but-set-variable]
       cmsBool lIsSuitable, lIsLinear;
                            ^~~~~~~~~
  cmspcs.c: In function '_cmsLCMScolorSpace':
  cmspcs.c:872:22: warning: overflow in implicit constant conversion [-Woverflow]
       default: return (cmsColorSpaceSignature) (-1);
                        ^

Again, please note that this is using C language, not C++.

Marti

On 7/26/2017 3:37 PM, Bob Friesenhahn wrote:
> On Wed, 26 Jul 2017, Aaron Boxer wrote:
>
>> Thanks, Noel. Might be safer to do this on linux, where you can run make
>> check to test. May I ask how I turn on -Wall on linux build for lcms?
>
> The normal way (quite well documented) is
>
>   ./configure CFLAGS='-O2 -Wall' ...
>
> I use these GCC options while building GraphicsMagick:
>
>   ./configure 'CFLAGS=-O2 -g -ggdb -Wall -Winline -W -Wformat-security\
>     -Wpointer-arith -Wdisabled-optimization -Wdeclaration-after-statement'
>
> There are of course many more warning options which can be enabled for
> people who have plenty of time on their hands. Even these options are not
> likely to include type conversion warnings.
>
> Optimizing warnings for just one compiler is a bad idea. At least three
> completely different compilers should be used.
>
> Bob
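
For reference, the -Wchar-subscripts warning above is the classic "plain char
used as an array index" diagnostic: char may be signed, so a byte above 0x7F
would yield a negative index. A common portable cure is to go through
unsigned char first. The snippet below is a generic illustration, not the
actual lcms2 code at cmscgats.c:643.

    /* Generic illustration of the -Wchar-subscripts fix: index arrays (or
       feed <ctype.h> functions) via unsigned char, never via plain char. */
    #include <ctype.h>

    static int is_id_char(char c, const unsigned char allowed[256])
    {
        unsigned char u = (unsigned char) c;   /* avoids a negative index */
        return allowed[u] || isalnum(u);
    }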
From: Noel C. <NCa...@Pr...> - 2017-07-26 13:38:11

> Thanks, Noel. Might be safer to do this on linux, where you can run make
> check to test. May I ask how I turn on -Wall on linux build for lcms?

There is a capability to do the checking on Windows as well; I just don't
have it set up presently. We rushed through getting the project to build with
VS 2017 when it came out and never brought up the test bed stuff.

As I mentioned before, please hold off on trying to use the sources I posted
earlier, as I need to both figure out what I broke and merge the changes into
the current trunk to be of the most help to Marti.

-Noel
From: Bob F. <bfr...@si...> - 2017-07-26 13:37:55

On Wed, 26 Jul 2017, Aaron Boxer wrote:

> Thanks, Noel. Might be safer to do this on linux, where you can run make
> check to test. May I ask how I turn on -Wall on linux build for lcms?

The normal way (quite well documented) is

  ./configure CFLAGS='-O2 -Wall' ...

I use these GCC options while building GraphicsMagick:

  ./configure 'CFLAGS=-O2 -g -ggdb -Wall -Winline -W -Wformat-security\
    -Wpointer-arith -Wdisabled-optimization -Wdeclaration-after-statement'

There are of course many more warning options which can be enabled for people
who have plenty of time on their hands. Even these options are not likely to
include type conversion warnings.

Optimizing warnings for just one compiler is a bad idea. At least three
completely different compilers should be used.

Bob
--
Bob Friesenhahn
bfr...@si..., http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/