libjpeg-turbo-users Mailing List for libjpeg-turbo (Page 18)
SIMD-accelerated libjpeg-compatible JPEG codec library
Brought to you by:
dcommander
From: Siarhei S. <sia...@gm...> - 2011-06-07 12:11:32
|
Hi All,

Here are the current benchmark results for ARM Cortex-A8 and ARM Cortex-A9 hardware, obtained according to http://www.libjpeg-turbo.org/About/Performance

The attached logs have been obtained using the scripts from: http://cgit.freedesktop.org/~siamashka/libjpeg-turbo/commit/?h=permalink/20110607-benchmark

The performance improvement for JPEG decoding does not look that great in the attached logs because only the effect of the YUV->RGB NEON conversion is measured (tjbench does not use 'jpeg_idct_ifast', which is also NEON optimized). When the NEON optimized IDCT is used as well, overall decoding becomes at least twice as fast as the C code. For example, here are the results of decoding a JPEG file using 'djpeg' on a 1 GHz ARM Cortex-A9, where 'jpeg_idct_ifast_neon' gets used because of the '-dct fast' option:

$ time JSIMD_FORCE_NO_SIMD=1 ./djpeg -dct fast nightshot_iso_100_422_Q95.jpg > /dev/null
real 0m0.890s
user 0m0.859s
sys 0m0.023s

$ time ./djpeg -dct fast nightshot_iso_100_422_Q95.jpg > /dev/null
real 0m0.360s
user 0m0.336s
sys 0m0.016s

--
Best regards,
Siarhei Siamashka |
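The two 'time' outputs in this message can be turned into a speedup factor with a one-line division. The following is only a small sketch of that arithmetic; the function name `speedup` is a hypothetical helper and not part of libjpeg-turbo.

```c
#include <assert.h>

/* Hypothetical helper: ratio of the baseline run time to the optimized
 * run time. A result above 1.0 means the optimized run is faster. */
double speedup(double baseline_seconds, double optimized_seconds)
{
    assert(optimized_seconds > 0.0);  /* avoid dividing by zero */
    return baseline_seconds / optimized_seconds;
}
```

With the wall-clock figures from the message, `speedup(0.890, 0.360)` comes to roughly 2.47, and the user-time figures (0.859 s versus 0.336 s) give roughly 2.56, which is consistent with the stated gain of at least a factor of two.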
From: Siarhei S. <sia...@gm...> - 2011-05-18 16:11:05
|
On Wed, May 18, 2011 at 1:57 PM, Martin Aumüller <au...@re...> wrote:
> On May 18, 2011, at 03:48 , Siarhei Siamashka wrote:
>> As you are already using git yourself, maybe you could push your
>> patches to some git repository? It just would be a bit more convenient
>> for me.
>
> I published my repository on github: git://github.com/aumuell/libjpeg-turbo.git

OK, thanks. Now let's move to the patch tracker for the rest of the stuff :)

--
Best regards,
Siarhei Siamashka |
From: Siarhei S. <sia...@gm...> - 2011-05-18 15:45:45
|
On Wed, May 18, 2011 at 7:18 AM, DRC <dco...@us...> wrote:
> This puts me in an awkward position, because I have no way of verifying
> this code on either platform.

Regarding a way to verify the code: at least compile-testing libjpeg-turbo for ARM Linux is easy once you have a working ARM cross-compiler. One of the easiest ways of getting a cross-compiler is to download the free CodeSourcery Lite toolchain: http://www.codesourcery.com/sgpp/lite/arm

Just download the toolchain (Sourcery G++ Lite 2011.03-41 for ARM GNU/Linux is the latest version at the moment) and extract the tarball somewhere (for example into /opt). After this, libjpeg-turbo can be compiled in the following way:

$ make distclean
$ export CFLAGS="-O3 -mcpu=cortex-a8 -mfloat-abi=softfp -mfpu=neon"
$ export PATH=/opt/arm-2011.03/bin:$PATH
$ ./configure --host=arm-none-linux-gnueabi && make

Setting CFLAGS is optional.

It is also possible to run 'make test' if 'qemu-user' is installed (assuming that binfmt-misc is set up), but there are some quirks:

1. A fully static build of libjpeg-turbo is needed (even statically linked with glibc). This can be done by adding the line "LDFLAGS = -all-static" at the beginning of Makefile.am (an extra configure option for such a build might be a good addition to libjpeg-turbo).

2. The qemu-user versions which are currently bundled with the Linux distros have broken ARM NEON support, so the test will likely fail if they are used. A version of qemu with sufficiently good NEON support can be obtained here: https://meego.gitorious.org/qemu-maemo/qemu

3. The NEON code will not be executed because it will fail runtime NEON detection. The runtime detection can be hacked to unconditionally enable NEON, or NEON can be enabled just by checking the __ARM_NEON__ macro (it should be defined by the compiler if CFLAGS contains the '-mfpu=neon' option).

And finally, in order to verify that this setup is really working, it makes sense to artificially break the code in 'simd/jsimd_arm_neon.S' by commenting out some of the instructions. If this breakage is detected by 'make test', then everything has been set up correctly.

There are many alternative ways of testing without ARM hardware, but this is the easiest one in my opinion. And it is not limited to ARM; builds for the other architectures can be tested too.

--
Best regards,
Siarhei Siamashka |
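Quirk 3 above mentions enabling NEON by checking the __ARM_NEON__ macro instead of relying on runtime detection. A minimal sketch of that idea in C could look as follows. All `sketch_` names and the flag value are hypothetical stand-ins, and the way the two environment variables (JSIMD_FORCE_ARM_NEON and JSIMD_FORCE_NO_SIMD, both mentioned in this thread) are applied here is an assumption for illustration, not a copy of the logic in simd/jsimd_arm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag value; stands in for JSIMD_ARM_NEON. */
#define SKETCH_ARM_NEON 0x10u

/* Compile-time detection: the compiler is expected to define __ARM_NEON__
 * when the code is built for ARM with NEON enabled (e.g. -mfpu=neon). */
unsigned sketch_compile_time_simd(void)
{
#if defined(__ARM_NEON__)
    return SKETCH_ARM_NEON;
#else
    return 0u;
#endif
}

/* An override counts as set only when its value is exactly "1". */
int sketch_env_is_one(const char *value)
{
    return value != NULL && strcmp(value, "1") == 0;
}

/* Assumed override order: forcing NEON switches it on, and forcing
 * "no SIMD" wins over everything else. The string arguments would
 * normally come from getenv(). */
unsigned sketch_apply_overrides(unsigned detected,
                                const char *force_neon,
                                const char *force_no_simd)
{
    unsigned support = detected;

    assert((detected & ~SKETCH_ARM_NEON) == 0u);  /* only the known flag */
    if (sketch_env_is_one(force_neon))
        support |= SKETCH_ARM_NEON;
    if (sketch_env_is_one(force_no_simd))
        support = 0u;
    return support;
}
```

A caller would combine the two steps as `sketch_apply_overrides(sketch_compile_time_simd(), getenv("JSIMD_FORCE_ARM_NEON"), getenv("JSIMD_FORCE_NO_SIMD"))` after including `<stdlib.h>`. Keeping the override logic separate from the detection makes it possible to exercise it on a non-ARM build host.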
From: Martin A. <au...@re...> - 2011-05-18 11:15:37
|
On May 18, 2011, at 03:34 , DRC wrote:
> This list is for end user issues. Please do not submit patches here. I
> would prefer that they be submitted via the Patch tracker.

Thank you for the note. I submitted the patches on the Sourceforge patch tracker.

Regards,
Martin

>
>
> On 5/17/11 12:51 PM, Martin Aumüller wrote:
>> ---
>>  simd/jsimd_arm.c | 10 ++++++++++
>>  1 files changed, 10 insertions(+), 0 deletions(-)
>>
>> diff --git a/simd/jsimd_arm.c b/simd/jsimd_arm.c
>> index b70b94e..725b546 100644
>> --- a/simd/jsimd_arm.c
>> +++ b/simd/jsimd_arm.c
>> @@ -100,7 +100,9 @@ LOCAL(void)
>>  init_simd (void)
>>  {
>>    char *env = NULL;
>> +#ifdef __linux__
>>    int bufsize = 1024; /* an initial guess for the line buffer size limit */
>> +#endif
>>
>>    if (simd_support != ~0)
>>      return;
>> @@ -115,6 +117,14 @@ init_simd (void)
>>    }
>>  #endif
>>
>> +#ifdef __APPLE__
>> +#ifdef __arm__
>> +#ifdef __ARM_NEON__
>> +  simd_support |= JSIMD_ARM_NEON;
>> +#endif
>> +#endif
>> +#endif
>> +
>>    /* Force different settings through environment variables */
>>    env = getenv("JSIMD_FORCE_ARM_NEON");
>>    if ((env != NULL) && (strcmp(env, "1") == 0))
>
> _______________________________________________
> Libjpeg-turbo-users mailing list
> Lib...@li...
> https://lists.sourceforge.net/lists/listinfo/libjpeg-turbo-users |
From: Martin A. <au...@re...> - 2011-05-18 10:57:46
|
Hi,

On May 18, 2011, at 03:48 , Siarhei Siamashka wrote:
> On Tue, May 17, 2011 at 8:51 PM, Martin Aumüller <au...@re...> wrote:
>> In a recent project, we use libjpeg-turbo for remote rendering on
>> iPads. So we are really happy to see ARM NEON support in trunk,
>> especially for the idct. Thank you for this!
>
> Hi, I'm the one who developed these initial NEON patches in libjpeg-turbo trunk.
>
> Thanks for trying this code on iOS. I always wondered whether it would
> be difficult to support both linux and iOS with as little hassle as
> possible. But now with your help it looks really doable.
>
>> Unfortunately, building with Apple's ancient gas 1.38 requires some
>> changes and work-arounds:
>> - macros are more limited
>>   This is mostly worked around by using gas-preprocessor.pl [1], a
>>   perl script mainly developed for compiling ffmpeg for iOS. However,
>>   it requires that macro parameters be prefixed with a \. It required
>>   some additional changes, the version which works with libjpeg-turbo
>>   is available on github [2]. I also asked for the changes to be
>>   pulled into the regular gas-preprocessor.
>>   If configure detects that the ARM assembler is not suitable, it
>>   tries again with gas-preprocessor.pl. It has to be in the PATH for
>>   this to work.
>>
>> - the adrl pseudo instruction crashes 'as'
>>   I reported this bug to Apple and hope that it gets fixed soon [3].
>>   In the meantime, I propose to use adr and several copies of the
>>   same constants instead. I also tried to use 'ldr ip, =consts' instead
>>   of 'adrl ip, consts', but then ld failed with 'illegal text reloc'.
>>
>> - symbol names need to be prefixed with _
>
> I only quickly looked through your patches and think that some of the
> things like \ prefixes are probably fine. I'm still a bit worried
> whether it could be a maintenance problem in the long run (the need to
> always remember what things not to use in the assembly code in
> order not to break iOS). I wonder if it would be difficult to move
> this part to gas-preprocessor.pl too and aim for full gas
> compatibility eventually?

That would probably require some additional changes to gas-preprocessor.pl: text that comes from a macro parameter substitution must not be modified again. But that's probably doable. I will look into that.

>
> The workaround for adrl is a bit ugly, but probably it can't be helped
> until the problem is fixed in Apple assembler.
>
> As you are already using git yourself, maybe you could push your
> patches to some git repository? It just would be a bit more convenient
> for me.

I published my repository on github: git://github.com/aumuell/libjpeg-turbo.git

>
>> - on iOS, the detection of the availability of NEON acceleration is
>>   done at compile time
>
> OK, I think that it is very unlikely for any future ARM based Apple
> product to suddenly have NEON support removed. There is a lot more
> hardware diversity across linux/android.
>
>> I used the NEON idct successfully on iOS, however the color space
>> conversion routines (the extrgb one) did not work correctly. I also
>> checked the jpegut output in a QEMU-emulated ARM environment, both
>> the original version (SVN r611) and the patched one produced
>> incorrect, but almost identical, results. And both versions produced
>> varying results across different runs.
>
> NEON support in upstream QEMU is not very mature yet. You may have
> more luck with http://meego.gitorious.org/qemu-maemo (it should have
> more NEON fixes)

That was a very helpful hint: with this qemu, make test ran fine even after applying my patches.

>
>> Hence, I don't think that
>> these problems are due to my changes. A version where I disabled the
>> SIMD color space conversion worked correctly (r611 and patched).
>>
>> I hope that these changes in the following mails are acceptable for
>> libjpeg-turbo. I'd be happy if somebody could help out with ideas for
>> the color conversion problems. And of course I'm open for suggestions
>> for improvement.
>
> It looks like a bug in gas-preprocessor.pl
>
> Just as an experiment, I tried to compare the objdump output for
> normally compiled jsimd_arm_neon.S (in linux) and the same source
> compiled after running through the preprocessor. The result is
> expected to be the same, but this was not the case. One obvious
> problem is that the preprocessor generates the following code:
>
> vst3.8 {d10[0], d11[0], d12[0]}, [RGB]!
> vst3.8 {d10[1], d11[1], d12[1]}, [RGB]!
> vst3.8 {d10[2], d11[2], d12[2]}, [RGB]!
> vst3.8 {d10[3], d11[3], d12[3]}, [RGB]!
> vst4.8 {d10[0], d11[0], d12[0], d13[0]}, [RGB]!
> vst4.8 {d10[1], d11[1], d12[1], d13[1]}, [RGB]!
> vst4.8 {d10[2], d11[2], d12[2], d13[2]}, [RGB]!
> vst4.8 {d10[3], d11[3], d12[3], d13[3]}, [RGB]!
>
> But it makes no sense because vst3.8 and vst4.8 are used in the
> different if/else branches and should not be used within the same
> conversion function.

This - and the debugging trick with objdump - is very useful. Thanks! I will look into my gas-preprocessor.pl changes and try to fix that.

Best regards,
Martin

>
> On the positive side, after applying your patches, the code still builds
> and passes tests in linux for me. So probably just the preprocessor
> issue needs to be solved. I'll try to have one more look at the
> patches a bit later.
>
> --
> Best regards,
> Siarhei Siamashka |
From: DRC <dco...@us...> - 2011-05-18 04:54:35
|
https://sourceforge.net/projects/libjpeg-turbo/files/1.1.1/

Significant changes since 1.1.0
===============================

[1] Fixed a 1-pixel error in row 0, column 21 of the luminance plane generated by tjEncodeYUV().

[2] libjpeg-turbo's accelerated Huffman decoder previously ignored unexpected markers found in the middle of the JPEG data stream during decompression. It will now hand off decoding of a particular block to the unaccelerated Huffman decoder if an unexpected marker is found, so that the unaccelerated Huffman decoder can generate an appropriate warning.

[3] Older versions of MinGW64 prefixed symbol names with underscores by default, which differed from the behavior of 64-bit Visual C++. MinGW64 1.0 has adopted the behavior of 64-bit Visual C++ as the default, so to accommodate this, the libjpeg-turbo SIMD function names are no longer prefixed with an underscore when building with MinGW64. This means that, when building libjpeg-turbo with older versions of MinGW64, you will now have to add -fno-leading-underscore to the CFLAGS.

[4] Fixed a regression bug in the NSIS script that caused the Windows installer build to fail when using the Visual Studio IDE.

[5] Fixed a bug in jpeg_read_coefficients() whereby it would not initialize cinfo->image_width and cinfo->image_height if libjpeg v7 or v8 emulation was enabled. This specifically caused the jpegoptim program to fail if it was linked against a version of libjpeg-turbo that was built with libjpeg v7 or v8 emulation.

[6] Eliminated excessive I/O overhead that occurred when reading BMP files in cjpeg.

[7] Eliminated errors in the output of cjpeg on Windows that occurred when the application was invoked using I/O redirection (cjpeg <inputfile >output.jpg). |
From: DRC <dco...@us...> - 2011-05-18 04:18:43
|
This puts me in an awkward position, because I have no way of verifying this code on either platform. What I'm going to suggest is that these patches be moved to the Patch tracker and, once the two of you have agreed upon a set of changes that works for both Android and iOS, then I will evaluate those. I will say that the code must absolutely pass 'make test' before it is acceptable.

libjpeg-turbo-users is a "developer support" list and is meant for discussing issues encountered when using libjpeg-turbo in other projects. It is not meant for in-depth code discussions of libjpeg-turbo itself. If there is enough interest for creating a libjpeg-turbo-devel list, I can do so. I haven't bothered up until now because I was the only one doing active development.

On 5/17/11 8:48 PM, Siarhei Siamashka wrote:
> On Tue, May 17, 2011 at 8:51 PM, Martin Aumüller <au...@re...> wrote:
>> In a recent project, we use libjpeg-turbo for remote rendering on
>> iPads. So we are really happy to see ARM NEON support in trunk,
>> especially for the idct. Thank you for this!
>
> Hi, I'm the one who developed these initial NEON patches in libjpeg-turbo trunk.
>
> Thanks for trying this code on iOS. I always wondered whether it would
> be difficult to support both linux and iOS with as little hassle as
> possible. But now with your help it looks really doable.
>
>> Unfortunately, building with Apple's ancient gas 1.38 requires some
>> changes and work-arounds:
>> - macros are more limited
>>   This is mostly worked around by using gas-preprocessor.pl [1], a
>>   perl script mainly developed for compiling ffmpeg for iOS. However,
>>   it requires that macro parameters be prefixed with a \. It required
>>   some additional changes, the version which works with libjpeg-turbo
>>   is available on github [2]. I also asked for the changes to be
>>   pulled into the regular gas-preprocessor.
>>   If configure detects that the ARM assembler is not suitable, it
>>   tries again with gas-preprocessor.pl. It has to be in the PATH for
>>   this to work.
>>
>> - the adrl pseudo instruction crashes 'as'
>>   I reported this bug to Apple and hope that it gets fixed soon [3].
>>   In the meantime, I propose to use adr and several copies of the
>>   same constants instead. I also tried to use 'ldr ip, =consts' instead
>>   of 'adrl ip, consts', but then ld failed with 'illegal text reloc'.
>>
>> - symbol names need to be prefixed with _
>
> I only quickly looked through your patches and think that some of the
> things like \ prefixes are probably fine. I'm still a bit worried
> whether it could be a maintenance problem in the long run (the need to
> always remember what things not to use in the assembly code in
> order not to break iOS). I wonder if it would be difficult to move
> this part to gas-preprocessor.pl too and aim for full gas
> compatibility eventually?
>
> The workaround for adrl is a bit ugly, but probably it can't be helped
> until the problem is fixed in Apple assembler.
>
> As you are already using git yourself, maybe you could push your
> patches to some git repository? It just would be a bit more convenient
> for me.
>
>> - on iOS, the detection of the availability of NEON acceleration is
>>   done at compile time
>
> OK, I think that it is very unlikely for any future ARM based Apple
> product to suddenly have NEON support removed. There is a lot more
> hardware diversity across linux/android.
>
>> I used the NEON idct successfully on iOS, however the color space
>> conversion routines (the extrgb one) did not work correctly. I also
>> checked the jpegut output in a QEMU-emulated ARM environment, both
>> the original version (SVN r611) and the patched one produced
>> incorrect, but almost identical, results. And both versions produced
>> varying results across different runs.
>
> NEON support in upstream QEMU is not very mature yet. You may have
> more luck with http://meego.gitorious.org/qemu-maemo (it should have
> more NEON fixes)
>
>> Hence, I don't think that
>> these problems are due to my changes. A version where I disabled the
>> SIMD color space conversion worked correctly (r611 and patched).
>>
>> I hope that these changes in the following mails are acceptable for
>> libjpeg-turbo. I'd be happy if somebody could help out with ideas for
>> the color conversion problems. And of course I'm open for suggestions
>> for improvement.
>
> It looks like a bug in gas-preprocessor.pl
>
> Just as an experiment, I tried to compare the objdump output for
> normally compiled jsimd_arm_neon.S (in linux) and the same source
> compiled after running through the preprocessor. The result is
> expected to be the same, but this was not the case. One obvious
> problem is that the preprocessor generates the following code:
>
> vst3.8 {d10[0], d11[0], d12[0]}, [RGB]!
> vst3.8 {d10[1], d11[1], d12[1]}, [RGB]!
> vst3.8 {d10[2], d11[2], d12[2]}, [RGB]!
> vst3.8 {d10[3], d11[3], d12[3]}, [RGB]!
> vst4.8 {d10[0], d11[0], d12[0], d13[0]}, [RGB]!
> vst4.8 {d10[1], d11[1], d12[1], d13[1]}, [RGB]!
> vst4.8 {d10[2], d11[2], d12[2], d13[2]}, [RGB]!
> vst4.8 {d10[3], d11[3], d12[3], d13[3]}, [RGB]!
>
> But it makes no sense because vst3.8 and vst4.8 are used in the
> different if/else branches and should not be used within the same
> conversion function.
>
> On the positive side, after applying your patches, the code still builds
> and passes tests in linux for me. So probably just the preprocessor
> issue needs to be solved. I'll try to have one more look at the
> patches a bit later.
> |
From: Siarhei S. <sia...@gm...> - 2011-05-18 01:48:26
|
On Tue, May 17, 2011 at 8:51 PM, Martin Aumüller <au...@re...> wrote:
> In a recent project, we use libjpeg-turbo for remote rendering on
> iPads. So we are really happy to see ARM NEON support in trunk,
> especially for the idct. Thank you for this!

Hi, I'm the one who developed these initial NEON patches in libjpeg-turbo trunk.

Thanks for trying this code on iOS. I always wondered whether it would be difficult to support both linux and iOS with as little hassle as possible. But now with your help it looks really doable.

> Unfortunately, building with Apple's ancient gas 1.38 requires some
> changes and work-arounds:
> - macros are more limited
>   This is mostly worked around by using gas-preprocessor.pl [1], a
>   perl script mainly developed for compiling ffmpeg for iOS. However,
>   it requires that macro parameters be prefixed with a \. It required
>   some additional changes, the version which works with libjpeg-turbo
>   is available on github [2]. I also asked for the changes to be
>   pulled into the regular gas-preprocessor.
>   If configure detects that the ARM assembler is not suitable, it
>   tries again with gas-preprocessor.pl. It has to be in the PATH for
>   this to work.
>
> - the adrl pseudo instruction crashes 'as'
>   I reported this bug to Apple and hope that it gets fixed soon [3].
>   In the meantime, I propose to use adr and several copies of the
>   same constants instead. I also tried to use 'ldr ip, =consts' instead
>   of 'adrl ip, consts', but then ld failed with 'illegal text reloc'.
>
> - symbol names need to be prefixed with _

I only quickly looked through your patches and think that some of the things like \ prefixes are probably fine. I'm still a bit worried whether it could be a maintenance problem in the long run (the need to always remember what things not to use in the assembly code in order not to break iOS). I wonder if it would be difficult to move this part to gas-preprocessor.pl too and aim for full gas compatibility eventually?

The workaround for adrl is a bit ugly, but probably it can't be helped until the problem is fixed in Apple assembler.

As you are already using git yourself, maybe you could push your patches to some git repository? It just would be a bit more convenient for me.

> - on iOS, the detection of the availability of NEON acceleration is
>   done at compile time

OK, I think that it is very unlikely for any future ARM based Apple product to suddenly have NEON support removed. There is a lot more hardware diversity across linux/android.

> I used the NEON idct successfully on iOS, however the color space
> conversion routines (the extrgb one) did not work correctly. I also
> checked the jpegut output in a QEMU-emulated ARM environment, both
> the original version (SVN r611) and the patched one produced
> incorrect, but almost identical, results. And both versions produced
> varying results across different runs.

NEON support in upstream QEMU is not very mature yet. You may have more luck with http://meego.gitorious.org/qemu-maemo (it should have more NEON fixes)

> Hence, I don't think that
> these problems are due to my changes. A version where I disabled the
> SIMD color space conversion worked correctly (r611 and patched).
>
> I hope that these changes in the following mails are acceptable for
> libjpeg-turbo. I'd be happy if somebody could help out with ideas for
> the color conversion problems. And of course I'm open for suggestions
> for improvement.

It looks like a bug in gas-preprocessor.pl

Just as an experiment, I tried to compare the objdump output for normally compiled jsimd_arm_neon.S (in linux) and the same source compiled after running through the preprocessor. The result is expected to be the same, but this was not the case. One obvious problem is that the preprocessor generates the following code:

vst3.8 {d10[0], d11[0], d12[0]}, [RGB]!
vst3.8 {d10[1], d11[1], d12[1]}, [RGB]!
vst3.8 {d10[2], d11[2], d12[2]}, [RGB]!
vst3.8 {d10[3], d11[3], d12[3]}, [RGB]!
vst4.8 {d10[0], d11[0], d12[0], d13[0]}, [RGB]!
vst4.8 {d10[1], d11[1], d12[1], d13[1]}, [RGB]!
vst4.8 {d10[2], d11[2], d12[2], d13[2]}, [RGB]!
vst4.8 {d10[3], d11[3], d12[3], d13[3]}, [RGB]!

But it makes no sense because vst3.8 and vst4.8 are used in the different if/else branches and should not be used within the same conversion function.

On the positive side, after applying your patches, the code still builds and passes tests in linux for me. So probably just the preprocessor issue needs to be solved. I'll try to have one more look at the patches a bit later.

--
Best regards,
Siarhei Siamashka |
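The remark that vst3.8 and vst4.8 belong to different if/else branches can be illustrated with a plain C analogue of the store step. This is only a sketch under stated assumptions: `sketch_store_pixels` and its parameters are hypothetical names, the four input planes stand in for registers d10 to d13, and the code mirrors the intent of the 24 bpp versus 32 bpp selection rather than the actual NEON implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C analogue of the store step: interleave `count` pixels
 * from separate component planes into `out` and return the number of
 * bytes written. bpp == 24 corresponds to the vst3.8 branch (three
 * components per pixel), bpp == 32 to the vst4.8 branch (four
 * components). A single call uses exactly one of the two layouts. */
size_t sketch_store_pixels(int bpp, size_t count,
                           const uint8_t *c0, const uint8_t *c1,
                           const uint8_t *c2, const uint8_t *c3,
                           uint8_t *out)
{
    size_t i, n = 0;

    assert(bpp == 24 || bpp == 32);     /* other sizes are unsupported */
    assert(bpp == 24 || c3 != NULL);    /* fourth plane needed for 32 bpp */

    for (i = 0; i < count; i++) {
        out[n++] = c0[i];
        out[n++] = c1[i];
        out[n++] = c2[i];
        if (bpp == 32)
            out[n++] = c3[i];
    }
    return n;
}
```

Because the layout is fixed per call, the output for four pixels is either 12 bytes or 16 bytes, never a mix. Seeing both the three-component and the four-component stores emitted into one conversion function, as in the objdump output quoted above, therefore points at the conditional not being evaluated correctly by the preprocessor.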
From: DRC <dco...@us...> - 2011-05-18 01:34:57
|
This list is for end user issues. Please do not submit patches here. I would prefer that they be submitted via the Patch tracker.

On 5/17/11 12:51 PM, Martin Aumüller wrote:
> ---
>  simd/jsimd_arm.c | 10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/simd/jsimd_arm.c b/simd/jsimd_arm.c
> index b70b94e..725b546 100644
> --- a/simd/jsimd_arm.c
> +++ b/simd/jsimd_arm.c
> @@ -100,7 +100,9 @@ LOCAL(void)
>  init_simd (void)
>  {
>    char *env = NULL;
> +#ifdef __linux__
>    int bufsize = 1024; /* an initial guess for the line buffer size limit */
> +#endif
>
>    if (simd_support != ~0)
>      return;
> @@ -115,6 +117,14 @@ init_simd (void)
>    }
>  #endif
>
> +#ifdef __APPLE__
> +#ifdef __arm__
> +#ifdef __ARM_NEON__
> +  simd_support |= JSIMD_ARM_NEON;
> +#endif
> +#endif
> +#endif
> +
>    /* Force different settings through environment variables */
>    env = getenv("JSIMD_FORCE_ARM_NEON");
>    if ((env != NULL) && (strcmp(env, "1") == 0)) |
From: Martin A. <au...@re...> - 2011-05-17 17:52:00
|
---
 simd/jsimd_arm.c | 10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/simd/jsimd_arm.c b/simd/jsimd_arm.c
index b70b94e..725b546 100644
--- a/simd/jsimd_arm.c
+++ b/simd/jsimd_arm.c
@@ -100,7 +100,9 @@ LOCAL(void)
 init_simd (void)
 {
   char *env = NULL;
+#ifdef __linux__
   int bufsize = 1024; /* an initial guess for the line buffer size limit */
+#endif

   if (simd_support != ~0)
     return;
@@ -115,6 +117,14 @@ init_simd (void)
   }
 #endif

+#ifdef __APPLE__
+#ifdef __arm__
+#ifdef __ARM_NEON__
+  simd_support |= JSIMD_ARM_NEON;
+#endif
+#endif
+#endif
+
   /* Force different settings through environment variables */
   env = getenv("JSIMD_FORCE_ARM_NEON");
   if ((env != NULL) && (strcmp(env, "1") == 0))
--
1.7.5.1 |
From: Martin A. <au...@re...> - 2011-05-17 17:52:00
|
gas-preprocessor tries to work around Apple's iOS gas's limitations
---
 acinclude.m4 | 25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index f6355bf..e5bf183 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -152,6 +152,31 @@ AC_DEFUN([AC_CHECK_COMPATIBLE_ARM_ASSEMBLER_IFELSE],[
     pld [r0]
     vmovn.u16 d0, q0]], ac_good_gnu_arm_assembler=yes)
   CFLAGS="$ac_save_CFLAGS"
+
+  ac_use_gas_preprocessor=no
+  ac_save_CC="$CC"
+  if test "x$ac_good_gnu_arm_assembler" = "xno" ; then
+    CC="gas-preprocessor.pl $CC"
+    CFLAGS="-x assembler-with-cpp $CFLAGS"
+    AC_COMPILE_IFELSE([[
+    .text
+    .fpu neon
+    .arch armv7a
+    .object_arch armv4
+    .arm
+    .altmacro
+    pld [r0]
+    vmovn.u16 d0, q0]], ac_use_gas_preprocessor=yes)
+  fi
+  CFLAGS="$ac_save_CFLAGS"
+  CC="$ac_save_CC"
+
+  if test "x$ac_use_gas_preprocessor" = "xyes" ; then
+    CCAS="gas-preprocessor.pl $CC"
+    AC_SUBST([CCAS])
+    ac_good_gnu_arm_assembler=yes
+  fi
+
   if test "x$ac_good_gnu_arm_assembler" = "xyes" ; then
     $1
   else
--
1.7.5.1 |
From: Martin A. <au...@re...> - 2011-05-17 17:52:00
|
In a recent project, we use libjpeg-turbo for remote rendering on iPads. So we are really happy to see ARM NEON support in trunk, especially for the idct. Thank you for this!

Unfortunately, building with Apple's ancient gas 1.38 requires some changes and work-arounds:

- macros are more limited
  This is mostly worked around by using gas-preprocessor.pl [1], a perl script mainly developed for compiling ffmpeg for iOS. However, it requires that macro parameters be prefixed with a \. It required some additional changes, the version which works with libjpeg-turbo is available on github [2]. I also asked for the changes to be pulled into the regular gas-preprocessor.
  If configure detects that the ARM assembler is not suitable, it tries again with gas-preprocessor.pl. It has to be in the PATH for this to work.

- the adrl pseudo instruction crashes 'as'
  I reported this bug to Apple and hope that it gets fixed soon [3]. In the meantime, I propose to use adr and several copies of the same constants instead. I also tried to use 'ldr ip, =consts' instead of 'adrl ip, consts', but then ld failed with 'illegal text reloc'.

- symbol names need to be prefixed with _

- on iOS, the detection of the availability of NEON acceleration is done at compile time

I used the NEON idct successfully on iOS, however the color space conversion routines (the extrgb one) did not work correctly. I also checked the jpegut output in a QEMU-emulated ARM environment: both the original version (SVN r611) and the patched one produced incorrect, but almost identical, results. And both versions produced varying results across different runs. Hence, I don't think that these problems are due to my changes. A version where I disabled the SIMD color space conversion worked correctly (r611 and patched).

I hope that these changes in the following mails are acceptable for libjpeg-turbo. I'd be happy if somebody could help out with ideas for the color conversion problems. And of course I'm open for suggestions for improvement.

Best regards,
Martin

[1] http://github.com/yuvi/gas-preprocessor
[2] http://github.com/aumuell/gas-preprocessor
[3] http://openradar.appspot.com/radar?id=1197405 |
From: Martin A. <au...@re...> - 2011-05-17 17:51:59
|
a leading \ is required for all macro parameters to be replaced
---
 simd/jsimd_arm_neon.S | 54 ++++++++++++++++++++++++------------------------
 1 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/simd/jsimd_arm_neon.S b/simd/jsimd_arm_neon.S
index a8daaeb..73edd30 100644
--- a/simd/jsimd_arm_neon.S
+++ b/simd/jsimd_arm_neon.S
@@ -43,21 +43,21 @@

 /* Supplementary macro for setting function attributes */
 .macro asm_function fname
-    .func fname
-    .global fname
+    .func \fname
+    .global \fname
 #ifdef __ELF__
-    .hidden fname
-    .type fname, %function
+    .hidden \fname
+    .type \fname, %function
 #endif
-fname:
+\fname:
 .endm

 /* Transpose a block of 4x4 coefficients in four 64-bit registers */
 .macro transpose_4x4 x0, x1, x2, x3
-    vtrn.16 x0, x1
-    vtrn.16 x2, x3
-    vtrn.32 x0, x2
-    vtrn.32 x1, x3
+    vtrn.16 \x0, \x1
+    vtrn.16 \x2, \x3
+    vtrn.32 \x0, \x2
+    vtrn.32 \x1, \x3
 .endm

 /*****************************************************************************/
@@ -266,14 +266,14 @@ jsimd_ycc_rgb_neon_consts:
     .short -128, -128, -128, -128

 .macro do_load size
-    .if size == 8
+    .if \size == 8
         vld1.8 {d4}, [U]!
         vld1.8 {d5}, [V]!
         vld1.8 {d0}, [Y]!
         pld [Y, #64]
         pld [U, #64]
         pld [V, #64]
-    .elseif size == 4
+    .elseif \size == 4
         vld1.8 {d4[0]}, [U]!
         vld1.8 {d4[1]}, [U]!
         vld1.8 {d4[2]}, [U]!
@@ -286,14 +286,14 @@ jsimd_ycc_rgb_neon_consts:
         vld1.8 {d0[1]}, [Y]!
         vld1.8 {d0[2]}, [Y]!
         vld1.8 {d0[3]}, [Y]!
-    .elseif size == 2
+    .elseif \size == 2
         vld1.8 {d4[4]}, [U]!
         vld1.8 {d4[5]}, [U]!
         vld1.8 {d5[4]}, [V]!
         vld1.8 {d5[5]}, [V]!
         vld1.8 {d0[4]}, [Y]!
         vld1.8 {d0[5]}, [Y]!
-    .elseif size == 1
+    .elseif \size == 1
         vld1.8 {d4[6]}, [U]!
         vld1.8 {d5[6]}, [V]!
         vld1.8 {d0[6]}, [Y]!
@@ -303,34 +303,34 @@ jsimd_ycc_rgb_neon_consts:
 .endm

 .macro do_store bpp, size
-    .if bpp == 24
-        .if size == 8
+    .if \bpp == 24
+        .if \size == 8
             vst3.8 {d10, d11, d12}, [RGB]!
-        .elseif size == 4
+        .elseif \size == 4
             vst3.8 {d10[0], d11[0], d12[0]}, [RGB]!
             vst3.8 {d10[1], d11[1], d12[1]}, [RGB]!
             vst3.8 {d10[2], d11[2], d12[2]}, [RGB]!
             vst3.8 {d10[3], d11[3], d12[3]}, [RGB]!
-        .elseif size == 2
+        .elseif \size == 2
             vst3.8 {d10[4], d11[4], d12[4]}, [RGB]!
             vst3.8 {d10[5], d11[5], d12[5]}, [RGB]!
-        .elseif size == 1
+        .elseif \size == 1
             vst3.8 {d10[6], d11[6], d12[6]}, [RGB]!
         .else
             .error unsupported macroblock size
         .endif
-    .elseif bpp == 32
-        .if size == 8
+    .elseif \bpp == 32
+        .if \size == 8
             vst4.8 {d10, d11, d12, d13}, [RGB]!
-        .elseif size == 4
+        .elseif \size == 4
             vst4.8 {d10[0], d11[0], d12[0], d13[0]}, [RGB]!
             vst4.8 {d10[1], d11[1], d12[1], d13[1]}, [RGB]!
             vst4.8 {d10[2], d11[2], d12[2], d13[2]}, [RGB]!
             vst4.8 {d10[3], d11[3], d12[3], d13[3]}, [RGB]!
-        .elseif size == 2
+        .elseif \size == 2
             vst4.8 {d10[4], d11[4], d12[4], d13[4]}, [RGB]!
             vst4.8 {d10[5], d11[5], d12[5], d13[5]}, [RGB]!
-        .elseif size == 1
+        .elseif \size == 1
             vst4.8 {d10[6], d11[6], d12[6], d13[6]}, [RGB]!
         .else
             .error unsupported macroblock size
@@ -420,7 +420,7 @@ asm_function EXTN(jsimd_ycc_&colorid&_convert_neon)
 1:
     do_load 8
     do_yuv_to_rgb
-    do_store bpp, 8
+    do_store \bpp, 8
     subs N, N, #8
     bge 1b
     tst N, #7
@@ -441,15 +441,15 @@ asm_function EXTN(jsimd_ycc_&colorid&_convert_neon)
     do_yuv_to_rgb
     tst N, #4
     beq 6f
-    do_store bpp, 4
+    do_store \bpp, 4
 6:
     tst N, #2
     beq 7f
-    do_store bpp, 2
+    do_store \bpp, 2
 7:
     tst N, #1
     beq 8f
-    do_store bpp, 1
+    do_store \bpp, 1
 8:
     subs NUM_ROWS, NUM_ROWS, #1
     bgt 0b
--
1.7.5.1 |
From: Martin A. <au...@re...> - 2011-05-17 17:51:59
|
If the name OUTPUT_BUF is re-used, the assembler (as) fails for reasons I do not understand
---
 simd/jsimd_arm_neon.S |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/simd/jsimd_arm_neon.S b/simd/jsimd_arm_neon.S
index 73edd30..17eb815 100644
--- a/simd/jsimd_arm_neon.S
+++ b/simd/jsimd_arm_neon.S
@@ -371,7 +371,7 @@ asm_function EXTN(jsimd_ycc_&colorid&_convert_neon)
     OUTPUT_WIDTH .req r0
     INPUT_BUF .req r1
     INPUT_ROW .req r2
-    OUTPUT_BUF .req r3
+    OUTPUT_BUF0 .req r3
     NUM_ROWS .req r4

     INPUT_BUF0 .req r5
@@ -412,7 +412,7 @@ asm_function EXTN(jsimd_ycc_&colorid&_convert_neon)
     mov N, OUTPUT_WIDTH
     ldr V, [INPUT_BUF2, INPUT_ROW, lsl #2]
     add INPUT_ROW, INPUT_ROW, #1
-    ldr RGB, [OUTPUT_BUF], #4
+    ldr RGB, [OUTPUT_BUF0], #4

     /* Inner loop over pixels */
     subs N, N, #8
@@ -460,7 +460,7 @@ asm_function EXTN(jsimd_ycc_&colorid&_convert_neon)

     .unreq OUTPUT_WIDTH
     .unreq INPUT_ROW
-    .unreq OUTPUT_BUF
+    .unreq OUTPUT_BUF0
     .unreq NUM_ROWS
     .unreq INPUT_BUF0
     .unreq INPUT_BUF1
--
1.7.5.1 |
From: Martin A. <au...@re...> - 2011-05-17 17:51:59
|
load constants' address via adr, but as this can only cover shorter address ranges, emit a copy of the constants for each function
---
 simd/jsimd_arm_neon.S |   31 +++++++++++++++++++++++++++++--
 1 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/simd/jsimd_arm_neon.S b/simd/jsimd_arm_neon.S
index 17eb815..ecbd265 100644
--- a/simd/jsimd_arm_neon.S
+++ b/simd/jsimd_arm_neon.S
@@ -258,12 +258,33 @@ asm_function EXTN(jsimd_idct_ifast_neon)
  * Colorspace conversion YCbCr -> RGB
  */

+
+/* Apple gas crashes on adrl, work around that by using adr,
+ * emit a copy of the constants for each function as adr requires shorter address ranges.
+ * For non-limited gas, just emit one copy and use adrl.
+ */
+
+.macro emit_ycc_consts colorid
 .balign 16
-jsimd_ycc_rgb_neon_consts:
+jsimd_ycc_&colorid&_neon_consts:
     .short 0, 0, 0, 0
     .short 22971, -11277, -23401, 29033
     .short -128, -128, -128, -128
     .short -128, -128, -128, -128
+.endm
+
+#ifndef __APPLE__
+emit_ycc_consts rgb
+#endif
+
+.macro load_ycc_consts_address colorid
+#ifdef __APPLE__
+    adr ip, jsimd_ycc_&colorid&_neon_consts
+#else
+    adrl ip, jsimd_ycc_rgb_neon_consts
+#endif
+.endm
+

 .macro do_load size
     .if \size == 8
@@ -367,6 +388,10 @@ jsimd_ycc_rgb_neon_consts:
     vqmovun.s16 d1&b_offs, q14
 .endm

+#ifdef __APPLE__
+emit_ycc_consts \colorid
+#endif
+
 asm_function EXTN(jsimd_ycc_&colorid&_convert_neon)
     OUTPUT_WIDTH .req r0
     INPUT_BUF .req r1
@@ -385,7 +410,7 @@ asm_function EXTN(jsimd_ycc_&colorid&_convert_neon)
     N .req ip

     /* Load constants to d1, d2, d3 (d0 is just used for padding) */
-    adrl ip, jsimd_ycc_rgb_neon_consts
+    load_ycc_consts_address \colorid
     vld1.16 {d0, d1, d2, d3}, [ip, :128]

     /* Save ARM registers and handle input arguments */
@@ -484,6 +509,8 @@ generate_jsimd_ycc_rgb_convert_neon extbgrx, 32, 2, 1, 0
 generate_jsimd_ycc_rgb_convert_neon extxbgr, 32, 3, 2, 1
 generate_jsimd_ycc_rgb_convert_neon extxrgb, 32, 1, 2, 3

+.purgem emit_ycc_consts
+.purgem load_ycc_consts_address
 .purgem do_load
 .purgem do_store

--
1.7.5.1 |
From: Martin A. <au...@re...> - 2011-05-17 17:51:59
|
---
 simd/jsimd_arm_neon.S |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/simd/jsimd_arm_neon.S b/simd/jsimd_arm_neon.S
index 2d66ab2..a8daaeb 100644
--- a/simd/jsimd_arm_neon.S
+++ b/simd/jsimd_arm_neon.S
@@ -22,6 +22,12 @@
  * 3. This notice may not be removed or altered from any source distribution.
  */

+#ifdef __APPLE__
+#define EXTN(f) _ ## f
+#else
+#define EXTN(f) f
+#endif
+
 #if defined(__linux__) && defined(__ELF__)
 .section .note.GNU-stack,"",%progbits /* mark stack as non-executable */
 #endif
@@ -133,7 +139,7 @@ jsimd_idct_ifast_neon_consts:
     vadd.s16 \x4, \x4, \t10
 .endm

-asm_function jsimd_idct_ifast_neon
+asm_function EXTN(jsimd_idct_ifast_neon)

     DCT_TABLE .req r0
     COEF_BLOCK .req r1
@@ -361,7 +367,7 @@ jsimd_ycc_rgb_neon_consts:
     vqmovun.s16 d1&b_offs, q14
 .endm

-asm_function jsimd_ycc_&colorid&_convert_neon
+asm_function EXTN(jsimd_ycc_&colorid&_convert_neon)
     OUTPUT_WIDTH .req r0
     INPUT_BUF .req r1
     INPUT_ROW .req r2
--
1.7.5.1 |
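The EXTN() macro in the patch above relies on the C preprocessor's ## token-pasting operator to put an underscore in front of the symbol name when __APPLE__ is defined. The effect can be checked in isolation with a few lines of C. This is only a sketch: EXTN_APPLE, EXTN_OTHER, STR, XSTR and the two helper functions are hypothetical names introduced here so that both branches of the #ifdef can be inspected on any host; none of them appear in the patch.

```c
#include <stdio.h>

/* The two bodies of EXTN() from the patch, given separate (hypothetical)
 * names so both can be expanded regardless of the host platform. */
#define EXTN_APPLE(f) _ ## f   /* body used when __APPLE__ is defined */
#define EXTN_OTHER(f) f        /* body used otherwise */

/* Two-level stringification: XSTR expands its argument, STR quotes it. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Symbol name produced by the __APPLE__ branch. */
static const char *apple_symbol_name(void)
{
    return XSTR(EXTN_APPLE(jsimd_idct_ifast_neon));
}

/* Symbol name produced by the non-Apple branch. */
static const char *other_symbol_name(void)
{
    return XSTR(EXTN_OTHER(jsimd_idct_ifast_neon));
}

/* Prints both names, one per line. */
static void print_symbol_names(void)
{
    printf("%s\n%s\n", apple_symbol_name(), other_symbol_name());
}
```

Because jsimd_arm_neon.S uses #ifdef, it is run through the C preprocessor before assembly, which is why a C macro can be used in the asm_function lines.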
From: Vladimir P. <the...@gm...> - 2011-05-02 00:48:13
|
Hello DRC, Glad to know the problem had an easy fix. Thank you for your work on this library. -- Best regards, Vladimir mailto:the...@gm... |
From: DRC <dco...@us...> - 2011-05-02 00:42:26
|
Ah. OK. Your initial statement that "cjpeg (and possibly other libjpeg-turbo components) is broken on Windows" made me believe that you were claiming the library didn't work at all, which I knew was not the case. I was not in front of a computer where I could try it for myself (nor do I generally do so if the initial message suggests that user error is the cause of the problem.) I have added the USE_SETMODE definition to the builds of cjpeg and djpeg on Windows, which should fix this issue. |
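The corruption discussed in this thread comes from the C runtime on Windows treating stdout as a text stream, so every LF byte (0x0A) in the JPEG data is written as CR LF (0x0D 0x0A). The sketch below simulates that translation on a small buffer and shows the usual Microsoft C runtime call for switching a stream to binary mode. It is a minimal illustration, not the cjpeg source: USE_SETMODE is the only name taken from the thread, how the utilities act on it is not shown here, and text_mode_translate and set_stdout_binary are hypothetical helper names.

```c
#include <stddef.h>
#include <stdio.h>
#if defined(_WIN32)
#include <fcntl.h>
#include <io.h>
#endif

/* Simulates the output translation of a Windows text-mode stream: each
 * LF (0x0A) byte is emitted as CR LF (0x0D 0x0A). dst must have room for
 * 2 * n bytes. Returns the number of bytes written to dst. */
static size_t text_mode_translate(const unsigned char *src, size_t n,
                                  unsigned char *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] == 0x0A)
            dst[out++] = 0x0D;   /* inserted byte that corrupts binary data */
        dst[out++] = src[i];
    }
    return out;
}

/* Switches stdout to binary mode where the text/binary distinction
 * exists; does nothing on other platforms. Returns 0 on success and
 * -1 on failure. */
static int set_stdout_binary(void)
{
#if defined(_WIN32)
    return _setmode(_fileno(stdout), _O_BINARY) == -1 ? -1 : 0;
#else
    return 0;
#endif
}
```

With a four-byte input containing two 0x0A bytes, the simulated text-mode output is six bytes long, which is enough to shift every later byte of a JPEG stream and break decoding. Writing to a file opened in binary mode, as the -outfile parameter mentioned in the thread presumably does, avoids the translation altogether.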
From: Vladimir P. <the...@gm...> - 2011-05-01 10:47:12
|
Hello DRC,

I found the problem: it's caused by redirecting standard output to a file. Newline characters (0x0A) are converted to Windows newlines (0x0D 0x0A).

Although it's partially my own fault for not noticing the -outfile parameter, I should note two things:

1) Certain other Windows command-line applications have no trouble writing correct binary output (redirected by the user to a file) to stdout.

2) cjpeg from libjpeg-8c does not support redirection, possibly for the same reason, and requires a mandatory "outputfile" parameter.

--
Best regards, Vladimir mailto:the...@gm... |
From: Vladimir P. <the...@gm...> - 2011-05-01 10:32:19
|
Hello DRC,

I wondered whether I should mention that I converted the PNG to a BMP first, but I thought it was obvious from two things:

1) cjpeg accepted the input file instead of printing an error message, as it does with a PNG image.

2) Discolored fragments of the input image are visible in the corrupted output.

But to clear that up: yes, the Windows version of cjpeg produces corrupted output with BMP, TGA and PPM input. No command-line option I tried affects this behavior. The behavior has existed since the earliest libjpeg-turbo version that comes with its own cjpeg. This includes 1.1.0 and the SVN HEAD as of yesterday. Please try it yourself.

I should mention that the library itself works fine.

--
Best regards, Vladimir mailto:the...@gm... |
From: DRC <dco...@us...> - 2011-05-01 05:06:45
|
cjpeg does not support png input. I would be very surprised if there were a systemic problem, as no one has reported anything prior to now, not to mention the fact that my other projects, VirtualGL and TurboVNC, use libjpeg-turbo, and the Windows versions of those packages work fine. If you are still experiencing image corruption when compressing a supported image format, such as BMP or PPM, then provide the source image, the exact command line used, and the version of libjpeg-turbo used. |
From: Vladimir P. <the...@gm...> - 2011-05-01 01:04:11
|
Hello Libjpeg-turbo-users, As far as I can tell, cjpeg (and possibly other libjpeg-turbo components) is broken on Windows, and produces corrupted JPEG images. For example, converting Lenna (http://en.wikipedia.org/wiki/File:Lenna.png) produces the following output: http://i.imgur.com/vynUi.jpg This doesn't seem to depend on the image, compiler, CPU, target architecture, or Windows version. -- Best regards, Vladimir mailto:the...@gm... |
From: DRC <dco...@us...> - 2011-02-27 02:36:10
|
http://sourceforge.net/projects/libjpeg-turbo/files/1.1.0/

Significant changes since 1.0.90 (1.1 beta1)
============================================

[1] The algorithm used by the SIMD quantization function cannot produce correct results when the JPEG quality is >= 98 and the fast integer forward DCT is used. Thus, the non-SIMD quantization function is now used for those cases, and libjpeg-turbo should now produce identical output to libjpeg v6b in all cases.

[2] Despite the above, the fast integer forward DCT still degrades somewhat for JPEG qualities greater than 95, so TurboJPEG/OSS will now automatically use the slow integer forward DCT when generating JPEG images of quality 96 or greater. This reduces compression performance by as much as 15% for these high-quality images but is necessary to ensure that the images are perceptually lossless. It also ensures that the library can avoid the performance pitfall created by [1].

[3] Ported jpgtest.cxx to pure C to avoid the need for a C++ compiler.

[4] Fixed visual artifacts in grayscale JPEG compression caused by a typo in the RGB-to-chrominance lookup tables.

[5] The Windows distribution packages now include the libjpeg run-time programs (cjpeg, etc.)

[6] All packages now include jpgtest.

[7] The TurboJPEG dynamic library now uses versioned symbols.

[8] Added two new TurboJPEG API functions, tjEncodeYUV() and tjDecompressToYUV(), to replace the somewhat hackish TJ_YUV flag. |
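Items [1] and [2] above describe two quality thresholds that interact: 96 for switching TurboJPEG/OSS to the slow integer forward DCT, and 98 for bypassing SIMD quantization when the fast integer forward DCT is in use. The following C sketch restates that decision logic purely from the wording of the release notes. It is not the library's code: the enum, its members and both function names are hypothetical, and expressing the checks directly in terms of the quality value is an assumption made for illustration.

```c
#include <stdbool.h>

/* Hypothetical labels for the two integer forward DCT variants. */
typedef enum { DCT_INT_FAST, DCT_INT_SLOW } dct_method;

/* Item [2]: TurboJPEG/OSS uses the slow integer forward DCT for
 * quality 96 or greater, and the fast one below that. */
static dct_method choose_forward_dct(int quality)
{
    return quality >= 96 ? DCT_INT_SLOW : DCT_INT_FAST;
}

/* Item [1]: the SIMD quantization function is not used when the quality
 * is >= 98 and the fast integer forward DCT is selected. */
static bool can_use_simd_quantization(int quality, dct_method dct)
{
    return !(quality >= 98 && dct == DCT_INT_FAST);
}
```

Read together, the two rules explain the last sentence of item [2]: a caller that goes through choose_forward_dct() never reaches quality 98 with the fast DCT, so the non-SIMD quantization fallback from item [1] is not triggered on that path.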
From: DRC <dco...@us...> - 2011-01-06 01:53:25
|
https://sourceforge.net/projects/libjpeg-turbo/files/1.0.90%20%281.1beta1%29/

Significant changes since 1.0.1
===============================

[1] Added emulation of the libjpeg v7 and v8b APIs and ABIs. See README-turbo.txt for more details. This feature was sponsored by CamTrace SAS.

[2] Created a new CMake-based build system for the Visual C++ and MinGW builds.

[3] TurboJPEG/OSS can now compress from/decompress to grayscale bitmaps.

[4] jpgtest can now be used to test decompression performance with existing JPEG images.

[5] If the default install prefix (/opt/libjpeg-turbo) is used, then 'make install' now creates /opt/libjpeg-turbo/lib32 and /opt/libjpeg-turbo/lib64 sym links to duplicate the behavior of the binary packages.

[6] All symbols in the libjpeg-turbo dynamic library are now versioned, even when the library is built with libjpeg v6b emulation.

[7] Added arithmetic encoding and decoding support (can be disabled with configure or CMake options)

[8] Added a TJ_YUV flag to TurboJPEG/OSS which causes both the compressor and decompressor to output planar YUV images.

[9] Added an extended version of tjDecompressHeader() to TurboJPEG/OSS which allows the caller to determine the type of subsampling used in a JPEG image.

[10] Added further protections against invalid Huffman codes. |
From: Henrik A. <hen...@ce...> - 2010-12-07 08:37:48
|
Hi,

I have been working on extending the build system to GCC platforms, and here is the initial patch.

I have also verified the CMake build in a cross-compiled environment for the win32 and win64 targets.

I'm unsure whether this breaks Visual Studio support (untested).

It lacks a CPack configuration, but adding one might be the next step, to support creating source tarballs and RPM, DEB and NSIS packages.

--
Henrik Andersson hen...@ce... System Developer +46 (0)13-290860 Cendio AB http://www.cendio.com/ |