Hi,
jsimd_arm_neon.S and jsimd_arm_neon_64.S use .func / .endfunc around functions. This only has an effect with STABS debug info (-gstabs), but almost everyone builds with dwarf debug info (and the only effect is to emit debug entries). Furthermore, clang's integrated assembler doesn't understand .func / .endfunc (and they don't want to add STABS support, as it's not really used anymore): http://llvm.org/bugs/show_bug.cgi?id=20424
Would it be possible to just remove the .func / .endfunc lines from libjpeg_turbo? Then it would compile with clang's built-in assembler and I don't think anyone would miss those lines.
Thanks,
Nico
ps: You simplify the ifdef apple block in these two files by using USER_LABEL_PREFIX -- that's a symbol that evaluates to "_" on systems that use underscores and to nothing elsewhere. Example:
#define GLUE2(a, b) a ## b
#define GLUE(a, b) GLUE2(a, b)
#define SYMBOL_NAME(name) GLUE(__USER_LABEL_PREFIX__, name)
.macro asm_function fname
SYMBOL_NAME(\fname)
...
Whow. That's supposed to be
#define GLUEetc. But the .func / .endfunc bit is the bit I'm really interested in.Forwarded to the NEON developers, who can hopefully answer the question. I've never had trouble with building the ARMv7 code for iOS using clang, but we use gas-preprocessor.pl, which I believe filters out or renames the .func/.endfunc directives.
The asm_function macro serves other purposes (note the #ifdef __ELF__), so it wouldn't really simplify it to do as you suggest-- it would just make it more difficult to read.
Last edit: DRC 2014-08-18
Also, FYI-- SourceForge's tracker editor is stupid, and it will turn a line into
large characters
if the line is prefixed by a # sign. To get around this, indent lines with tab in order to
Hi Nicolas,
It would be easier for everyone if you could submit a patch with your
proposed changes and also provide the compilation logs before and after
it is applied (to confirm that there was a problem and that the patch helps).
I don't have any Apple device, so can't comment much. But is it really
the only required change? I used to think that gas-preprocessor.pl
actually takes care of a lot more problems.
This is the patch I applied in chromium: https://codereview.chromium.org/481243002/diff/1/simd/jsimd_arm_neon.S
It builds fine on android and linux, with gcc and with clang's integrated assembler. (We're using a clang close to clang trunk, not the one bundled with Xcode.) This is the only change I had to make to get libjpeg_turbo compile with clang's integrated assembler. (Again, targeting android and linux, not ios.)
(The libjpeg_turbo website said that filing tickets is the way to go. If there's better documentation somewhere how to contribute patches, I'd appreciate a pointer to it, and apologize for missing it.)
Thanks for clarifying that you actually use clang in android and linux.
I have just tried to compile jsimd_arm_neon.S with clang 3.4 (the latest release at the moment) and it gives me the following error messages:
... and a lot more errors, but I skipped the rest of the log.
Removing .func/.endfunc eliminates only the .func/.endfunc complaints, but the rest of the problems are still there. So could you please clarify it again? How exactly you are compiling libjpeg-turbo in arm linux using clang? What is your clang version? Any other important details?
BTW, you are right about the .func/.endfunc directives redundancy.
The GCC compiler does not emit them (anymore?) in the generated
assembly code if compiling something with '-S' option. So we indeed
should not do it either.
Use the 'Add attachments' link on this page and attach the patch here.
Upon further reflection, I actually ran into that same problem with iOS builds as well (the "invalid operand for instruction" errors.) I had to modify the build instructions so that -no-integrated-as is passed to clang to work around the issue. This supposedly invokes the "system assembler", but I'm not sure what that means, since I'm cross-compiling on an x86 system.
clang 3.5 will be released soon, maybe it'll be able to compile libjpeg-turbo with this patch (but even it might not have enough arm asm support -- some of my changes to clang's arm assembler landed just last week, and there's more work to be done as far as I can tell. Clang trunk can compile libjpeg-turbo with the attached patch applied.)
Thanks, the patch looks good to me (DRC can probably already apply it
safely). And both patched/unpatched jsimd_arm_neon.S produce binary
identical object files when compiled with the GNU assembler.
It is good to know that the integrated assembler in clang is getting
improved and we may eventually retire the gas-preprocessor.pl (which
was really a stop gap solution). It is also very good that Apple
decided not to invent some sort of a "new" macro preprocessor, but
apparently intends to ensure compatibility with the GNU assembler,
which is a de-facto standard for the ARM assembly optimizations.
And as the last request, could you please confirm that "make test"
check passes if libjpeg-turbo is compiled with clang 3.5?
Siarhei, were you ever able to build this? I'm still getting the same errors that you were above. Is clang 3.5 required? Can you verify whether the same thing needs to be done in the ARM64 code?
I only checked that this change does not cause regressions for the
GNU assembler.
My understanding is that Nicolas is one of the clang developers or
contributors. And they are working on improving the integrated
assembler in the future clang 3.5 release. We can only welcome
this news, but can't easily verify the improvements yet.
As for the ARM64 code, I can't verify it myself, but would guess
that .func/.endfunc directives can be also safely removed.
Yes, I'm a clang contributor. If you have a 64 bit file, doing the same change there would be good.
…oh, I see that https://sourceforge.net/p/libjpeg-turbo/code/1375 did the change to both files. Thanks much! :-)
You are welcome :-)
Still, as I mentioned earlier, you are strongly encouraged to make sure
that "make test" check does not detect any problems when libjpeg-turbo
is built with clang 3.5 and its integrated assembler. We are interested not
only in successful compilation, but also in correct execution at runtime.