Please consider accepting the following patch to CpuArch.c in the LZMA SDK, upstreamed from Chromium.
The patch fixes two crashes we've seen in Chromium-derived projects:
(1) Register corruption due to clobbering $eax, $edx in CheckFlag. This clobbering harmless in many builds, but when the function is inlined is very likely to cause a crash, as $eax and $edx may contain information. This solution uses explicit clobber flags to warn the compiler that the values of $eax and $edx may change following the operation.
(2) Register corruption due to use of mov on $ebx/$edi on 64-bit builds: this solution uses $rbx, $rdi instead of $ebx, $edi on 64-bit platforms. (Using 'mov' on $ebx will zero the top 32 bits, but in AMD64, $ebx may contain a 64-bit value, so the existing code does not correctly save and restore the register value in 64-bit builds.)
The .patch file from version 15.14 of the SDK is attached. The effect of the patch is summarized at: https://codereview.chromium.org/1700453002/diff2/160001:180001/third_party/lzma_sdk/CpuArch.c
The history of these issues is detailed at:
https://code.google.com/p/chromium/issues/detail?id=248385
https://code.google.com/p/chromium/issues/detail?id=261676
and the associate changelists.
Thanks for patch.
What do you think about
defined(__PIC__)check?Why do we use
__PIC__check for MY_CPU_X86, but there is no such check for MY_CPU_AMD64?Good call, I think that's an oversight in the patch.
I don't develop with GCC with different systems. So I can't test that code in all cases. And I need exact code instructions and some text description that explains what we are doing.
OK. Looking more at this, I'm curious why we bother doing the register swaps at all rather than just declaring that the inline assembly clobbers the corresponding register contents, and leaving it up to the compiler to reason about the contents of those registers.
Anyways, even if that's a bad idea for some reason, I agree the PIC check should be added. Given that, I propose: I'll make the corresponding changes in Chromium and let it bake there. Chromium is built for (and deployed to) many different systems and architectures, and therefore it's likely that if any stability regression is introduced by these changes we'll detect it. If everything looks good, I'll send you another patchset after M50 hits the stable channel (~April 20th) and we don't see any rise in crashes.
Oh, one important question here is whether lzma_sdk needs to support GCC 4.9 or whether it's OK to require 5.0. If requiring 5.0 is not OK, we can't quite simplify the MyCPUID function as described above.
Last edit: Joshua Pawlicki 2016-02-19
I want to support any GCC version.
What problem with GCC before 5.0?
GCC 4.9 uses the
bxregister to track the global offset table in position-independent-code, and makes it a compile-time error to include any assembly code that clobbersbxor usesbxas an output register. This is why theif defined(MY_CPU_X86) && defined(__PIC__)is necessary: the code must usedias the output register instead, and manually restorebxaftercpuid.GCC 5.0 relaxes this restriction, so that it is legal to specify
bxas an output register even if compiling position-independent-code. If lzma_sdk already required the use of GCC 5.0, we could resolve the bug simply by dropping the entireif defined(MY_CPU_X86) && defined(__PIC__)block. But, if the goal is also to support 4.9, we need that block, and need to add theif defined(MY_CPU_AMD64) && defined(__PIC__)block.Thanks for description
Do you think that there is difference for
__PIC__and non__PIC__modes for amd64 in gcc?And maybe someone knows how "cpuid" is called in another open source projects?
In non
__PIC__mode (regardless of architecture), it's not a problem for the inline-assembly to use=bas output, so thexchg/movisn't necessary.This general approach is how GCC does it itself: for example looking at my /usr/lib/gcc/x86_64-linux-gnu/4.8.4/include/cpuid.h, I see:
The attached fix has been tested in QA and in the wild for several months on [Windows, OS X, Android, iOS, FreeBSD, OpenBSD, and Linux] on [a large number] of [x86, x86_64, and ARM] machines running Google Chrome or other Chromium-derived browsers. So far haven't seen any regression in crashes.
Thanks for your time!
I suppose CpuArch.c in 16.02 version is similar to that code.
So it must work same way.
Ah, yep, looks great in 16.02. Thanks!