Menu

#285 Upstream fix for register corruption in CpuArch.c

open
nobody
None
5
2016-06-07
2016-02-16
No

Please consider accepting the following patch to CpuArch.c in the LZMA SDK, upstreamed from Chromium.

The patch fixes two crashes we've seen in Chromium-derived projects:
(1) Register corruption due to clobbering $eax, $edx in CheckFlag. This clobbering harmless in many builds, but when the function is inlined is very likely to cause a crash, as $eax and $edx may contain information. This solution uses explicit clobber flags to warn the compiler that the values of $eax and $edx may change following the operation.
(2) Register corruption due to use of mov on $ebx/$edi on 64-bit builds: this solution uses $rbx, $rdi instead of $ebx, $edi on 64-bit platforms. (Using 'mov' on $ebx will zero the top 32 bits, but in AMD64, $ebx may contain a 64-bit value, so the existing code does not correctly save and restore the register value in 64-bit builds.)

The .patch file from version 15.14 of the SDK is attached. The effect of the patch is summarized at: https://codereview.chromium.org/1700453002/diff2/160001:180001/third_party/lzma_sdk/CpuArch.c

The history of these issues is detailed at:
https://code.google.com/p/chromium/issues/detail?id=248385
https://code.google.com/p/chromium/issues/detail?id=261676
and the associate changelists.

1 Attachments

Discussion

  • Igor Pavlov

    Igor Pavlov - 2016-02-17

    Thanks for patch.

    What do you think about defined(__PIC__) check?
    Why do we use __PIC__ check for MY_CPU_X86, but there is no such check for MY_CPU_AMD64?

    defined(MY_CPU_X86) && defined(__PIC__)
    
     
  • Joshua Pawlicki

    Joshua Pawlicki - 2016-02-17

    Good call, I think that's an oversight in the patch.

     
  • Igor Pavlov

    Igor Pavlov - 2016-02-18

    I don't develop with GCC with different systems. So I can't test that code in all cases. And I need exact code instructions and some text description that explains what we are doing.

     
  • Joshua Pawlicki

    Joshua Pawlicki - 2016-02-19

    OK. Looking more at this, I'm curious why we bother doing the register swaps at all rather than just declaring that the inline assembly clobbers the corresponding register contents, and leaving it up to the compiler to reason about the contents of those registers.

    Anyways, even if that's a bad idea for some reason, I agree the PIC check should be added. Given that, I propose: I'll make the corresponding changes in Chromium and let it bake there. Chromium is built for (and deployed to) many different systems and architectures, and therefore it's likely that if any stability regression is introduced by these changes we'll detect it. If everything looks good, I'll send you another patchset after M50 hits the stable channel (~April 20th) and we don't see any rise in crashes.

     
  • Joshua Pawlicki

    Joshua Pawlicki - 2016-02-19

    Oh, one important question here is whether lzma_sdk needs to support GCC 4.9 or whether it's OK to require 5.0. If requiring 5.0 is not OK, we can't quite simplify the MyCPUID function as described above.

     

    Last edit: Joshua Pawlicki 2016-02-19
  • Igor Pavlov

    Igor Pavlov - 2016-02-20

    I want to support any GCC version.
    What problem with GCC before 5.0?

     
  • Joshua Pawlicki

    Joshua Pawlicki - 2016-02-23

    GCC 4.9 uses the bx register to track the global offset table in position-independent-code, and makes it a compile-time error to include any assembly code that clobbers bx or uses bx as an output register. This is why the if defined(MY_CPU_X86) && defined(__PIC__) is necessary: the code must use di as the output register instead, and manually restore bx after cpuid.

    GCC 5.0 relaxes this restriction, so that it is legal to specify bx as an output register even if compiling position-independent-code. If lzma_sdk already required the use of GCC 5.0, we could resolve the bug simply by dropping the entire if defined(MY_CPU_X86) && defined(__PIC__) block. But, if the goal is also to support 4.9, we need that block, and need to add the if defined(MY_CPU_AMD64) && defined(__PIC__) block.

     
  • Igor Pavlov

    Igor Pavlov - 2016-02-24

    Thanks for description

    Do you think that there is difference for __PIC__ and non __PIC__ modes for amd64 in gcc?

    And maybe someone knows how "cpuid" is called in another open source projects?

     
  • Joshua Pawlicki

    Joshua Pawlicki - 2016-02-24

    In non __PIC__ mode (regardless of architecture), it's not a problem for the inline-assembly to use =b as output, so the xchg/mov isn't necessary.

    This general approach is how GCC does it itself: for example looking at my /usr/lib/gcc/x86_64-linux-gnu/4.8.4/include/cpuid.h, I see:

    #if defined(__i386__) && defined(__PIC__)
    /* %ebx may be the PIC register.  */
    #if __GNUC__ >= 3
    #define __cpuid(level, a, b, c, d)          \
    __asm__ ("xchg{l}\t{%%}ebx, %k1\n\t"            \
    "cpuid\n\t"                 \
    "xchg{l}\t{%%}ebx, %k1\n\t"         \
    : "=a" (a), "=&r" (b), "=c" (c), "=d" (d)   \
    : "0" (level))
    
    ...[__cpuid_count omitted]...
    
    #else
    /* Host GCCs older than 3.0 weren't supporting Intel asm syntax
    nor alternatives in i386 code.  */
    #define __cpuid(level, a, b, c, d)          \
    __asm__ ("xchgl\t%%ebx, %k1\n\t"            \
    "cpuid\n\t"                 \
    "xchgl\t%%ebx, %k1\n\t"         \
    : "=a" (a), "=&r" (b), "=c" (c), "=d" (d)   \
    : "0" (level))
    
    ...[__cpuid_count omitted]...
    #endif
    #elif defined(__x86_64__) && (defined(__code_model_medium__) || defined(__code_model_large__)) && defined(__PIC__)
    /* %rbx may be the PIC register.  */
    #define __cpuid(level, a, b, c, d)          \
    __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t"            \
    "cpuid\n\t"                 \
    "xchg{q}\t{%%}rbx, %q1\n\t"         \
    : "=a" (a), "=&r" (b), "=c" (c), "=d" (d)   \
    : "0" (level))
    
    ...[__cpuid_count omitted]...
    #else
    #define __cpuid(level, a, b, c, d)          \
    __asm__ ("cpuid\n\t"                    \
    : "=a" (a), "=b" (b), "=c" (c), "=d" (d)    \
    : "0" (level))
    
    ...[__cpuid_count omitted]...
    #endif
    
     
  • Joshua Pawlicki

    Joshua Pawlicki - 2016-06-01

    The attached fix has been tested in QA and in the wild for several months on [Windows, OS X, Android, iOS, FreeBSD, OpenBSD, and Linux] on [a large number] of [x86, x86_64, and ARM] machines running Google Chrome or other Chromium-derived browsers. So far haven't seen any regression in crashes.

    I need exact code instructions
    I've attached the file after modifications.

    and some text description that explains what we are doing.
    The comment in the file should explain it. Let me know if there's any other way I can help.

    Thanks for your time!

     
  • Igor Pavlov

    Igor Pavlov - 2016-06-02

    I suppose CpuArch.c in 16.02 version is similar to that code.
    So it must work same way.

     
  • Joshua Pawlicki

    Joshua Pawlicki - 2016-06-07

    Ah, yep, looks great in 16.02. Thanks!

     

Log in to post a comment.