
#1712 rexx.img varies from ASLR

5.0.0
closed
nobody
None
none
5
2023-01-01
2020-06-30
No

While working on reproducible builds for openSUSE, I found that
when building the ooRexx package, there were differences between each build:
/usr/bin/rexx.img had many such diffs:

-00000000 b8 71 15 00 00 00 00 00 08 cc 2f 1b 91 7f 00 00
+00000000 b8 71 15 00 00 00 00 00 08 bc 4b e7 19 7f 00 00

The differences went away when I disabled Address Space Layout Randomization (ASLR), e.g. via setarch -R

The diffs have obvious pointer addresses spread across the img data, e.g. above you can see at offset 8 0x00007f911b2fcc08 vs
0x00007f19e74bbc08

Here the low 12 bits do not differ because they are the offset within a 4K page, and the topmost bits are not usable in x86_64 addresses (here the top 16 bits are zero). The bits in between are what ASLR randomizes.
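To make the bit-level reading concrete, here is a minimal sketch that decodes the word at offset 8 from the two diff lines above. The helper name `word_at` is made up for illustration, and the code assumes the image stores pointers as little-endian 64-bit words.

```python
# Decode the 8-byte little-endian word at offset 8 of each hexdump line
# and split it into the parts discussed above.

def word_at(hex_line: str, offset: int) -> int:
    """Return the little-endian 64-bit word at `offset` of a hexdump line.

    The first token of the line is the address column and is skipped.
    """
    data = bytes.fromhex("".join(hex_line.split()[1:]))
    return int.from_bytes(data[offset:offset + 8], "little")

build_a = "00000000 b8 71 15 00 00 00 00 00 08 cc 2f 1b 91 7f 00 00"
build_b = "00000000 b8 71 15 00 00 00 00 00 08 bc 4b e7 19 7f 00 00"

ptr_a = word_at(build_a, 8)
ptr_b = word_at(build_b, 8)
changed = ptr_a ^ ptr_b  # bits that differ between the two builds

print(f"{ptr_a:#018x} vs {ptr_b:#018x}")
print(f"page offset (low 12 bits): {ptr_a & 0xFFF:#05x} vs {ptr_b & 0xFFF:#05x}")
print(f"top 16 bits: {ptr_a >> 48:#06x} vs {ptr_b >> 48:#06x}")
print(f"changed bits: {changed:#018x}")
```

The XOR of the two values has no bits set in the low 12 or the top 16 positions, which is the pattern described above.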

Created rexx.img binaries should be deterministic.
See https://reproducible-builds.org/ for why this matters.

The rexx.img file is created by the rexximage program using as input CoreClasses.orx PlatformObjects.orx StreamClasses.orx in a directory.
I tried to understand how the data is generated, but it is spread over many cpp and hpp files: Interpreter.cpp RexxMemory.hpp+cpp interpreter/memory/Setup.cpp

Related

Bugs: #1712

Discussion

  • Rick McGuire

    Rick McGuire - 2021-02-09

    Some fixes for this were committed in [r12153]

     

    Related

    Commit: [r12153]


    Last edit: Erich 2021-02-21
  • Rick McGuire

    Rick McGuire - 2021-02-09

    Hopefully the final fixes in [r12155]

     

    Related

    Commit: [r12155]


    Last edit: Erich 2021-02-21
  • Erich

    Erich - 2021-02-10

    The fixes have zeroed out almost all memory addresses, except for three Array instances whose pointer to their methodDictionary still shows. Below is one example at address 001C0F08.

    This is a 64-bit rexx.img on Windows.

    001C0EF0 60 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
    001C0F00 04 00 00 00 00 00 00 80  C0 F2 5D 13 F8 7F 00 00
    001C0F10 18 54 1A 00 00 00 00 00  00 00 00 00 00 00 00 00
    001C0F20 00 00 00 00 00 00 00 00  48 0F 1C 00 00 00 00 00
    001C0F30 00 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00
    001C0F40 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
    

    There are also some 20 Set instances containing data that looks like a memory address, but I'm not sure. See address 001677F8.

    00167770 A0 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
    00167780 32 00 00 00 00 00 00 80  3F 00 00 00 00 00 00 00
    00167790 E8 01 00 00 00 00 00 00  02 00 00 00 00 00 00 00
    001677A0 E8 01 00 00 00 00 00 00  14 00 00 00 00 00 00 00
    001677B0 88 78 16 00 00 00 00 00  38 F2 04 00 00 00 00 00
    001677C0 E8 78 16 00 00 00 00 00  00 00 00 00 00 00 00 00
    001677D0 00 00 00 00 00 00 00 00  48 79 16 00 00 00 00 00
    001677E0 E8 96 01 00 00 00 00 00  98 76 16 00 00 00 00 00
    001677F0 00 00 00 00 00 00 00 00  01 00 00 E5 FF 7F 00 00
    00167800 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
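A rough way to hunt for such leftovers is to scan the image for 8-byte words that fall into a typical user-space address range. The sketch below does this on excerpts of the two dumps above. The function name `pointer_like_words` and the range 0x00007F0000000000 to 0x00007FFFFFFFFFFF are assumptions chosen for illustration, not something taken from the ooRexx sources, and a hit is only a candidate, not proof of a pointer.

```python
def pointer_like_words(dump, lo=0x00007F0000000000, hi=0x00007FFFFFFFFFFF):
    """Yield (address, value) for each aligned 8-byte word within [lo, hi].

    `dump` is hexdump text whose first column is the line's base address.
    Words are read as little-endian 64-bit values.
    """
    for line in dump.strip().splitlines():
        tokens = line.split()
        base = int(tokens[0], 16)
        data = bytes.fromhex("".join(tokens[1:]))
        for off in range(0, len(data) - 7, 8):
            value = int.from_bytes(data[off:off + 8], "little")
            if lo <= value <= hi:
                yield base + off, value

first_dump = """
001C0F00 04 00 00 00 00 00 00 80  C0 F2 5D 13 F8 7F 00 00
001C0F10 18 54 1A 00 00 00 00 00  00 00 00 00 00 00 00 00
"""
second_dump = """
001677E0 E8 96 01 00 00 00 00 00  98 76 16 00 00 00 00 00
001677F0 00 00 00 00 00 00 00 00  01 00 00 E5 FF 7F 00 00
"""

for dump in (first_dump, second_dump):
    for address, value in pointer_like_words(dump):
        print(f"{address:08X}: {value:#018x}")
```

On these excerpts the scan flags exactly the two addresses mentioned, 001C0F08 and 001677F8.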
    
     
  • Rick McGuire

    Rick McGuire - 2021-02-10

    Additional fixes committed [r12157].

    The object at 001C0F08 was a LibraryPackage object, not an array. I also spotted and fixed a small potential exposure with arrays. I'm not seeing any differences between builds with VBDiff. But I didn't see any yesterday either after the other fixes, so somehow the LibraryPackage one slipped through.

     

    Related

    Commit: [r12157]


    Last edit: Erich 2021-02-21
  • Erich

    Erich - 2021-02-11

    The object at 001C0F08 was a LibraryPackage object, not an array

    Ah, I didn't notice there's a normalization for internal class type numbers.

    So the object starting at 00167770 really is a ControlledDoInstruction (not a Set instance). The last 5 bytes in the line listed above, 001677F0 00 00 00 00 00 00 00 00 01 00 00 E5 FF 7F 00 00, are changing, I assume, because they are uninitialized padding bytes after the last three bytes of the ControlledLoop class:
    uint8_t expressions[3]; // controlled loop expression order

    In a DEBUG build these bytes show up as CC CC CC CC CC which probably is an MSVC debug feature.

    There may be more objects with uninitialized padding where DEBUG builds show runs of CC bytes. I'm seeing, e.g., 16 CC bytes in type 66 RexxCode objects.
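To illustrate where such padding comes from, here is a small sketch using Python's ctypes to model a C-style layout. The structure `ControlledLoopSketch` and its three pointer fields are hypothetical; only the `expressions[3]` member is taken from the line quoted above, and the real ControlledLoop class may be laid out differently.

```python
import ctypes

class ControlledLoopSketch(ctypes.Structure):
    # Hypothetical layout: three pointer-sized fields followed by the
    # uint8_t expressions[3] array. Not the real ooRexx class definition.
    _fields_ = [
        ("initial", ctypes.c_void_p),
        ("to", ctypes.c_void_p),
        ("by", ctypes.c_void_p),
        ("expressions", ctypes.c_uint8 * 3),
    ]

def padding_offsets(struct_type):
    """Return the byte offsets inside struct_type not covered by any field."""
    covered = set()
    for name, ctype in struct_type._fields_:
        start = getattr(struct_type, name).offset
        covered.update(range(start, start + ctypes.sizeof(ctype)))
    return [i for i in range(ctypes.sizeof(struct_type)) if i not in covered]

holes = padding_offsets(ControlledLoopSketch)
print("sizeof:", ctypes.sizeof(ControlledLoopSketch))
print("padding bytes at offsets:", holes)
```

With pointer-sized alignment, the three-byte array leaves pointer size minus three bytes of tail padding, so the count differs between 32-bit and 64-bit builds.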

     
    • Rick McGuire

      Rick McGuire - 2021-02-11

      A little debugging tip: if you expand a variable in the Windows debugger (for example, the variable copyObject in the saveimage loop), the first field displayed is that of the class that owns the virtual object pointer of the object. That way you don't have to figure out the object type from the type numbers.

      OK, I've been debugging this from the debug build, which explains why I haven't been seeing any differences here. It's strange that these should be anything other than zero, since the memory manager zeros the memory out when an object is allocated. Not sure what would be setting this. This could be a bit tricky to eliminate, since we need to handle both 32-bit and 64-bit versions.

       
      • Rick McGuire

        Rick McGuire - 2021-02-11

        OK, I'm pretty sure I know the source of the garbage/padding bytes. When the object is constructed, the controlLoop field is initialized using the ControlledLoop instance passed from the parser. Since that object is originally allocated on the stack, the entire object is just copied into the field, including whatever padding came from the stack. Obviously, in debug mode, the compiler is initializing everything to CC, while for non-debug, it is picking up whatever garbage happens to be there. I think this can be fixed by overriding the assignment operator and doing a field-by-field copy.
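The difference between the two copy strategies can be sketched with ctypes. Everything here is a model under stated assumptions: `LoopSketch` is a hypothetical stand-in for ControlledLoop, the CC fill imitates the debug-build stack pattern, and `memmove` stands in for a compiler-generated whole-object copy (which may or may not copy padding in practice).

```python
import ctypes

class LoopSketch(ctypes.Structure):
    # Hypothetical stand-in: one pointer-sized field plus expressions[3].
    _fields_ = [
        ("control", ctypes.c_void_p),
        ("expressions", ctypes.c_uint8 * 3),
    ]

SIZE = ctypes.sizeof(LoopSketch)

def make_stack_like_source():
    """Build a source object on top of CC-filled memory, then set its fields."""
    raw = bytearray(b"\xCC" * SIZE)
    src = LoopSketch.from_buffer(raw)
    src.control = 0
    for i, value in enumerate((1, 0, 0)):
        src.expressions[i] = value
    return src

def copy_whole(src):
    """Copy every byte of the object, padding included."""
    dst = LoopSketch()  # ctypes zero-initializes new instances
    ctypes.memmove(ctypes.byref(dst), ctypes.byref(src), SIZE)
    return dst

def copy_fields(src):
    """Copy only the declared fields; padding in dst stays zero."""
    dst = LoopSketch()
    dst.control = src.control
    for i in range(3):
        dst.expressions[i] = src.expressions[i]
    return dst

src = make_stack_like_source()
print("whole copy :", bytes(copy_whole(src)).hex(" "))
print("field copy :", bytes(copy_fields(src)).hex(" "))
```

In this model the whole-object copy carries the CC bytes into the destination's tail padding, while the field-by-field copy leaves the zeroed padding untouched and so produces the same bytes on every run.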

         
  • Rick McGuire

    Rick McGuire - 2021-02-11

    Fix for controlled loop padding issues committed in [r12158]

     

    Related

    Commit: [r12158]


    Last edit: Erich 2021-02-21
  • Erich

    Erich - 2021-02-21

    With change [r12155] .nil hashCode now returns DEADBEEF
    I'm not sure if this is an actual issue though

    say .nil~hashCode~reverse~c2x  -- 00000000DEADBEEF
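As a side note on why the example needs `reverse`: if the hash is an 8-byte string holding the value in little-endian byte order (an assumption that fits the output above on a 64-bit little-endian build, not something confirmed here), the bytes have to be reversed before hex conversion to read as a number. A minimal sketch of that arithmetic:

```python
import struct

# Assumed representation: the value 0xDEADBEEF stored as a
# little-endian 64-bit word, i.e. least significant byte first.
hash_bytes = struct.pack("<Q", 0xDEADBEEF)

as_stored = hash_bytes.hex().upper()          # byte order as stored
reversed_hex = hash_bytes[::-1].hex().upper() # mirrors reverse~c2x

print(as_stored)
print(reversed_hex)
```

The second line printed is 00000000DEADBEEF, matching the comment in the Rexx example.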
    
     

    Related

    Commit: [r12155]

    • Rick McGuire

      Rick McGuire - 2021-02-21

      On Sun, Feb 21, 2021 at 11:32 AM Erich erich_st@users.sourceforge.net
      wrote:

      With change [r12155] .nil hashCode now returns DEADBEEF
      I'm not sure if this is an actual issue though

      It is just an arbitrary value picked to remain constant. It also has some
      old history around it. Not an issue.

      Rick



      [bugs:#1712] rexx.img varies from ASLR

      Status: open
      Group: None
      Created: Tue Jun 30, 2020 07:01 PM UTC by Bernhard M. Wiedemann
      Last Updated: Thu Feb 11, 2021 08:27 PM UTC
      Owner: nobody

       

      Related

      Bugs: #1712
      Commit: [r12155]

      • Rony G. Flatscher

        What is the "old history around it" in this case?

        Found "deadbeef" mentioned in:
        https://en.wikipedia.org/wiki/Magic_number_(programming)#Magic_debug_values and
        https://en.wikipedia.org/wiki/Hexspeak

         
        • Rick McGuire

          Rick McGuire - 2021-02-23

          The VM operating system used this value as an eye-catcher in memory. The
          value really stood out in dumps. Even in ooRexx, when garbage collection
          takes place, the dead objects are given a DEAD eyecatcher value.

          Rick

          On Tue, Feb 23, 2021 at 7:59 AM Rony G. Flatscher orexx@users.sourceforge.net wrote:


           

          Related

          Bugs: #1712

  • Erich

    Erich - 2021-03-02
    • status: open --> pending
    • Group: None --> 5.0.0
     
  • Rony G. Flatscher

    • Status: pending --> closed
     
