
#64 Mac OS X Frameworks support

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2014-08-19
Created: 2010-03-12
Private: No

It would be nice if UPX could compress Mac OS X frameworks (*.framework).

Discussion

  • John Reiser

    John Reiser - 2010-03-18

    Please give a URL to some documentation (particularly the file format) and a small test case.

     
  • Steve Mokris

    Steve Mokris - 2014-03-31

    John Reiser: Here's the info you requested:

    A Mac OS X Framework is a folder whose name ends with ".framework" and contains a Mach-O dynamically linked shared library. For example, "Test.framework" is a folder, and "Test.framework/Test" is a dynamic library (note that there's no ".dylib" extension on Framework dylibs, unlike standalone Mac OS X dylibs).
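
    To make that layout concrete, here is a rough sketch of a shell helper that maps a framework folder to the dylib inside it. The name framework_dylib is made up for illustration, and it assumes the simple NAME.framework/NAME layout described above (on a real system that path may be a symlink into a Versions subfolder, which the -f test follows).

```shell
#!/bin/sh
# framework_dylib (hypothetical helper): print the path of the dylib
# inside a framework folder, assuming the layout NAME.framework/NAME.
framework_dylib() {
    fw=${1%/}                          # drop one trailing slash, if any
    name=$(basename "$fw" .framework)  # Test.framework -> Test
    if [ -f "$fw/$name" ]; then
        # -f follows symlinks, so a symlinked dylib is accepted too.
        printf '%s\n' "$fw/$name"
    else
        echo "no dylib found at $fw/$name" >&2
        return 1
    fi
}
```

    With that, something like upx "$(framework_dylib Test.framework)" would hand UPX the dylib directly instead of the folder.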

    These days, Mac OS X dylibs can be 32-bit-only, 64-bit-only, or "universal" (meaning a single file contains both 32-bit and 64-bit binaries). As discussed on http://sourceforge.net/p/upx/discussion/6806/thread/37f66dea/, Mac OS X provides command line utilities for combining multiple binaries into a single file, so technically UPX would only need to support handling 32-bit-only and 64-bit-only dylibs, and can ignore "universal" dylibs.
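
    To tell the three cases apart without any Apple tools, the first four bytes of the file are enough. This is only a sketch: macho_kind is a made-up name, the magic values are the ones from the Mach-O headers (0xfeedface, 0xfeedfacf, 0xcafebabe), and it does not check anything beyond the magic number (Java class files, for instance, also start with cafebabe).

```shell
#!/bin/sh
# macho_kind (hypothetical helper): classify a file by its first 4 bytes.
macho_kind() {
    # Read the first four bytes as lowercase hex, e.g. "cffaedfe".
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \t\n')
    case $magic in
        cafebabe)          echo universal ;;  # fat header, big-endian on disk
        cefaedfe|feedface) echo 32-bit ;;     # MH_MAGIC, either byte order
        cffaedfe|feedfacf) echo 64-bit ;;     # MH_MAGIC_64, either byte order
        *)                 echo unknown; return 1 ;;
    esac
}
```

    For a file reported as universal, lipo can split it first, e.g. lipo Test -thin x86_64 -output Test-64bit, and lipo -create Test-32bit Test-64bit -output Test puts the pieces back together afterwards.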

    Here's the documentation on the Mach-O file format: https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html

    If I attempt to run UPX on a 64-bit-only framework dylib, it throws UnknownExecutableFormatException. To reproduce the exception on Mac OS X, run these commands:

    echo 'int foo(void) { return 42; }' > Test.c
    clang -m32 -dynamiclib Test.c -o Test-32bit.dylib
    clang -m64 -dynamiclib Test.c -o Test-64bit.dylib
    upx Test-32bit.dylib
    upx Test-64bit.dylib
    

    In packmast.cpp, I found this, which suggests partial support for Mach-O dylibs is already implemented:

    // 2010-03-12  omit these because PackMachBase<T>::pack4dylib (p_mach.cpp)
    // does not understand what the Darwin (Apple Mac OS X) dynamic loader
    // assumes about .dylib file structure.
    

    If I uncomment those, and remove the "missing -init function" check in p_mach.cpp, UPX packs the framework's dylib. But if I attempt to link an app that uses the packed dylib, ld crashes. Likewise, if I link an app to the original dylib, then pack the dylib, the app crashes on launch.

     
  • John Reiser

    John Reiser - 2014-04-01

    Executive summary: Steve Mokris's info contains an updated pointer to documentation on Mach-O file format, but no new insights about how to compress a .dylib such that the dynamic linker will do the right thing with the compressed output. Thus, no progress.

    Details: Apparently a 'framework' is a directory which contains a dylib file. Therefore just specify the dylib file directly, and then UPX won't have to peer inside the directory, find the file(s), etc. [Can there be more than one? Must there be one?] UPX already processes 'fat' executables by compressing each member separately. (Try it with a 'fat' main program.)

    Compressing an executable main program is "easy" because the interface is narrow and well defined between the operating system launch mechanism and the address space layout that is expected just before execution of the first instruction of the new program. (Actually there is one trick: the dynamic linker (LC_LOAD_DYLINKER) expects an extra &mhdrp at the top of the stack. See the return from upx_main() in stub/src/amd64-darwin.macho-fold.S)

    Compressing a dylib (shared library) is "hard" because the dynamic linker does not document exactly what constitutes a dylib that the dylinker will interpret properly: which segments and sections must be present [and where], what the necessary relationships between them must be, etc. Earlier editions of UPX (such as before 2010-03-12) tried compressing just the LC_SEGMENT_64 containing .text, but the result often produced SIGSEGV from the dylinker with very little clue as to what was wrong or how to fix it. Even for ELF format on Linux, using ld-linux as the dynamic linker, compressing a shared library works only by way of a hack. See the comment in p_lx_elf.cpp:

    // If there is an existing DT_INIT, and if everything that the dynamic
    // linker ld-linux needs to perform relocations before calling DT_INIT
    // resides below the first SHT_EXECINSTR Section in one PT_LOAD, then
    // compress from the first executable Section to the end of that PT_LOAD.
    // We must not alter anything that ld-linux might touch before it calls
    // the DT_INIT function.

    The problem on Mac OS X is that I don't even know all the [logically] corresponding restrictions; and discovering them by trial-and-error is a very slow process when violations give only SIGSEGV with no other clues.
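
    For the ELF side, the first precondition in the p_lx_elf.cpp comment (an existing DT_INIT) can be checked from the shell before handing a library to UPX. A rough sketch, assuming binutils readelf is installed; has_dt_init is a made-up name, and it checks only that DT_INIT is present, not the layout conditions about relocations and PT_LOAD.

```shell
#!/bin/sh
# has_dt_init (hypothetical helper): succeed if the ELF file named by $1
# has a DT_INIT entry in its dynamic section.
has_dt_init() {
    # readelf -d prints one line per dynamic entry, with the tag name in
    # parentheses. The closing parenthesis in the pattern keeps
    # "(INIT_ARRAY)" and "(INIT_ARRAYSZ)" from matching.
    readelf -d "$1" 2>/dev/null | grep -q '(INIT)'
}
```

    Usage would be along the lines of: has_dt_init libfoo.so && upx libfoo.so (libfoo.so being a placeholder name).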

     
