#382 PCH: cc1plus.exe crash on Windows8.1

v1.0 (example)
closed
nobody
4
2015-07-10
2014-02-25
No

Can you help me with a PCH problem?
I have a project based on the original DC++ project...

cc1plus.exe crashes when trying to compile using Precompiled Headers

I tried to get support or more details from the DC++ team ( https://answers.launchpad.net/dcplusplus/+question/243282 ) <- see the compilation details there (I also tried all gcc releases: seh/dwarf/sjlj, arch: i686/x86-64)
PCH = 136 554 756 b and less: OK!
PCH = 137 526 404 b and over: crash
but it seems they don't know how to solve it

This problem occurs on Windows 8.1 x64, 16G RAM, Intel Core i7-3630QM

But! Later I checked on a different platform (Windows Vista x32, 4G RAM) with mingw-builds: i686-4.8.2-release-posix-dwarf-rt_v3-rev2

It works! What is happening? How can I solve this problem on Windows 8.1?

Thanks.

Discussion

  • Kai Tietz

    Kai Tietz - 2014-03-12

    This issue is caused by a hard limit on PCH size. It may be that mingw-builds applies an additional patch to increase the memory-region size, which helps here.
    Nevertheless, this is clearly an (already reported) gcc issue, so I am closing this bug here.
    As a side note: PCH support is pretty much discouraged, as it has many problems.

     
  • Kai Tietz

    Kai Tietz - 2014-03-12
    • status: open --> closed
     
  • Denis Kuzmin [GitHub/3F]

    "hard limit" bad news :)

    I get it
    Yes, some problems exist, but sometimes a rapid build is needed..

     
  • ollydbg

    ollydbg - 2015-05-30

    Hi, Kai, cygwin works fine with large pch.
    See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56926#c12
    Can you give some idea of how we can solve this issue? Where is the hard limit? Thanks.

     
  • Kai Tietz

    Kai Tietz - 2015-05-30

    Cygwin is no reference here. It emulates mmap functionality, so its PCH implementation works quite differently.

    For more details on this issue, you can search gcc's bugzilla database.

     
  • ollydbg

    ollydbg - 2015-05-31

    Hi, Kai, thanks for the info, I see some in the bugzilla, such as: 14940 – PCH largefile test fails on various platforms - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14940

     
  • ollydbg

    ollydbg - 2015-05-31

    BTW: I may have found where the hard limit value is set; it is 128M. See this post: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56926#c16

     
  • ollydbg

    ollydbg - 2015-06-02

    Good news: Guys, I can solve the crash issue by setting a larger value, see my Comment 17 in gcc bugzilla.

     
    • niXman

      niXman - 2015-06-02

      Great!

      I will include this fix in the next builds. At the moment, I do not know when I'll update the builds, but I'll try next week.

       
  • Denis Kuzmin [GitHub/3F]

    ahah :) I thought you didn't want any changes to this const value (for some reason), i.e. I remember looking at pch_VA_max_size in the sources after Kai Tietz's first reply to me,
    and I accepted this restriction "as is"

    However, a year later I see "...I will include in the next builds"
    amazing anyway :) especially Comment 17 (at that time, I also didn't want to build gcc on Windows) - sweet solution :)

    Well, how about a flexible setting for this? If it's a simple restriction on VirtualAlloc etc. :) why not, it means

     

    Last edit: Denis Kuzmin [GitHub/3F] 2015-06-02
  • ollydbg

    ollydbg - 2015-06-02

    Hi, reg, we surely need a flexible setting for this value. As I see it, to load the PCH file under the mingw target, it first allocates a piece of memory and then creates a memory map of that PCH file. Sometimes the PCH file is large... But currently I don't quite understand the source code and logic of host-mingw32.c, so we still need someone to work on a flexible solution. Until then, a hard-coded large value is OK for me.

     
  • omgwtfbbq

    omgwtfbbq - 2015-06-26

    Checking the size and assuring its granularity before VirtualAlloc is not necessary at all, the function can handle that itself. Simply removing all occurrences of pch_VA_max_size will make PCH work fine with large headers - tested on GCC 4.9.2.

    This, however, only hides another problem. The function mingw32_gt_pch_use_address can fail (return code -1 or 0) - this is what happens when mingw32_gt_pch_get_address returns NULL due to the unnecessary limit introduced by pch_VA_max_size. File mapping actually is created, but since the mapped address is not equal to the expected one (NULL), the function returns -1. File host-mingw32.c contains no reason for cc1plus to crash.

    I'd expect not to see anything that would check the return value of mingw32_gt_pch_use_address... wherever it is used. This bug is therefore not Win32-specific, it applies to the whole business of handling PCH files in cc1plus regardless of the host architecture. The only reason it manifested here is that completely unnecessary hard-coded limit, that is not present on other architectures.
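
The size check discussed in this post can be sketched in portable C. This is only an illustration modeled on the thread's description, not a copy of host-mingw32.c: the lowercase names are hypothetical, the 128 MB value comes from the thread, and the 64 KiB granularity is an assumption about the usual Windows allocation granularity.

```c
#include <stddef.h>

/* Assumed values: the thread cites a 128 MB limit; 64 KiB is an assumed
   allocation granularity. Both names are hypothetical stand-ins. */
static const size_t pch_va_max_size = (size_t)128 * 1024 * 1024;
static const size_t va_granularity  = 0x10000;

/* Round SIZE up to the next multiple of the granularity (a power of two). */
static size_t round_up_to_granularity(size_t size)
{
    return (size + va_granularity - 1) & ~(va_granularity - 1);
}

/* Returns 1 if a PCH of SIZE bytes passes the hard limit, 0 otherwise.
   Removing the limit, as proposed above, would remove this check. */
static int pch_size_within_limit(size_t size)
{
    return round_up_to_granularity(size) <= pch_va_max_size;
}
```

A request of exactly 128 MB passes this check, while a request one byte larger is rounded up and rejected, which is the point at which the get-address hook would hand back NULL.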

     
  • ollydbg

    ollydbg - 2015-06-27

    Hi, omgwtfbbq. I suggest you put your comment in GCC's bugzilla, see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14940 so that more people will notice your analysis. BTW, do you mean that setting a large value for pch_VA_max_size doesn't solve the crash issue? In my test, it did solve the crash issue. Sorry, I don't quite understand what you mean.

     
  • omgwtfbbq

    omgwtfbbq - 2015-06-27

    Will do when I make people here understand the issue. :-)

    If you set pch_VA_max_size to a large value, you only postpone the crash; it is not really solved. If you remove all uses of pch_VA_max_size, you get your flexible setting, which should work for any size: http://pastebin.com/68ELH2Jz.

    Except it won't work for any size. For example, suppose you have a 5GB PCH on a 32-bit architecture. You will see this exact crash again, simply because the OS doesn't have enough virtual memory. The ultimate reason behind this bug is somewhere in the code that handles PCH.

    EDIT: Done some more testing regarding the PCH crash.

    1) g++
    1.1) mingw32_gt_pch_use_address(NULL, 0, FD, 0); // init?
    1.2) addr = mingw32_gt_pch_get_address(PCH_SIZE, FD);

    2) g++
    2.1) mingw32_gt_pch_use_address(addr, PCH_SIZE, FD, OFFSET);
    2.2) crash

    The crash occurs in the 2nd and following invocations of the compiler, that is, when it is trying to read the PCH file. Also, mingw32_gt_pch_use_address must fail to map the file to the supplied address. If the file is successfully mapped but the function itself fails, compilation is terminated with "fatal error: had to relocate PCH".

    The code is quite messy, and I still can't really figure out what gets called where and how. I think the best solution for now would be just removing the hard-coded limit (as seen above), and letting someone more familiar with GCC internals fix the real bug behind it.

     

    Last edit: omgwtfbbq 2015-06-27
  • ollydbg

    ollydbg - 2015-06-27

    Good work!
    Though I still don't quite follow all of your ideas (sorry), I suggest you supply a simple patch file and submit it to that GCC Bugzilla.

     
  • Denis Kuzmin [GitHub/3F]

    First, you need to consider what the author of this function had in mind when implementing the host hook for mingw
    (i.e. what problems he was trying to solve at that time... because the 'how' we can already see in the result :)

    The author (it seems) is dannysmith (Danny Smith), so we can also try to ask him for an explanation or some details: see initial 62b4d90e930

    However! What I see:

    • The author uses pch_VA_max_size in order to use the address space in the upper range 0x70000000 - 0x78000000 (i.e. see MEM_TOP_DOWN).
      • That is why this value is equal to 128mb, i.e. 0x8000000 - this is the author's reason for that size

    NULL is used for lpAddress, so the system should determine where to allocate this region... or OK, we could also use GetSystemInfo...
    The problem may be with the MEM_TOP_DOWN flag, see for example here
    but again, that is not related (in general) to our problem...

    I also looked at the other host implementations: host-cygwin, host-linux, host-openbsd, host-hpux, host-solaris, and in general all the host hooks use gt_pch_get_address only to check the address space:

     addr = mmap((void*)TRY_EMPTY_VM_SPACE, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    
     if(addr == (void*)MAP_FAILED) {
       return NULL;
     }
     munmap(addr, size);
    

    i.e. all of the 'checked' space should be released with VirtualFree or munmap before returning!
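
The probe-and-release pattern described here can be sketched as a small POSIX C function. It is a simplified illustration, not the code of any GCC host hook: `probe_free_region` is a hypothetical name, and it uses an anonymous mapping with no address hint instead of mapping the PCH file descriptor at TRY_EMPTY_VM_SPACE.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: ask the OS where SIZE bytes would fit, then release
   the region at once so that only the address is kept. */
static void *probe_free_region(size_t size)
{
    void *addr = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return NULL;          /* no suitable address range was found */
    munmap(addr, size);       /* the probe must not keep the mapping */
    return addr;
}
```

Because the region is released before returning, nothing guarantees that the same address is still free when a later process tries to map the PCH data there, which is why the use-address hook has to be able to report failure.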

    Then preferred_base should be seen as the base address of the allocated region from all hooks:

    struct mmap_info
    {
      size_t offset;
      size_t size;
      void *preferred_base;
    };
    

    Today none of my available environments is configured for testing gcc on Windows, but in summary:
    the handling of address space for PCH in the mingw32 host hook needs to be reconsidered.

    Well, we need to contact the author for details about this hard limit... maybe there is some bug we can't see, or we simply fix it 'as is' :)

     

    Last edit: Denis Kuzmin [GitHub/3F] 2015-06-30
  • omgwtfbbq

    omgwtfbbq - 2015-07-03

    @reg: Could you test something then, doesn't matter what environment?

    I suspect that the actual reason for the crash is outside _gt_pch_get_address and _gt_pch_use_address functions. I think that the address returned by _gt_pch_get_address is being accessed regardless of the return value of _gt_pch_use_address. Therefore if we make the _gt_pch_use_address fail, crash/segfault should happen on any platform.

    Functions should look like this (change mingw32 part to your platform specific one):

    static void *
    mingw32_gt_pch_get_address (size_t size, int fd)
    {
      return NULL; /* return invalid pointer */
    }
    
    static int
    mingw32_gt_pch_use_address (void *addr, size_t size, int fd,
                    size_t offset)
    {
      /* Always fail.
    
         cc1plus should crash/segfault when reading PCH due
         to *_gt_pch_get_address returning an invalid pointer.
    
         This happens even though *_gt_pch_use_address returns -1,
         which means that the memory was not allocated and
         therefore should not be accessed.  */
      return -1;
    }
    

    Back to Win32/MinGW32; again my guesses, not necessarily true:

    The reason for using MEM_TOP_DOWN is that Danny wanted to avoid fragmentation of space for malloc [1], as he writes in his comments. Though I believe there might be another reason behind it (probably mentioned in the last line of his FIXME). You see, mingw32_gt_pch_get_address is called only once, in the process that created the PCH file. The returned address is used as the base in all following PCH reader processes. Since Danny assumes that malloc allocates memory in a bottom-up manner, a bottom-up base address could cause mingw32_gt_pch_use_address to fail, because the base address from mingw32_gt_pch_get_address (from the initial process) might have already been used by malloc.

    By using MEM_TOP_DOWN for the base address, that issue is seemingly fixed. However, another one arises. NT system DLLs, as Danny writes, are preferably placed in the top memory regions. What could happen here, in Danny's mind (Top Gear style), is basically the same as in the previous case. The initial writer process would reserve some top address space, and reader processes would be bound to reserve the same area. Except these reader processes might have done so a little later, meaning system DLLs have already been placed there, and so their mingw32_gt_pch_use_address fails. That's why the 128MB limit was introduced (0x7FFFFFFF - 128*1024*1024 = 0x77FFFFFF -> basically outside of the system DLL space 0x70000000 to 0x78000000).
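
The address arithmetic in the parenthesis can be checked with a few lines of C. The constant and function names are hypothetical; the values are the ones quoted in the post.

```c
#include <stdint.h>

/* Values quoted in the post: top of the 2 GB user address space and the
   128 MB PCH limit (0x08000000). */
static const uint32_t user_space_top = 0x7FFFFFFFu;
static const uint32_t pch_limit      = 128u * 1024u * 1024u;

/* Lowest base address a top-down reservation of PCH_LIMIT bytes would
   reach if it started right at the top of user space. */
static uint32_t lowest_top_down_base(void)
{
    return user_space_top - pch_limit;
}
```

The function returns 0x77FFFFFF, matching the figure in the post.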

    My view on what should be done:

    • Keep MEM_TOP_DOWN for now; there is IMHO a higher chance of colliding with the *malloc* address space than with system DLLs.
    • I would expect system DLLs to be loaded very early, meaning memory regions would not collide with the system DLL space anyway; they would be placed accordingly by the system, and at the same address. This makes the 128MB limit unnecessary -> remove it just as I proposed (my simple patch here).
    • Rewrite handling of PCH files, especially the memory mapping.
      • Don't use the same base address among several processes.
      • Don't require the whole PCH file to be mapped into memory. Implement some sort of paging.
      • Add failover read mechanics using standard file IO -> faster support of new platforms, compilation doesn't fail if there are issues with memory mapping.
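
The last bullet, a fallback to standard file IO when mapping fails, could look roughly like the POSIX C sketch below. All names (`struct blob`, `load_blob`, `free_blob`) are hypothetical, and the sketch deliberately ignores the harder part of the proposal: PCH data as discussed in this thread is expected at a fixed base address, which a plain read into a heap buffer does not provide.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for pread under strict -std modes */
#endif
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical result type: BASE points at SIZE bytes of file data;
   MAPPED records whether to release it with munmap() or free(). */
struct blob { void *base; size_t size; int mapped; };

/* Try to map SIZE bytes of FD; if that fails, fall back to plain reads.
   Returns 0 on success and -1 on failure. */
static int load_blob(int fd, size_t size, struct blob *out)
{
    void *p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p != MAP_FAILED) {
        out->base = p; out->size = size; out->mapped = 1;
        return 0;
    }
    char *buf = malloc(size);
    if (buf == NULL)
        return -1;
    size_t done = 0;
    while (done < size) {
        ssize_t n = pread(fd, buf + done, size - done, (off_t)done);
        if (n <= 0) { free(buf); return -1; }   /* error or short file */
        done += (size_t)n;
    }
    out->base = buf; out->size = size; out->mapped = 0;
    return 0;
}

/* Release a blob obtained from load_blob(). */
static void free_blob(struct blob *b)
{
    if (b->mapped) munmap(b->base, b->size);
    else free(b->base);
}
```

The caller never needs to know which path was taken, so a platform without a working mapping hook would still be able to load the data, only more slowly.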

    1 - not malloc itself, some other similar function is used.

     

    Last edit: omgwtfbbq 2015-07-03
  • Denis Kuzmin [GitHub/3F]

    @omgwtfbbq,

    I suspect that the actual reason for the crash is outside _gt_pch_get_address and _gt_pch_use_address

    It may be - that's what I meant above by "..may be we can't see some bug..", because every gt_pch_use_address is only one part of the host hooks (a NULL pointer & -1 can be normal return states, i.e. any handling of them should be at the top level)
    Have you seen the other hooks?

    Mainly, gt_pch_get_address is used to get the Base addr with our 'checking' (this region will be released before returning),
    and gt_pch_use_address should be used to allocate the same size at the same address for loading our pch data:

     result = host_hooks.gt_pch_use_address(mmi.preferred_base, mmi.size, fileno(f), mmi.offset);
     // -1 : we couldn't map the data at Base addr
    

    So! Our stub (in general, I see all of pch_VA_max_size only as a stub) in gt_pch_use_address de facto only repeats the logic from gt_pch_get_address: it makes a similar check for "same size at the same address"

    The reason for using MEM_TOP_DOWN is that Danny wanted to avoid fragmentation

    :) a reason not only for Danny
    and as I said above, 0x8000000 is "not related(in general)" for pch_VA_max_size... it really is a stub (it seems to be used only with mmi)

    If so.. omg, this "bug" has a long history :)
    However, if none of us is sure, well.. a safer variant for this case would be (at least) some flexible option for the end user; otherwise this hook needs fixes 'as is', as I said above

    Rewrite handling of PCH files, especially the memory mapping.

    I think the answer will be ~ "flag in your hand and go" :) Don't forget about the other hosts for a common handler; it is probably a big task. A quick solution is possible only with the gt_pch_use_address hook

     
  • omgwtfbbq

    omgwtfbbq - 2015-07-08

    I have finally managed to obtain stack trace of the crash. GDB on Windows wouldn't work properly, so I moved to Linux.

    What happens is that once _gt_pch_use_address fails with -1, fatal_error is called as expected (#7). This function prints the supplied message ("had to relocate PCH") along with some diagnostics. However, retrieving that information fails (#0) because set is an invalid pointer (in this case 0x10bb3c0). The pointer set is actually the global pointer line_table (defined in gcc/input.c). It seems that PCH is used before this pointer is initialized. So if fatal_error is called due to PCH, the invalid memory location pointed to by the uninitialized line_table is accessed, which results in a segfault instead of terminating the program with the error message from fatal_error.

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to process 24920]
    0x0915bc20 in linemap_location_from_macro_expansion_p (set=0x10bb3c0, location=7452) at ../../libcpp/line-map.c:948
    948       linemap_assert (location <= MAX_SOURCE_LOCATION
    (gdb) backtrace 
    #0  0x0915bc20 in linemap_location_from_macro_expansion_p (set=0x10bb3c0, location=7452) at ../../libcpp/line-map.c:948
    #1  0x0915b5cb in linemap_lookup (set=0x10bb3c0, line=7452) at ../../libcpp/line-map.c:642
    #2  0x0915c076 in linemap_macro_loc_to_def_point (set=0x10bb3c0, location=7452, original_map=0xbfffe72c) at ../../libcpp/line-map.c:1139
    #3  0x0915c1f5 in linemap_resolve_location (set=0x10bb3c0, loc=7452, lrk=LRK_MACRO_DEFINITION_LOCATION, map=0xbfffe72c) at ../../libcpp/line-map.c:1268
    #4  0x0913954a in diagnostic_report_current_module (context=0x9723240 <global_diagnostic_context>, where=7452) at ../../gcc/diagnostic.c:518
    #5  0x08305512 in cp_diagnostic_starter (context=0x9723240 <global_diagnostic_context>, diagnostic=0xbfffe7d8) at ../../gcc/cp/error.c:3035
    #6  0x0913a202 in diagnostic_report_diagnostic (context=0x9723240 <global_diagnostic_context>, diagnostic=0xbfffe7d8) at ../../gcc/diagnostic.c:798
    #7  0x0913a90c in fatal_error (gmsgid=0x921a2d6 "had to relocate PCH") at ../../gcc/diagnostic.c:1118
    #8  0x0871dc8a in gt_pch_restore (f=0x9785778) at ../../gcc/ggc-common.c:705
    #9  0x084c19d5 in c_common_read_pch (pfile=0x972f420, name=0x9731018 "header.hpp.gch", fd=4, orig_name=0x974ad60 "header.hpp") at ../../gcc/c-family/c-pch.c:372
    #10 0x09151ad8 in should_stack_file (pfile=0x972f420, file=0x97854d8, import=false) at ../../libcpp/files.c:787
    #11 0x09151d48 in _cpp_stack_file (pfile=0x972f420, file=0x97854d8, import=false) at ../../libcpp/files.c:872
    #12 0x09152181 in _cpp_stack_include (pfile=0x972f420, fname=0x974af40 "header.hpp", angle_brackets=0, type=IT_INCLUDE) at ../../libcpp/files.c:1009
    #13 0x09146e57 in do_include_common (pfile=0x972f420, type=IT_INCLUDE) at ../../libcpp/directives.c:798
    #14 0x09146e94 in do_include (pfile=0x972f420) at ../../libcpp/directives.c:809
    #15 0x09146602 in _cpp_handle_directive (pfile=0x972f420, indented=0) at ../../libcpp/directives.c:492
    #16 0x09158435 in _cpp_lex_token (pfile=0x972f420) at ../../libcpp/lex.c:2160
    #17 0x09160122 in cpp_get_token_1 (pfile=0x972f420, location=0xbfffebd8) at ../../libcpp/macro.c:2359
    #18 0x0916050e in cpp_get_token_with_location (pfile=0x972f420, loc=0xbfffebd8) at ../../libcpp/macro.c:2541
    #19 0x084b763f in c_lex_with_flags (value=0xbfffebdc, loc=0xbfffebd8, cpp_flags=0xbfffebd2 "r\tP\024\037\t\370\353\377\277D\rL\b\030", lex_flags=0) at ../../gcc/c-family/c-lex.c:302
    #20 0x0830d277 in cp_lexer_get_preprocessor_token (lexer=0x0, token=0xbfffebd0) at ../../gcc/cp/parser.c:761
    #21 0x0834653f in cp_parser_initial_pragma (first_token=0xbfffebd0) at ../../gcc/cp/parser.c:31427
    #22 0x0830cf93 in cp_lexer_new_main () at ../../gcc/cp/parser.c:631
    #23 0x0830fdd3 in cp_parser_new () at ../../gcc/cp/parser.c:3407
    #24 0x08346b35 in c_parse_file () at ../../gcc/cp/parser.c:31698
    #25 0x084c02dc in c_common_parse_file () at ../../gcc/c-family/c-opts.c:1067
    #26 0x0899e49a in compile_file () at ../../gcc/toplev.c:548
    #27 0x089a053d in do_compile () at ../../gcc/toplev.c:1926
    #28 0x089a06bb in toplev_main (argc=26, argv=0xbfffedb4) at ../../gcc/toplev.c:2002
    #29 0x09124968 in main (argc=26, argv=0xbfffedb4) at ../../gcc/main.c:36
    
     
  • Denis Kuzmin [GitHub/3F]

    omgwtfbbq,

    I lost your train of thought... :)

    I wrote above that the -1 value is 'standard behaviour' for all hooks:

    i.e. all the hooks follow the rule in the comment
    so, yes, we should get this error

    And don't forget that the pointer mmi.preferred_base (github.com) should contain the Base addr after the steps with gt_pch_use_address in the restoring function

    Or what do you mean?

     

    Last edit: Denis Kuzmin [GitHub/3F] 2015-07-08
  • omgwtfbbq

    omgwtfbbq - 2015-07-08

    That's the stack trace of the crash that happens when _gt_pch_use_address fails. Failure of that function is indicated by returning -1, as documented here:
    https://gcc.gnu.org/onlinedocs/gccint/Host-Common.html

    ...
    result = host_hooks.gt_pch_use_address (mmi.preferred_base, mmi.size,
                      fileno (f), mmi.offset);
    if (result < 0) // result == -1
      fatal_error (input_location, "had to relocate PCH"); // <- this gets executed
    ...
    

    fatal_error normally prints an error message with diagnostics and terminates the program. But due to the incorrectly initialized global pointer line_table (or the pointed-to struct itself), which is used to gather that diagnostic data, a crash occurs. That crash is the one we have been trying to fix. Now we know where it happens, and are again a bit closer to getting rid of it.
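
The failure mode described here, an error path that dereferences diagnostic state which is not yet valid, can be illustrated with a small C sketch. It is not GCC's code and not the patch posted to bugzilla: `line_table_stub`, `line_table_ptr` and `format_fatal` are hypothetical, and the sketch assumes the pointer is kept NULL until it is valid, whereas in the trace above it held a non-NULL garbage value that a NULL check alone would not catch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the table that diagnostics consult. */
struct line_table_stub { const char *current_file; };

/* Assumption: stays NULL until start-up code has initialised it. */
static struct line_table_stub *line_table_ptr = NULL;

/* Format a fatal message into BUF. Location info is added only when the
   table is known to be valid; otherwise the bare message is produced. */
static int format_fatal(char *buf, size_t len, const char *msg)
{
    if (line_table_ptr != NULL && line_table_ptr->current_file != NULL)
        return snprintf(buf, len, "%s: fatal error: %s",
                        line_table_ptr->current_file, msg);
    return snprintf(buf, len, "fatal error: %s", msg);
}
```

With the guard in place, reporting "had to relocate PCH" before the table is set up yields a plain message instead of touching invalid memory.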

     
  • Denis Kuzmin [GitHub/3F]

    Correct termination here is also not so important for our case, I think.
    That is a problem for the 'next levels', I mean for other problems later if the result != 1

    Most important is to find any other problems, if they exist, for 1 & sizes above the quota

     
  • omgwtfbbq

    omgwtfbbq - 2015-07-09

    I have just posted fix for the crash itself. If gt_pch_use_address returns -1, the program should terminate without crashing now.
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14940#c50

    @reg
    PCH handling in GCC is so badly designed (more like glued together) that a possible bug in some improbable extreme case on Windows doesn't matter. I have already provided a patch that removes the limit, enabling use of much bigger PCHs. Some of my posts above even contain the reasoning behind it. There will again be a problem once PCHs reach sizes in gigabytes (on 32-bit arch), but that can't be fixed without redesigning the whole PCH business from scratch, maybe including some more GCC internals.

     
  • ollydbg

    ollydbg - 2015-07-09

    Great job, omgwtfbbq. I would like to see those patches used in MinGW-w64 or MinGW-related toolchains before they land in GCC trunk or any branches.

     
  • Denis Kuzmin [GitHub/3F]

    I have just posted fix for the crash itself. If gt_pch_use_address returns -1, the program should terminate without crashing now.

    A Good addition! thanks

    PCH handling in GCC is so badly designed (more like glued together)

    yes, I wrote above

    but that can't be fixed without redesigning the whole PCH business

    and also about this... i.e. it can be fixed in general, but a redesign is the better choice

    So! be a hero to the end. You can :)

     
