From: Wolfgang G. <wol...@ev...> - 2009-03-02 15:28:33
Kai Tietz wrote:
> 2009/3/2 Wolfgang Glas <wol...@ev...>:
>> Kai Tietz wrote:
>>> 2009/3/2 Wolfgang Glas <wol...@ev...>:
>>>> Kai Tietz wrote:
>>>>> 2009/2/27 Wolfgang Glas <wol...@ev...>:
>>>>>> Kai Tietz wrote:
>>>>>>> 2009/2/27 Wolfgang Glas <wol...@ev...>:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I have a program which occasionally makes use of alloca(), which is
>>>>>>>> dispatched to __builtin_alloca() by the mingw-w64 headers.
>>>>>>>>
>>>>>>>> The program makes heavy use of DLLs and cross-DLL exception handling, uses
>>>>>>>> WinMain instead of main, and is linked using '-subsystem windows'.
>>>>>>>>
>>>>>>>> My problem is that the program silently stops (no crash, just as if somebody
>>>>>>>> had called exit(0)...) when __builtin_alloca() is called.
>>>>>>>>
>>>>>>>> So far I have not been able to reproduce this behaviour in a simple testcase,
>>>>>>>> so maybe someone out there has experience with my problem, or somebody
>>>>>>>> knows under which circumstances alloca() silently stops the program?
>>>>>>>>
>>>>>>>> Any hint is appreciated since I have no idea what's going wrong...
>>>>>>>>
>>>>>>>> TIA,
>>>>>>>>
>>>>>>>> Wolfgang
>>>>>>> Hello Wolfgang,
>>>>>>>
>>>>>>> well, we had some issues with alloca. But those problems were solved
>>>>>>> about half a year ago. So I assume that you possibly use an older
>>>>>>> gcc version. The other possible cause of your problem is the limited stack
>>>>>>> size your app has. Maybe you have to use an ld option to increase it.
>>>>>> Hello Kai,
>>>>>>
>>>>>> I'm using the 20090224 snapshot, so this is not an old installation. I use
>>>>>> alloca for allocating arrays of 1000 bytes or less, so I hope that the stack size
>>>>>> should not be the limiting factor.
>>>>>>
>>>>>> I also tried to create a test program which imitates the call pattern of my
>>>>>> app, but inside the test everything works fine.
>>>>>> Can you give me a hint on the
>>>>>> kind of problems you had with alloca(), so I have a chance to trigger my problem
>>>>>> in a testcase?
>>>>> The issue was in gcc when calling the internal __alloca method. This caused
>>>>> a register clobber. In general you should use the variant in
>>>>> malloc.h (which aliases to __builtin_alloca).
>>>>>
>>>>> If you could finally get a test case, that would be great.
>>>> Hello Kai,
>>>>
>>>> Unfortunately I did not make any progress on putting together a testcase. I
>>>> only have segfaults in the real program, and my testcase, which obeys the same
>>>> calling pattern, seems to be all right.
>>>>
>>>> What seems astonishing to me is the order of the variables on the stack
>>>> (cf. the pointer addresses...):
>>>>
>>>> 1) Function argument.
>>>> 2) Pointer returned by alloca()
>>>> 3) Normal local variables.
>>>>
>>>> Is this the intentional behaviour?
>>>>
>>>> Wolfgang
>>>>
>>> Hello Wolfgang,
>>>
>>> yes, this is intended and normal for the x64 MS ABI. The reason is
>>> that every call needs at least 32 bytes of stack
>>> reserved (this area is used to store the register-based arguments
>>> passed for variadic). So a typical function layout looks something
>>> like this
>>>
>>> [optional additional stack for arguments >=5]
>>> [32-bytes reserved for register passed arguments (rcx,rdx,r8,r9)]
>>> [return address]
>>> [saved registers for function]
>>> [saved xmm registers]
>>> [local reserved stack for local variables]
>>> [32-bytes reserved area for sub calls]
>>> ....
>>> [remove reserved locals and 32-byte area]
>>> [restore xmm]
>>> [restore registers]
>>> ret
>>>
>>> Now, because the scheduler moves some of those areas around
>>> and optimized code uses no rbp-based function layout, the
>>> standard layout gets altered by gcc optimization.
>> OK, now I understand a little bit more. Your remark fits my observation that
>> my crashes might somehow be triggered by optimization.
>>
>> Are there any subtle details in the stack layout when calls are issued across
>> DLL boundaries?
>>
>> Should I try to alloca() inside and outside my DLL, or do you have a better
>> suggestion for stress testing the stack layout engine in gcc?
>>
>> Might calling alloca() inside a function that is inlined by optimization be a
>> problem?
>
> Well, DLL boundaries have no effect on this. There are two possible ways
> alloca gets invoked: first for large local variable blocks
> (bigger than one page IIRC), and second by user code. The first thing to know is that
> Windows targets need stack probing, otherwise a protection fault
> can possibly happen. Therefore gcc internally calls
> __allocate_stack, which in fact probes the stack to prevent this
> protection fault. The reason for this is the Windows stack paging
> algorithm.
> Secondly you should be aware that any stack allocated by alloca is
> allocated at the function entry. This is done (by an oddity of gcc)
> so that the allocated stack is usable even outside the current frame.
> I assume that the reason for this is possibly optimization.

What do you mean by "function entry"? I typically use alloca in situations where
I allocate memory dependent on an earlier computational result.
My classical example is reading a registry key:

*********************
static std::string read_reg_key(const char *key, const char *vkey)
{
  HKEY hkey;
  DWORD len;

  if (RegOpenKeyExA(HKEY_LOCAL_MACHINE, key, 0,
#ifdef WIN64
                    KEY_READ | KEY_WOW64_32KEY,
#else
                    KEY_READ,
#endif
                    &hkey) != ERROR_SUCCESS)
    return std::string();

  if (RegQueryValueExA(hkey,vkey,NULL,NULL,NULL,&len) != ERROR_SUCCESS)
    {
      RegCloseKey(hkey);
      return std::string();
    }

  char *value = (char*)alloca(len+2);

  fprintf(stderr,"key=%p.\n",key);
  fprintf(stderr,"vkey=%p.\n",vkey);
  fprintf(stderr,"&hkey=%p.\n",&hkey);
  fprintf(stderr,"&len=%p.\n",&len);
  fprintf(stderr,"value=%p.\n",value);

  if (RegQueryValueExA(hkey,vkey,NULL,NULL,(BYTE*)value,&len) != ERROR_SUCCESS)
    {
      RegCloseKey(hkey);
      return std::string();
    }

  RegCloseKey(hkey);

  std::string ret(value);
  ret += "\\etc";
  return ret;
}
*********************

Wolfgang
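
For anyone trying to reduce this to a standalone testcase, the same call pattern (an alloca() whose size depends on an earlier runtime result, followed by address printing and a copy into a std::string) can be sketched without the registry API. This is only a minimal sketch under assumptions: it needs GCC or Clang because it calls __builtin_alloca directly, the function name append_etc is hypothetical, and strlen stands in for the first RegQueryValueExA call. It is not known to reproduce the problem described above.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for read_reg_key(): the size passed to alloca is only
// known at run time, as in the registry example, but no Windows API is used.
static std::string append_etc(const char *src)
{
  assert(src != nullptr);

  // Runtime-dependent length, playing the role of the size query.
  std::size_t len = std::strlen(src);

  // Assumption: GCC or Clang, where __builtin_alloca is available.
  char *value = static_cast<char *>(__builtin_alloca(len + 2));

  // Print the addresses so the stack order can be compared between builds.
  std::fprintf(stderr, "src=%p.\n", static_cast<const void *>(src));
  std::fprintf(stderr, "&len=%p.\n", static_cast<void *>(&len));
  std::fprintf(stderr, "value=%p.\n", static_cast<void *>(value));

  // Copy and terminate explicitly; alloca does not zero the memory.
  std::memcpy(value, src, len);
  value[len] = '\0';

  // The string copies out of the alloca buffer before the frame is left.
  std::string ret(value);
  ret += "\\etc";
  return ret;
}
```

Calling append_etc("C:\\app") returns the string C:\app\etc. The printed addresses differ between compilers and optimization levels, so comparing the output of -O0 and -O2 builds is one way to look at how the layout changes.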