#115 Excessive stack use slows MSVC builds

open
nobody
None
5
2009-07-12
2009-07-12
No

In multiple locations in Hunspell v1.2.8, 64KB buffers (see the BUFSIZE definition) are created as local variables. Because each of these stack frames is larger than one memory page, MSVC (tested with VS2005/SP1) inserts a call to _chkstk() into the function prologue to make sure every stack page the frame needs is committed before it is used. As a result, over 155,000 calls to _chkstk() are made just to open the 596KB "en-US.dic" dictionary in Mozilla products.

For example, we get this code:

int HashMgr::decode_flags(unsigned short ** result, char * flags, FileMgr * af) {
00A89BB0 B8 00 00 01 00 mov eax,10000h
00A89BB5 E8 D6 D2 0F 00 call _chkstk (0B86E90h)
int len;
switch (flag_mode) {
00A89BBA 8B 84 24 04 00 01 00 mov eax,dword ptr [esp+10004h]
00A89BC1 8B 40 0C mov eax,dword ptr [eax+0Ch]
00A89BC4 83 E8 01 sub eax,1
00A89BC7 53 push ebx

because of the buffer allocated here:

case FLAG_UNI: { // UTF-8 characters
w_char w[BUFSIZE/2];
len = u8_u16(w, BUFSIZE/2, flags);
*result = (unsigned short *) malloc(len * sizeof(short));
if (!*result) return -1;
memcpy(*result, w, len * sizeof(short));
break;

Once in _chkstk(), the code loops once for each 4KB page needed to satisfy the local stack use (16 pages for the 64KB buffer above), which slows Hunspell down.

Here's another example, this time from HashMgr::add_hidden_capitalized_word():

if (utf8) {
char st[BUFSIZE];
w_char w[BUFSIZE];
int wlen = u8_u16(w, BUFSIZE, word);

That's 192KB of stack use (64KB for st plus 128KB for w, since w_char is two bytes), so a single invocation of _chkstk() has to touch 48 pages of memory.
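To make the page arithmetic for both examples explicit, here is a small sketch. The constant names and pages_probed() are made up for illustration. It assumes 4KB pages, BUFSIZE = 65536 and a 2-byte w_char, and it ignores the few other locals in each frame, so the counts are approximate.

```cpp
#include <cstddef>

// Assumptions for this sketch: 4KB pages (x86 Windows), BUFSIZE = 65536 as in
// Hunspell 1.2.8, and a 2-byte w_char. All names here are illustrative.
constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kBufSize = 65536;
constexpr std::size_t kWCharSize = 2;

// Number of 4KB pages spanned by a stack frame of `bytes` bytes (rounded up).
constexpr std::size_t pages_probed(std::size_t bytes) {
    return (bytes + kPageSize - 1) / kPageSize;
}

// decode_flags(): w_char w[BUFSIZE/2]
constexpr std::size_t kDecodeFlagsFrame = (kBufSize / 2) * kWCharSize;

// add_hidden_capitalized_word(): char st[BUFSIZE] plus w_char w[BUFSIZE]
constexpr std::size_t kAddHiddenFrame = kBufSize + kBufSize * kWCharSize;

static_assert(kDecodeFlagsFrame == 0x10000, "matches mov eax,10000h");
static_assert(pages_probed(kDecodeFlagsFrame) == 16, "64KB frame spans 16 pages");
static_assert(pages_probed(kAddHiddenFrame) == 48, "192KB frame spans 48 pages");
```

The first static_assert ties the computed frame size back to the 10000h constant loaded into eax before the call to _chkstk in the disassembly above.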

Please, please, please improve the performance of Hunspell by avoiding most of these calls to _chkstk. Thank you.
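One possible way to avoid the probe, shown here only as a sketch and not as a tested patch against Hunspell, is to move the scratch buffer off the stack and size it from the input. The function decode_flags_uni() and the ASCII-only helper widen_ascii() are hypothetical stand-ins (the real code calls u8_u16()). The sketch also assumes that N bytes of UTF-8 never decode to more than N 16-bit units.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical stand-in for Hunspell's u8_u16(), ASCII input only: widens each
// byte to a 16-bit unit, writes at most `size` units, returns the count written.
static int widen_ascii(unsigned short* dest, int size, const char* src) {
    int n = 0;
    while (n < size && src[n] != '\0') {
        dest[n] = static_cast<unsigned char>(src[n]);
        ++n;
    }
    return n;
}

// Sketch of the FLAG_UNI branch with the scratch buffer moved to the heap.
// Returns the number of flags, or -1 if malloc fails.
int decode_flags_uni(unsigned short** result, const char* flags) {
    assert(result != NULL && flags != NULL);
    // Assumed bound: N bytes of UTF-8 decode to at most N 16-bit units.
    // The "+ 1" keeps the vector non-empty so &w[0] is always valid.
    std::vector<unsigned short> w(std::strlen(flags) + 1);
    int len = widen_ascii(&w[0], static_cast<int>(w.size()), flags);
    // Allocate at least one element so a NULL return always means failure.
    *result = static_cast<unsigned short*>(
        std::malloc((len > 0 ? len : 1) * sizeof(unsigned short)));
    if (!*result) return -1;
    std::memcpy(*result, &w[0], len * sizeof(unsigned short));
    return len;
}
```

With this shape the frame holds only the small vector object and a few scalars, well under one page, so the compiler should have no reason to emit the _chkstk call. The trade-off is one extra heap allocation per call (and std::vector may throw on failure). A buffer owned by HashMgr and reused across calls would be another option.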

Discussion