#208 segfault when using add() and non-utf-8 encoding

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2011-10-25
Created: 2011-10-25
Creator: geboid
Private: No

Hunspell::add(const char * word) can modify the word passed to it, even though the parameter is declared const.
To reproduce:
Hunspell * hunspell = new Hunspell("grammarFile-non-utf8", "emptyDictionary");
std::string test = "MamA";
g_debug("before: %s", test.c_str());
hunspell->add(test.c_str());
g_debug("after: %s", test.c_str());

Result:
before: MamA
after: Mama

If you pass a string literal instead (hunspell->add("MamA");), you might get a segmentation fault, because string literals may be stored in read-only memory.

That's because:
int HashMgr::add(const char * word) is called, which in turn calls:
add_hidden_capitalized_word((char *) word, wbl, wcl, flags, al, NULL, captype); // casts away the constness of word
In HashMgr::add_hidden_capitalized_word(), if utf8 is false, the following runs:
mkallsmall(word, csconv);
mkinitcap(word, csconv);
Both calls modify word in place, so the caller's buffer is overwritten.
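
One possible direction for a fix, sketched under assumptions rather than taken from the Hunspell sources: do the case conversion on a private copy instead of on the caller's buffer. The helper names below (make_all_small, make_init_cap, hidden_capitalized_form) are hypothetical, and std::tolower/std::toupper stand in for the csconv table lookup that the real mkallsmall() and mkinitcap() use.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for mkallsmall(): lowercases a string the function
// owns. std::tolower replaces the csconv lookup (an assumption).
static void make_all_small(std::string& s) {
    for (char& c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Hypothetical stand-in for mkinitcap(): uppercases the first byte only.
static void make_init_cap(std::string& s) {
    if (!s.empty())
        s[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[0])));
}

// Builds the "hidden capitalized" form without writing through the
// const pointer: all modifications happen on a local copy.
std::string hidden_capitalized_form(const char* word) {
    std::string copy(word);  // private, writable copy of the input
    make_all_small(copy);
    make_init_cap(copy);
    return copy;
}
```

With this shape, hidden_capitalized_form("MamA") is safe to call on a string literal, and a std::string passed via c_str() keeps its original contents.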

Discussion