Also called by fmt_newheadword(), update_alphabet()
has no apparent purpose but triggers an assertion if it
receives non-UTF-8 8-bit data. dictfmt continues to
function if update_alphabet() is removed entirely.
Does this function do anything useful?
There is a related restriction in write_hw_to_index():
    if (tolower_alnumspace (word, new_word,
                            allchars_mode, utf8_mode)){
        fprintf (stderr, "'%s' is not a UTF-8 string", word);
However, as update_alphabet can only process UTF-8
text, this condition will never be true.
--Leah <qleah@earthlink.net>
Logged In: YES
user_id=587312
To build 8-bit or UTF-8 dictionaries that are compatible
with the dictd server, you must specify the --locale option.
For the Latin-1 charset you may, for example, run dictfmt
like this:
    dictfmt --locale de_DE.ISO-8859-1
assuming that the locale de_DE.ISO-8859-1 is installed on
your system.
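If you are not sure whether a locale is installed, one way to
check from C is to ask setlocale() for it. This is only a
minimal sketch: locale_is_installed() is a hypothetical helper,
not part of dictfmt, and it assumes a C library whose
setlocale() returns NULL for unknown locale names (some
libraries accept almost any name).

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of dictfmt.
   Returns 1 if the C library accepts `name` as an LC_CTYPE locale,
   0 otherwise. The previous LC_CTYPE locale is restored before
   returning. */
static int locale_is_installed(const char *name)
{
    char saved[256];
    const char *cur = setlocale(LC_CTYPE, NULL);   /* query only */
    int ok;

    if (cur == NULL || strlen(cur) >= sizeof saved)
        return 0;
    strcpy(saved, cur);            /* setlocale may reuse its buffer */

    ok = setlocale(LC_CTYPE, name) != NULL;
    setlocale(LC_CTYPE, saved);    /* put the old locale back */
    return ok;
}
```

Calling locale_is_installed("de_DE.ISO-8859-1") before running
dictfmt would then tell you whether that locale name is usable
on the machine.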
The update_alphabet() function, in turn, is needed to build
the 00-database-alphabet headword; its definition contains a
list of the characters present in the real headwords. This
information is necessary for UTF-8 dictionaries to work
correctly with the LEV search strategy.
It also speeds up the LEV strategy for UTF-8 and ASCII
dictionaries.
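To illustrate the idea of collecting an alphabet from the
headwords, here is a minimal sketch. It is not the dictfmt
code: alphabet_t and the alphabet_* functions are hypothetical
names, and it works on single bytes only, so it covers ASCII
and 8-bit data but not multi-byte UTF-8 characters. A
Levenshtein-style search that tries inserting or substituting
characters only needs to try those listed in the alphabet,
which is presumably where the speed-up comes from.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type, not taken from dictfmt: records which byte
   values have occurred in the headwords seen so far. */
typedef struct {
    unsigned char seen[256];   /* seen[c] != 0 if byte c occurred */
} alphabet_t;

static void alphabet_init(alphabet_t *a)
{
    memset(a->seen, 0, sizeof a->seen);
}

/* Mark every byte of one headword as present in the alphabet. */
static void alphabet_add_headword(alphabet_t *a, const char *headword)
{
    const unsigned char *p;

    for (p = (const unsigned char *)headword; *p != '\0'; ++p)
        a->seen[*p] = 1;
}

/* Write the distinct bytes, in ascending order, into `out` as a
   NUL-terminated string and return how many there are.
   `out` must have room for at least 256 bytes. */
static size_t alphabet_to_string(const alphabet_t *a, char *out)
{
    size_t n = 0;
    int c;

    for (c = 1; c < 256; ++c)
        if (a->seen[c])
            out[n++] = (char)c;
    out[n] = '\0';
    return n;
}
```

Feeding every headword through alphabet_add_headword() and then
calling alphabet_to_string() yields the kind of character list
that the definition of 00-database-alphabet is described as
holding.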