From: Aleksey C. <vl...@gm...> - 2004-10-21 20:08:04
Diego Nehab <die...@ya...> writes:

> Hi,
>
> While creating a dictionary for the Portuguese language, I noticed that the
> index sorting performed by dictfmt.c invokes the "sort" utility, while the
> binary search performed by index.c uses its own comparison function to order
> two words.
>
> It turns out they don't agree (under non-C locales), which means the binary
> search fails for words that are indeed in the index.
>
> I guess the cleanest way to solve this is to make sure both modules use the
> same comparison function, and since index.c has its reasons to use its own,
> the easiest would be to avoid the use of the "sort" utility and perform index
> sorting by hand in dictfmt.c.
>
> I wrote a tiny stand-alone program that does the sorting, if you
> guys want it. I understand that creating a new dictionary is not
> that common, but I wonder if anyone has ever reported the same
> problem.

If headwords in your dictionary contain non-ASCII symbols, both the
`dictfmt' and `dictd' utilities must be run with a non-C locale, which is
set by the --locale option. In this case, the sorting order in /bin/sort and
index.c should be the same. If it is not, show a part of your dictionary
that is enough to reproduce the described effect, and tell us what system
you are using.

-- 
Best regards, Aleksey Cheusov.
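For anyone following the thread, the failure mode under discussion can be reproduced in isolation. The following is only a minimal Python sketch, not the code in dictfmt.c or index.c: `locale_like_key` is a hypothetical stand-in for a locale's collation (lowercase, accents ignored), and all function names are made up for illustration. It shows that a binary search using plain code-point comparison can miss a word in an index that was sorted under a different ordering, while a search that uses the same ordering as the sort finds it.

```python
import unicodedata
from bisect import bisect_left


def locale_like_key(word):
    """Hypothetical stand-in for locale collation: lowercase, accents stripped."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def build_index(words):
    """Sort headwords the way a locale-aware external sort might (assumption)."""
    return sorted(words, key=locale_like_key)


def lookup_codepoint(index, word):
    """Binary search using plain code-point comparison (mismatched ordering)."""
    i = bisect_left(index, word)
    return i < len(index) and index[i] == word


def lookup_consistent(index, word):
    """Binary search using the same key that was used to sort the index."""
    keys = [locale_like_key(w) for w in index]
    i = bisect_left(keys, locale_like_key(word))
    return i < len(index) and index[i] == word


# "\u00e1gua" is "agua" with an acute accent on the first letter;
# "caf\u00e9" is "cafe" with an acute accent on the last letter.
AGUA = "\u00e1gua"
INDEX = build_index(["zebra", AGUA, "casa", "abacaxi", "caf\u00e9"])

if __name__ == "__main__":
    print("index order:", INDEX)
    print("code-point search finds it:", lookup_codepoint(INDEX, AGUA))
    print("consistent search finds it:", lookup_consistent(INDEX, AGUA))
```

The code-point search goes wrong because the accented first letter compares greater than every ASCII letter, so the search walks off the end of the index even though the word is present. Either remedy discussed in this thread (sorting with the searcher's own comparison, or running both tools under the same locale) amounts to making the two orderings agree.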