#147 Unmunch with flag num not working

Status: open

Owner: nobody

Labels: None

Priority: 5

Updated: 2014-11-15

Created: 2010-05-27

Creator: Laknath

Private: No

Unmunch does not work with the num flag type (FLAG num) when the flags are multi-digit numbers separated by commas.

Example: 1242,1231,4232,4343

Running unmunch on such a dictionary outputs almost every affix combination for a given dictionary word.

The problem seems to be in expand_rootword(): the following code compares only a single character. As a result, the program reports a match between the character 4 and the flag 4232, simply because the flag string contains a 4.

if (strchr(ap, (stable[i].aep)->achar)) {
    suf_add(ts, wl, stable[i].aep, stable[i].num);
}
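
For comparison, here is a minimal sketch of a whole-number match, assuming the flag field is a comma-separated list of decimal numbers as in the example above. The helper name has_num_flag is hypothetical. It is not part of unmunch and is not the patch attached to this ticket. It only illustrates comparing complete flag values instead of single characters.

```c
#include <stdlib.h>

/* Hypothetical helper, not part of unmunch.c.
 * Returns 1 if the comma-separated numeric flag list `flags`
 * (e.g. "1242,1231,4232,4343") contains the complete flag `wanted`,
 * and 0 otherwise. */
static int has_num_flag(const char *flags, unsigned long wanted)
{
    const char *p = flags;

    while (*p != '\0') {
        char *end;
        unsigned long value = strtoul(p, &end, 10);

        if (end == p) {
            /* No number starts here (e.g. a comma): skip one character. */
            p++;
            continue;
        }
        if (value == wanted)
            return 1;   /* the whole number matches, not just one digit */
        p = end;        /* continue after the number just parsed */
    }
    return 0;
}
```

With the flag list 1242,1231,4232,4343, this sketch reports a match for 4232 but not for 4, whereas the strchr() test above would accept both.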

I'm attaching the dic and aff pair I tried.

Discussion

Laknath - 2010-05-27

Sinhala language dic and aff files

si_LK.tar.gz

Adrián Chaves Fernández - 2014-10-09

Patch proposed by the reporter: http://sourceforge.net/p/hunspell/patches/37/

Adrián Chaves Fernández - 2014-11-01

I’ve had to use the 2 KiB Bash script at https://github.com/kscanne/hunspell-gd/blob/master/unmunch.sh, which seems to somehow fix unmunch with awk magic. Could you please merge this fix upstream?

Adrián Chaves Fernández - 2014-11-14

OK, the script does not fix unmunch; it is a simple reimplementation of it. I also noticed that when there are two ‘.dic’ entries with the same lemma but different flag combinations, only the last combination is used.

For example, given a ‘.dic’ with:

word/10,15
word/10

The result of the script is the same as it would be with a ‘.dic’ with:

word/10

Removing the ‘word/10’ line results in the desired output.
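
One possible way to avoid the last-entry-wins behaviour is to take the union of the flags from every entry that shares a lemma. The sketch below assumes comma-separated numeric flags. The struct entry type, MAX_FLAGS and merge_flags are hypothetical names invented for this illustration, and none of it comes from the script or from unmunch itself.

```c
#include <stdlib.h>

#define MAX_FLAGS 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical record holding the merged flags of one lemma. */
struct entry {
    char lemma[64];
    unsigned long flags[MAX_FLAGS];
    size_t nflags;
};

/* Add every number in the comma-separated list `list` to e->flags,
 * skipping values that are already present. */
static void merge_flags(struct entry *e, const char *list)
{
    const char *p = list;

    while (*p != '\0') {
        char *end;
        size_t i;
        unsigned long value = strtoul(p, &end, 10);

        if (end == p) {
            /* No number starts here (e.g. a comma): skip one character. */
            p++;
            continue;
        }
        p = end;

        /* Look for the value among the flags collected so far. */
        for (i = 0; i < e->nflags; i++) {
            if (e->flags[i] == value)
                break;
        }
        /* Append only if it is new and there is room left. */
        if (i == e->nflags && e->nflags < MAX_FLAGS)
            e->flags[e->nflags++] = value;
    }
}
```

Calling merge_flags() once with 10,15 and once with 10 for the same entry leaves the flags 10 and 15, so the second ‘.dic’ line no longer discards flag 15.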

Adrián Chaves Fernández - 2014-11-15

I ended up implementing my own unmunch in Python. It is considerably slower, but fast enough for me, and it supports what I need (and so far nothing else).

The script is available in the https://github.com/eitsl/hunspell repository:
https://github.com/eitsl/hunspell/blob/master/utils/unmunch.py (command-line interface)
https://github.com/eitsl/hunspell/blob/master/hunspell.py (implementation, _Unmuncher class)
