#94 Fourfold affixing


Will fourfold or at least threefold affixing be supported in Hunspell? If yes, what is the ETA? If not, what do you think is the complexity of the code required to make the change in Hunspell in number of person days?

I'm creating Hunspell aff & dic files for the Armenian language (currently, Western Armenian using traditional Armenian orthography). Armenian is an Indo-European language similar in affixing rules to Latin. In Armenian it is possible to have up to fourfold affixing. For example: "կարդացածներս" ("those which I have read") can be broken down into the following 5 parts: "կարդ-ացած-ներ-է-ս". The first part signifies the stem: "կարդալ" (to read).

Thank you in advance. Hunspell has been a great piece of software to use so far!


  • serouj

    Correction on the above example:

    "կարդացածներէս" ("among those which I have read") can be
    broken down into the following 5 parts: "կարդ-ացած-ներ-է-ս".
    The first part signifies the stem: "կարդալ" (to read).

  • serouj

    I didn't realize there's a "Feature Requests" category in the Tracker. This is probably a feature request rather than a bug.

  • serouj

    It turns out Armenian even has five-fold affixing. It is simply the negation of the previous example: "չկարդացածներէս" ("among those which I have not read") breaks down into the 6 parts: "չ-կարդ-ացած-ներ-է-ս" whose stem is "կարդալ" (to read).

    I hope we can have Hunspell support it, since it will lead to extremely rich morphological analysis, not to mention not having to hard-code the over 100 possible combinations!

  • In Hunspell, "affix" also covers affix combinations. For example, the Hungarian word "kutyátlaníthatatlanságaitokéiért" is generated from the stem "kutya" by the derivational affix combination -talaníthatatlanság (talan-ít-hat-atlan-ság) and the inflectional affix combination -aitokéiért (-a-i-tok-é-i-ért), so two "affixes" are enough for complex morphology, too:

    $ hunspell -d hu_HU -m
    kutyátlaníthatatlanságaitokéiért st:kutya po:noun ts:NOM ds:tAlAn_LESS_adj ds:Ít_TRANSITIVE_vrb ds:hAt_MODAL_vrb ds:tAlAn_LESS_adj ds:sÁg_ABSTRACT_noun ts:NOM is:PLUR is:POSS_PL_2 is:POSSESSEE is:PLUR is:CAUS/FIN

    An average Hunspell dictionary of an agglutinative language can contain 10-20 thousand affix rules. Unfortunately, there is no standard tool yet to generate these combinations from simple n-fold descriptions (like those in the dictionaries of the two-level rule compilers). The Hungarian affix combinations are generated by nested m4 macros.

    Thanks for your kind words! László
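
To illustrate the idea of generating combined affix rules from an n-fold description, here is a minimal Python sketch. It is not the m4 setup used for the Hungarian dictionary; the function names `combine_slots` and `sfx_rules`, the flag `A`, and the slot list are all hypothetical. The slots reuse the Armenian segments from the example above, and treating the later slots as optional is only an assumption for demonstration: the sketch does not claim that every generated form is grammatical. It emits plain `SFX` lines with no stripping (`0`) and no condition (`.`).

```python
from itertools import product


def combine_slots(slots):
    """Concatenate one alternative per slot, in slot order.

    An empty string in a slot means the slot may be skipped.
    """
    combos = []
    for choice in product(*slots):
        suffix = "".join(choice)
        # Skip the all-empty choice and avoid duplicate suffixes.
        if suffix and suffix not in combos:
            combos.append(suffix)
    return combos


def sfx_rules(flag, slots):
    """Build Hunspell SFX lines: a header, then one rule per combined suffix."""
    combos = combine_slots(slots)
    lines = [f"SFX {flag} Y {len(combos)}"]
    # 0 = strip nothing from the stem, . = no condition on the stem ending.
    lines += [f"SFX {flag} 0 {suffix} ." for suffix in combos]
    return lines


# Hypothetical slot description built from the segments quoted in the thread.
slots = [["ացած"], ["", "ներ"], ["", "է"], ["", "ս"]]
rules = sfx_rules("A", slots)
print("\n".join(rules))
```

With a description like this, each "fold" is one slot, and the combined suffixes end up in a single affix class, so the dictionary entry for the stem needs only one flag. A real generator would also have to handle stripping, conditions, and morphological fields per rule.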

    • status: open --> closed