While working on the Kichwa dictionary I have run into problems with words that carry more than two suffixes.
The grammar works like this:
kuyana (to love)
kuyani (I love)
kuyaRIni (I love myself)
Up to here we're good: with "twofold suffix stripping" the particles ri and ni can be analyzed.
However, in quite a few cases three suffixes are needed.
The suffixes known to me (ri, gri, ku, ra) can be combined, so a word can carry at least three suffixes (to this date I haven't seen any word with more than three):
kuyaRIgriNI (I'm going to love myself)
The bad news is that not all suffix combinations are allowed...
My idea was to list the good combinations:
and give them, and the corresponding words, flags. It just seems it would be a lot easier if threefold suffix stripping were possible...
SFX v: (verb conjugations)
SFX A: ri/BCDv
SFX B: ra/Dv
SFX C: ku/Bv
SFX D: gri/v
This way stem+B+D+v would parse correctly...
Is this possible in any way?
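To make the intended chaining concrete, here is a minimal Python sketch that walks the flag table above and lists every suffix chain it would allow. This is not Hunspell functionality, only an illustration of the proposal. The names `CONTINUATIONS`, `suffix_chains` and `all_chains` are made up for this example, and the cap of three derivational suffixes before the verb ending is an assumption that can be changed.

```python
# Continuation table copied from the flag sketch above: each flag maps to its
# suffix text and to the flags allowed to follow it.
# "v" stands for the verb conjugation endings and closes a chain.
CONTINUATIONS = {
    "A": ("ri", "BCDv"),
    "B": ("ra", "Dv"),
    "C": ("ku", "Bv"),
    "D": ("gri", "v"),
}


def suffix_chains(flag, max_len=3):
    """Yield every chain of suffixes that starts at `flag` and ends at "v".

    `max_len` caps the number of suffixes before the verb ending
    (an assumed limit, not something taken from Hunspell).
    """
    text, followers = CONTINUATIONS[flag]
    for nxt in followers:
        if nxt == "v":
            # The verb ending may follow directly, so the chain is complete.
            yield [text]
        elif max_len > 1:
            for rest in suffix_chains(nxt, max_len - 1):
                yield [text] + rest


def all_chains(max_len=3):
    """Collect the chains starting from every flag in the table."""
    chains = []
    for flag in CONTINUATIONS:
        chains.extend(suffix_chains(flag, max_len))
    return chains


if __name__ == "__main__":
    for chain in all_chains():
        print("stem+" + "+".join(chain) + "+v")
```

The output includes stem+ri+gri+v (as in kuyaRIgriNI) and stem+ra+gri+v, while an order such as gri+ri never appears because D only continues to v.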
I suggest using only one suffix for the inflectional suffix combinations. An agglutinative language needs either a redundant dic file with a lot of word forms carrying derivational suffixes (or suffix combinations), or the second "suffix" of Hunspell. In fact, the main role of the second affixes of Hunspell is to store these derivational suffixes. By the way, there is an old Bolivian Quechua dictionary here:
It contains 18 thousand (inflectional?) suffix combinations (and a script to generate them). This dictionary or the script can be optimized to use second suffixes, too.
I agree. If any infix could be used with any verb and in any combination, multifold suffix stripping would be practical. But since
-not all combinations are allowed
-not all orders are allowed,
it's probably best to write a script to do
etc. for all the valid combinations, and flag them (e.g. /v for a verb), so that Hunspell can conjugate and compound after that.
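A rough Python sketch of such a script, under stated assumptions: the whitelist `VALID_COMBINATIONS` is hypothetical and must be replaced by the real list of allowed combinations, the infinitive is assumed to end in -na with the infixes placed between stem and ending, and the function names are invented for this example. Only the /v flag and the suffixes ri and gri come from the discussion above.

```python
# Hypothetical whitelist of allowed infix combinations, in their allowed order.
# The empty tuple keeps the plain verb itself.
VALID_COMBINATIONS = [
    (),
    ("ri",),
    ("gri",),
    ("ri", "gri"),
]


def expand_verb(infinitive, combinations=VALID_COMBINATIONS, flag="v"):
    """Return one flagged .dic entry per allowed infix combination.

    Assumes the infinitive ends in "na" and that infixes sit between
    the stem and that ending.
    """
    if not infinitive.endswith("na"):
        raise ValueError(f"expected an infinitive ending in -na: {infinitive!r}")
    stem = infinitive[:-2]
    return [f"{stem}{''.join(combo)}na/{flag}" for combo in combinations]


def format_dic(entries):
    """Format entries as .dic text; the first line holds the entry count."""
    return "\n".join([str(len(entries)), *entries]) + "\n"


if __name__ == "__main__":
    print(format_dic(expand_verb("kuyana")), end="")
```

Each generated entry carries the verb flag, so the ordinary conjugation rules in the .aff file still apply to it and no threefold stripping is needed.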
For the first release of the Kichwa hunspell dictionary I followed your suggestions and created a script to rewrite the .dic file. I use a .dic.MASTER file with hunspell-style flags to control the rewriting, in part because hunspell might support multi-fold affix stripping in the future. Another benefit is that the original .dic file needs hardly any rewriting.
All lines ending with the flag "v" are considered verbs, so a
kuyana//r>+-,whv in the .dic.MASTER file will be written as
and so on, to the resulting .dic file, while a less morphing(?) verb:
The first release including the script is located at