#107 COMPOUNDRULE and affixed form with multiple stems

open
None
5
2009-03-16
2009-03-16
No

It looks like COMPOUNDRULE considers only one stem when an affixed form has multiple possible stems.

In the example below, the affixed form "gan" can be derived from "ga" (adding "n") or from "gal" (replacing the final "l" with "n"). But the compound "seogan" is rejected when "gal" is also in the DIC file.

$ cat comp.aff
COMPOUNDMIN 1
COMPOUNDRULE 1
COMPOUNDRULE AB

SFX C Y 2
SFX C 0 n [^l]
SFX C l n l
$ cat comp.dic
3
seo/A
ga/BC
gal/C
$
$ hunspell -m -d comp
gan
gan st:gal fl:C
gan st:ga fl:C

$ hunspell -d comp
Hunspell 1.2.8
gan
+ gal

seoga
-

seogal
& seogal 3 0: seo gal, seoga, seo

seogan
& seogan 3 0: seo gan, seoga, seo

$
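
To make the ambiguity in the session above easier to follow, here is a small Python sketch that applies the two SFX C rules by hand and then checks the COMPOUNDRULE AB pattern. It is a toy model of the data in comp.aff and comp.dic, not Hunspell's actual algorithm; the names `analyses`, `matches_rule_ab` and the `only_stem` parameter are made up for illustration. Restricting the check to the stem "gal" is an assumption about what might be happening internally, inferred only from the output above.

```python
import re

# SFX C rules copied from comp.aff as (strip, add, condition).
SFX_C = [("", "n", "[^l]"), ("l", "n", "l")]

# Entries copied from comp.dic as stem -> flags.
DIC = {"seo": "A", "ga": "BC", "gal": "C"}


def analyses(form):
    """Return every (stem, flags) pair that can produce `form`,
    either as the bare stem or through one SFX C rule."""
    found = []
    for stem, flags in DIC.items():
        if stem == form:
            found.append((stem, flags))
        if "C" not in flags:
            continue
        for strip, add, cond in SFX_C:
            # The condition must match the end of the stem.
            if not re.search(cond + "$", stem):
                continue
            if strip and not stem.endswith(strip):
                continue
            base = stem[: len(stem) - len(strip)] if strip else stem
            if base + add == form:
                found.append((stem, flags))
    return found


def matches_rule_ab(word, only_stem=None):
    """Toy check of COMPOUNDRULE AB: some split of `word` must give a
    first part with flag A and a second part with flag B.
    `only_stem` (hypothetical) limits the second part to a single stem,
    to mimic an analysis that looks at one candidate only."""
    for i in range(1, len(word)):
        left_ok = any("A" in flags for _, flags in analyses(word[:i]))
        right = analyses(word[i:])
        if only_stem is not None:
            right = [(s, f) for s, f in right if s == only_stem]
        if left_ok and any("B" in flags for _, flags in right):
            return True
    return False


print(analyses("gan"))                              # both stems are found
print(matches_rule_ab("seoga"))                     # accepted
print(matches_rule_ab("seogal"))                    # rejected: "gal" has no B flag
print(matches_rule_ab("seogan"))                    # accepted when all stems are tried
print(matches_rule_ab("seogan", only_stem="gal"))   # rejected, as in the report
```

In this model "seogan" is accepted as long as the stem "ga" (flags BC) is among the candidates, and rejected only when the check is limited to "gal" (flag C), which matches the behaviour described in this report.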

Discussion

  • Németh László

    • assigned_to: nobody --> nemethl
     
  • Németh László

    I suggest using the COMPOUNDFLAG, COMPOUNDBEGIN, etc. flags combined with CHECKCOMPOUNDPATTERN instead of COMPOUNDRULE in this case.

     
  • Changwoo Ryu

    Changwoo Ryu - 2009-03-16

    If I understand the documentation correctly, CHECKCOMPOUNDPATTERN defines forbidden compound patterns using forbidden characters and (optionally) flags. But there are no such forbidden characters between the words in my case (Korean main verb + auxiliary verb compounds). Also, three different compound rules need to be defined for the different types of auxiliary verbs.

    Can I use CHECKCOMPOUNDPATTERN without forbidden characters, but only with forbidden flags?

     
  • Németh László

    > Can I use CHECKCOMPOUNDPATTERN without forbidden characters, but only with
    > forbidden flags?

    Yes, you can. (Unfortunately, there is no null value for the CHECKCOMPOUNDPATTERN fields yet, so you need to use patterns or flags.)