#227 hfst-lexc and multichars with zeros

Milestone: future
Status: open
Owner: nobody
Labels: lexc (6)
Priority: 1
Updated: 2014-06-30
Created: 2014-02-05
Private: No

I think this is a regression in hfst-3.6.0. The test in test/tools/basic.multichar-symbol-with-0.lexc works in Xerox lexc but fails in foma and in hfst-lexc. If you add % before every zero, it works on hfst-3.5.1, and possibly on foma.


$ cat a0.lexc 
Multichar_Symbols A%0

LEXICON Root
A%0 # ;
aA%0 # ;
A%00 # ;
tpirinen@hippu4 /fs/lustre/wrk/tpirinen (1012) [03:48:05] 
$ hfst-lexc a0.lexc | hfst-fst2strings 
hfst-lexc: warning: Defaulting to foma type (since it has native lexc support)

A0
aA0
tpirinen@hippu4 /fs/lustre/wrk/tpirinen (1013) [03:48:12] 
$ vim a0.lexc 
tpirinen@hippu4 /fs/lustre/wrk/tpirinen (1014) [03:48:25] 
$ hfst-lexc a0.lexc | hfst-fst2strings 
hfst-lexc: warning: Defaulting to foma type (since it has native lexc support)

aA
A
tpirinen@hippu4 /fs/lustre/wrk/tpirinen (1015) [03:48:28] 
$ cat a0.lexc 
Multichar_Symbols A0

LEXICON Root
A0 # ;
aA0 # ;
A00 # ;

Versus:


$ cat a0.lexc 
Multichar_Symbols A%0

LEXICON Root
A%0 # ;
aA%0 # ;
A%00 # ;

flammie@zmey ~/Koodit/omorfi (556) [03:48:12] 
$ hfst-lexc a0.lexc | hfst-fst2strings
hfst-lexc: warning: Defaulting to OpenFst tropical type
A@ZERO@
aA@ZERO@

While automagically guessing which zeros belong to a multichar symbol is rather hard to get right, quite a bit of code in the repos relies on it and on other such quirks of lexc, so they will need to be faithfully maintained in hfst-lexc until we can upgrade to better formalisms...
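To make the expected behaviour concrete, here is a minimal Python sketch of the tokenization this report assumes. It is not HFST, foma or Xerox code; `tokenize` and `unescape` are hypothetical helpers. It assumes that declared multichar symbols are matched first (longest match wins), that `%0` is a literal zero, and that a bare `0` outside any declared symbol is epsilon.

```python
def unescape(raw):
    """Drop lexc %-escapes, so 'A%0' becomes 'A0' (hypothetical helper)."""
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "%" and i + 1 < len(raw):
            out.append(raw[i + 1])
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)


def tokenize(entry, multichars):
    """Split one lexc entry into symbols (illustrative sketch, not HFST code)."""
    # Assumption: longer declared symbols are tried first (longest match).
    declared = sorted(multichars, key=len, reverse=True)
    symbols = []
    i = 0
    while i < len(entry):
        match = next((m for m in declared if entry.startswith(m, i)), None)
        if match is not None:
            # A declared multichar symbol wins over the epsilon reading of 0.
            symbols.append(unescape(match))
            i += len(match)
        elif entry[i] == "%" and i + 1 < len(entry):
            # %x is the literal character x, so %0 is a literal zero.
            symbols.append(entry[i + 1])
            i += 2
        elif entry[i] == "0":
            # A bare zero outside any declared symbol is epsilon: emit nothing.
            i += 1
        else:
            symbols.append(entry[i])
            i += 1
    return symbols


# The three entries from the unescaped a0.lexc above, with A0 declared.
for entry in ["A0", "aA0", "A00"]:
    print("".join(tokenize(entry, ["A0"])))  # prints A0, aA0, A0
```

Under these assumptions the escaped and unescaped versions of a0.lexc yield the same two distinct strings, A0 and aA0, whereas dropping the declaration turns the entry A0 into plain A, which is the behaviour reported here.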

Discussion

  • Senka Drobac - 2014-02-06

    Fixed in #3707

     
  • Senka Drobac - 2014-02-06
    • status: open --> closed
     
  • Flammie Pirinen - 2014-06-30
    • status: closed --> open
     
  • Flammie Pirinen - 2014-06-30

    It was noted on IRC today that A0 in hfst-lexc still results in A even with multichar A0 declared. It seems that the tests ensuring the functionality have been commented out.

     
  • sjurum - 2014-06-30

    This bug is still there and has not been fixed (version 3.7.1, revision 3945). It creates problems for keeping parity between Xerox and HFST in the Giellatekno/Divvun infrastructure.

     