
#206 Replace behaviour

3.4
closed
None
5
2014-01-29
2013-10-21
No

I'm not sure whether this is a bug in hfst-regexp2fst, in the Xerox replace rules, or something else, but this can't be right:

echo "a+ [\"\" -> \">\" ]" | hfst-regexp2fst > aa.hfst
echo "abbb" | hfst-lookup aa.hfst

abbb a>b>b>b>

Here the idea is to add a ">" after one or more "a", but a ">" also appears after every other character. Also:

echo "a+ [\"\" @-> \">\" ]" | hfst-regexp2fst > aa_longest.hfst

produces an empty transducer.

Discussion

  • Senka Drobac

    Senka Drobac - 2013-10-24

    The first command gives the expected result. The rule
    "" -> ">"
    (or [..] -> ">" in xfst and foma)

    replaces every epsilon with ">" exactly once, and it finds an epsilon before and after every symbol.
    If you want to insert something with replace rules, you could use a markup rule, e.g.:
    echo 'a+ -> ... ">"' | hfst-regexp2fst | hfst-fst2txt

    However, the second command should not give an empty transducer; I'll fix that.
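
The difference between the epsilon rule and the markup rule described in this comment can be illustrated with a small sketch. This is a plain-Python emulation on strings, not HFST code: the function names are hypothetical, and it assumes that juxtaposition in the original expression denotes concatenation (as in xfst-style regular expressions), so that `a+` consumes the leading "a" and the replace rule applies to the remaining "bbb".

```python
import re


def insert_at_every_position(s, marker=">"):
    """Emulate `"" -> ">"`: put one marker at each epsilon position,
    i.e. before the first symbol and after every symbol.
    (Hypothetical helper, not part of HFST.)"""
    return marker + "".join(ch + marker for ch in s)


def markup_after_a_runs(s, marker=">"):
    """Emulate the markup rule `a+ -> ... ">"`: keep each run of "a"
    and append the marker after it.
    Assumption: Python's greedy match always takes the longest run,
    which may differ from the plain `->` rule on inputs with several
    consecutive "a" symbols."""
    return re.sub(r"a+", lambda m: m.group(0) + marker, s)


if __name__ == "__main__":
    # Reading `a+ ["" -> ">"]` as a concatenation: "a" is matched by a+,
    # and the epsilon rule rewrites the rest of the input.
    print("a" + insert_at_every_position("bbb"))
    # The markup rule only touches the run of "a".
    print(markup_after_a_runs("abbb"))
```

Under these assumptions, the first call reproduces the output reported in the ticket for "abbb", while the markup emulation adds a single ">" directly after the "a".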

     
  • Senka Drobac

    Senka Drobac - 2014-01-29
    • status: open --> closed
     
  • Senka Drobac

    Senka Drobac - 2014-01-29

    Fixed in revision 3692.

     