#127 apertium-transfer doesn't recognise lowercase lemmas

open
nobody
2017-10-10
2017-10-10
No

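Bug or lost feature in apertium-transfer (t1x) concerning the case of the @lemma attribute of the cat-item element: a cat-item with a capitalised lemma (Jahrzehnt) does not match, while the same lemma written in lower case does. Input from the pipeline up to the transfer stage: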

^seit<cnjsub>/kun<cnjsub>$ ^Jahrzehnt<n><nt><pl><dat>/vuosikymmen<n><pl><dat>$ ^bevor<preadv>/ennen<post>$

trying to match:

    <def-cat n="seit">
      <cat-item lemma="seit" tags="pr.*"/>
      <cat-item lemma="seit" tags="pr.dat"/>
      <cat-item lemma="seit" tags="cnjsub"/> <!-- XXX: bad disam -->
    </def-cat>
    <def-cat n="bevor">
      <cat-item lemma="bevor" tags="preadv"/>
    </def-cat>
    <def-cat n="zeitwort">
      <cat-item lemma="Jahr" tags="n.*"/>
      <cat-item lemma="Jahrzehnt" tags="n.*"/>
      <!--<cat-item lemma="jahrzehnt" tags="n.*"/>-->
    </def-cat>

in rule:

    <rule comment="seit ZEIT bevor: AIKOIHIN">
      <pattern>
        <pattern-item n="seit"/>
        <pattern-item n="zeitwort"/>
        <pattern-item n="bevor"/>
      </pattern>
      <action>
        <call-macro n="case-mangler">
          <with-param pos="2"/>
        </call-macro>
        <call-macro n="number-mangler">
          <with-param pos="2"/>
        </call-macro>
        <out>
          <chunk name="AdvP" case="caseFirstWord">
            <tags>
                <tag><lit-tag v="ADV"/></tag>
            </tags>
            <lu>
              <clip pos="2" side="tl" part="lem"/>    <!-- Jahr ~ vuosi -->
              <clip pos="2" side="tl" part="a_noun"/> <!-- n -->
              <var n="number"/>                       <!-- sg / pl -->
              <lit-tag v="ill"/>                      <!-- ill -->
            </lu>
          </chunk>
        </out>
      </action>
    </rule>

The rule does not fire with this pipeline:

$ cat modes/deu-fin-transfer.mode

    lt-proc -w -e '/home/tpirinen/github/flammie/apertium-fin-deu/deu-fin.automorf.bin' | cg-proc -w1n '/home/tpirinen/github/flammie/apertium-fin-deu/deu-fin.rlx.bin' | apertium-pretransfer| lt-proc -b '/home/tpirinen/github/flammie/apertium-fin-deu/deu-fin.autobil.bin' | apertium-transfer -c -b '/home/tpirinen/github/flammie/apertium-fin-deu/apertium-fin-deu.deu-fin.t1x'  '/home/tpirinen/github/flammie/apertium-fin-deu/deu-fin.t1x.bin' 

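See the output here; the noun comes out as a default nP chunk with <gen> instead of the AdvP chunk with <ill> that the rule should produce: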

$ apertium -d . deu-fin-transfer < texts/dw.de-Langsam-gesprochene-Nachrichten-2017-10-10.text | fgrep Liberia
^default<default>{^kun<cnjsub>$}$ ^nP<N><FOOFOO>{^vuosikymmen<n><pl><gen>$}$ ^default<default>{^ennen<post>$}$

Uncommenting the lower-case lemma (jahrzehnt) in the zeitwort def-cat of the t1x file fixes the problem.
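
For reference, the workaround amounts to listing the lemma in both capitalisations in the def-cat. This is only a sketch of what worked in this report. I have not tested whether other capitalised lemmas (e.g. Jahr) need the same duplicate, so that entry is left commented out and marked as an assumption.

```xml
<def-cat n="zeitwort">
  <cat-item lemma="Jahr" tags="n.*"/>
  <cat-item lemma="Jahrzehnt" tags="n.*"/>
  <!-- workaround: lower-case duplicate, with which the rule matches -->
  <cat-item lemma="jahrzehnt" tags="n.*"/>
  <!-- assumption (untested): Jahr may need the same duplicate -->
  <!-- <cat-item lemma="jahr" tags="n.*"/> -->
</def-cat>
```

The capitalised entry is kept so that the category keeps working unchanged once the case handling in apertium-transfer is fixed.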
