actual output:
$ echo '^omstende<n><nt><ut><sg><ind>$'|lt-proc -d nno/nno.autogen.bin
#omstende\<n\>\<nt\>\<sg\>\<ind\>
expected:
$ echo '^omstende<n><nt><ut><sg><ind>$'|lt-proc -d nno/nno.autogen.bin
#omstende\<n\>\<nt\>\<ut\>\<sg\>\<ind\>
where a correct analysis is e.g.
$ echo '^omstende<n><nt><sg><ind>$'|lt-proc -d nno/nno.autogen.bin
omstende
example fst
Diff:
It looks like <ut> is just being deleted there:
$ echo '^omstende<n><ut><nt><sg><ind>$' |lt-proc -d /tmp/nno.autogen.bin
omstende\<n>\<nt>\<sg>\<ind>
$ echo '^omstende<n><ut><ut><sg><ind>$' |lt-proc -d /tmp/nno.autogen.bin
omstende\<n>\<sg>\<ind>
$ echo '^omstende<n><ut><sg><ind>$' |lt-proc -d /tmp/nno.autogen.bin
omstende\<n>\<sg>\<ind>
$ echo '^omstende<n><ut><sg><ind><foo><bar>$' |lt-proc -d /tmp/nno.autogen.bin
#omstende\<n>\<sg>\<ind>
That's because there is no <ut> symbol in the binary[1]. Maybe retitle this issue to say that tags that aren't in the dictionary should be marked somehow?
[1] lt-print /tmp/nno.autogen.bin |grep '<ut>' gives nothing, at least
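The check in [1] can be extended to every tag in a lexical unit. The following is only a sketch: missing_tags is a hypothetical helper, not part of lttoolbox, and it assumes that the lt-print dump writes tags literally as <tag>, which the grep in [1] suggests but which hasn't been verified here.

```shell
# missing_tags LU DUMP_FILE
# Print every <tag> occurring in LU that never appears in DUMP_FILE.
# DUMP_FILE is assumed to hold saved lt-print output (tags written as <tag>).
missing_tags() {
    printf '%s\n' "$1" |
        grep -o '<[^<>]*>' |          # one tag per line
        sort -u |                     # check each distinct tag once
        while IFS= read -r tag; do
            # -F: match the tag as a fixed string, not as a regex
            grep -qF -- "$tag" "$2" || printf '%s\n' "$tag"
        done
}
```

To use it, save the dump first with lt-print /tmp/nno.autogen.bin > /tmp/nno.att (the .att path is just an example), then run missing_tags '^omstende<n><nt><ut><sg><ind>$' /tmp/nno.att. If the assumption about the dump format holds, any tag it prints is one the generator has no symbol for.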
I was expecting the debug output to be exactly the input for the #-marked LUs. The string
#omstende\<n>\<sg>\<ind>
isn't anywhere (not in the input, not in the bin), so it's rather unhelpful (and makes it hard to do testvoc using -d output).
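The expectation can be written down as a small check that a testvoc script could run on each #-marked line. This is a sketch under assumptions: lu_roundtrips is a hypothetical helper, and it treats every backslash in the -d output as an escape for the character that follows it, which may be cruder than what lt-proc actually does.

```shell
# lu_roundtrips INPUT OUTPUT
# Succeed if a '#'-marked OUTPUT, once unescaped, is the same lexical
# unit as INPUT (INPUT is given in ^...$ form).
lu_roundtrips() {
    in=${1#^}                # drop the leading ^
    in=${in%\$}              # drop the trailing $
    out=${2#\#}              # drop the leading # marker
    # Assumption: a backslash only ever escapes the next character.
    out=$(printf '%s' "$out" | sed 's/\\\(.\)/\1/g')
    [ "$in" = "$out" ]
}
```

With the outputs quoted at the top of this report, the "expected" line passes this check for the input '^omstende<n><nt><ut><sg><ind>$' and the "actual" line fails it, because <ut> has been dropped.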