[hfst-development] Lookup in HFST3

Hi,

I'm looking into restoring the lookup functionality that hasn't made it over
from HFST2 to libhfst yet, but I'm not sure what philosophy the library
should be following. Should lookup/analysis functions (and support functions
for tokenizing input strings) from the backend libraries be driving the
lookup? SFST and foma both have such functions exposed, while OpenFST does
not directly. This approach leads to considerable variance in the lookup
operation between backends since, e.g., foma honors flag diacritics in its
lookup while SFST does not. So would it instead be preferred to follow
HFST2 in using HFST-specific methods for performing lookups and input string
tokenization?
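
To make the flag diacritic point concrete, here is a toy Python sketch of the
divergence I mean. It is not code from foma, SFST, or HFST; the helper names
(path_is_valid, surface) and the example paths are made up for illustration,
and it only covers the P, R, D and U flag operations.

```python
import re

# Matches flag diacritics such as @P.CASE.NOM@ or @R.CASE@ (P, R, D, U only).
FLAG = re.compile(r"^@([PRDU])\.([^.@]+)(?:\.([^.@]+))?@$")


def path_is_valid(symbols):
    """Return True if the flag diacritics along one path are consistent."""
    features = {}
    for sym in symbols:
        m = FLAG.match(sym)
        if m is None:
            continue  # ordinary symbol, no constraint to check
        op, feat, val = m.groups()
        current = features.get(feat)
        if op == "P":
            # Positive set: overwrite the feature value.
            features[feat] = val
        elif op == "R":
            # Require: feature must be set (to val, if a value is given).
            if current is None or (val is not None and current != val):
                return False
        elif op == "D":
            # Disallow: fail if feature is set (to val, if a value is given).
            if current is not None and (val is None or current == val):
                return False
        elif op == "U":
            # Unify: set if unset, otherwise the values must agree.
            if current is None:
                features[feat] = val
            elif current != val:
                return False
    return True


def surface(symbols):
    """Drop flag diacritics and join the remaining symbols."""
    return "".join(s for s in symbols if FLAG.match(s) is None)


# Two hypothetical paths a lookup might find for some input.
paths = [
    ["@P.CASE.NOM@", "c", "a", "t", "@R.CASE.NOM@"],
    ["@P.CASE.GEN@", "c", "a", "t", "s", "@R.CASE.NOM@"],
]

# A lookup that honors flags filters out the inconsistent path ...
honoring = [surface(p) for p in paths if path_is_valid(p)]
# ... while one that treats flags like epsilons keeps both.
ignoring = [surface(p) for p in paths]
```

With the same transducer and the same input, the two lookups return different
result sets, which is the kind of backend-dependent behaviour I'd like to
avoid exposing to users.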

I'm also wondering what design decisions have been made regarding the
role of HFST2's Symbol and Key layers in the new library version. The code
currently seems to contain leftover traces of key table usage that has since
been removed.
And what about HfstTransducer's is_trie member variable? Does it have any
relation to the Trie class in HfstTokenizer.h?

Thanks for the help,
--Brian Croom