This command:
echo '0226 04 0-6 0–7–90–90 07 1177 1182 11 12-14 12 13 13 14-15 148 1-4 1.-4 1500- 1600 15 1600 17 1814 1840 1848 1-8 18 1900 1902 1924 1956 1959 1963 1967 1970 1972-1982 1973 1975 1976 1979 1980-81 1983 1985 19870612-056 1987 1988 1991 1997-4 19980717061 1998 1999 1 2000 2002 20050617-064 2005 2007-13 2007 2009-2013 200 2011 30 2012-06 2013 2014 2016 204 207 2.2 25 267 2.-6 27 752 000 29 2 30 32 350 3'|hfst-proc smj-sme.automorf-untrimmed.hfst |wc -l
makes hfst-proc eat RAM without ever producing any output (well, I didn't try letting it eat my swap), while hfst-lookup seems to work (40961 analyses after 17 s of CPU time; Xerox lookup somehow takes <2 s).
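To reproduce this without risking the machine, the command can be run under a CPU-time and memory cap. This is only a sketch: `run_capped` is a hypothetical helper name, the limits are arbitrary, and `ulimit -v` is not enforced on every platform.

```shell
# run_capped SECONDS KIB COMMAND [ARGS...]
# Runs COMMAND in a subshell with a CPU-time limit (seconds) and a
# virtual-memory limit (KiB), so a runaway process is stopped by the
# kernel instead of exhausting RAM. The limits apply only inside the
# subshell and do not affect the calling shell.
run_capped() {
  secs=$1
  kib=$2
  shift 2
  (
    ulimit -t "$secs"   # CPU seconds
    ulimit -v "$kib"    # virtual memory in KiB (may be unsupported on some systems)
    "$@"
  )
}
```

For example, `echo '0226 04 0-6' | run_capped 30 2000000 hfst-proc smj-sme.automorf-untrimmed.hfst | wc -l` would give hfst-proc at most 30 s of CPU time and roughly 2 GB of address space.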
hfst-optimised-lookup is much faster than Xerox lookup, but skips most analyses (which might be OK):
The issue in hfst-proc might be related to it trying to tokenise the input while it does the analysis. Because of all the spaces, the tokenisation is potentially very ambiguous.
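A rough way to see how ambiguous this could get: if every space may be either a token boundary or part of a multiword token, then n space-separated items allow up to 2^(n-1) segmentations. That is purely a combinatorial upper bound under that assumption, not a claim about what hfst-proc actually explores. The helper name `segmentation_bound` is made up for illustration.

```shell
# segmentation_bound STRING
# Prints the number of whitespace-separated items in STRING and the
# upper bound on segmentations, ASSUMING each gap between items can be
# either a boundary or token-internal (2 choices per gap).
segmentation_bound() {
  n=$(printf '%s\n' "$1" | wc -w)
  n=$((n))                        # normalise any padding in wc's output
  gaps=$(( n > 0 ? n - 1 : 0 ))   # n items have n-1 gaps between them
  echo "$n tokens, up to 2^$gaps segmentations"
}
```

Calling it on the string passed to `echo` in the command above gives the exponent for this particular input; the bound is printed symbolically because it overflows shell integer arithmetic for longer inputs.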
So the 40961 lines have something to do with the Apertium version of the analyser:
I guess it makes sense that hfst-proc's tokenisation can take a lot of memory if there are 40961 end states along with all those spaces.