morfologik-svn Mailing List for Morfologik

Revision: 154
          http://morfologik.svn.sourceforge.net/morfologik/?rev=154&view=rev
Author:   dawidweiss
Date:     2009-03-27 08:14:42 +0000 (Fri, 27 Mar 2009)

Log Message:
-----------
Added FSA version that works for me.

Added Paths:
-----------
    fsa/
    fsa/CHANGES
    fsa/INSTALL
    fsa/Makefile
    fsa/README
    fsa/TROUBLESHOOTING
    fsa/Times
    fsa/accent.cc
    fsa/accent.h
    fsa/accent_main.cc
    fsa/build_fsa.cc
    fsa/build_fsa.h
    fsa/builds_fsa.cc
    fsa/buildu_fsa.cc
    fsa/chkmorph.pl
    fsa/common.cc
    fsa/common.h
    fsa/compile_options.h
    fsa/de.acc
    fsa/de.lang
    fsa/de_morph_data.awk
    fsa/de_morph_infix.awk
    fsa/deguess.awk
    fsa/deguess.pl
    fsa/demorph.awk
    fsa/demorph.pl
    fsa/dump.cc
    fsa/filesel.tcl
    fsa/filesel.tcl.in
    fsa/find_irregular.awk
    fsa/find_irregular.pl
    fsa/fr.acc
    fsa/fr.lang
    fsa/fsa.h
    fsa/fsa_accent.1
    fsa/fsa_accent.exe
    fsa/fsa_build.1
    fsa/fsa_build.exe
    fsa/fsa_dump
    fsa/fsa_guess.1
    fsa/fsa_guess.5
    fsa/fsa_guess.exe
    fsa/fsa_hash.1
    fsa/fsa_hash.exe
    fsa/fsa_morph.1
    fsa/fsa_morph.5
    fsa/fsa_morph.exe
    fsa/fsa_prefix.1
    fsa/fsa_prefix.exe
    fsa/fsa_spell.1
    fsa/fsa_spell.exe
    fsa/fsa_ubuild.1
    fsa/fsa_ubuild.exe
    fsa/fsa_version.h
    fsa/fsa_visual.1
    fsa/fsa_visual.exe
    fsa/gendata.pl
    fsa/guess.cc
    fsa/guess.h
    fsa/guess_main.cc
    fsa/hash.cc
    fsa/hash.h
    fsa/hash_main.cc
    fsa/ie1
    fsa/jaccent-skeleton
    fsa/jguess-skeleton
    fsa/jmorph-skeleton
    fsa/jspell-skeleton
    fsa/jspell.el
    fsa/mkindex.cc
    fsa/mmorph23c.awk
    fsa/mmorph23c.pl
    fsa/morph.cc
    fsa/morph.h
    fsa/morph_data.awk
    fsa/morph_data.pl
    fsa/morph_infix.awk
    fsa/morph_infix.pl
    fsa/morph_main.cc
    fsa/morph_prefix.awk
    fsa/morph_prefix.pl
    fsa/nindex.cc
    fsa/nindex.h
    fsa/nnode.cc
    fsa/nnode.h
    fsa/nstr.cc
    fsa/nstr.h
    fsa/one_word_io.cc
    fsa/out
    fsa/pl.acc
    fsa/pl.chcl
    fsa/pl.lang
    fsa/prefix.cc
    fsa/prefix.h
    fsa/prefix_main.cc
    fsa/prep_atg.awk
    fsa/prep_atg.pl
    fsa/prep_ati.awk
    fsa/prep_ati.pl
    fsa/prep_atl.awk
    fsa/prep_atl.pl
    fsa/prep_atp.awk
    fsa/prep_atp.pl
    fsa/putinplace.pl
    fsa/simplify.pl
    fsa/snode.cc
    fsa/sortatt.pl
    fsa/sortondesc.pl
    fsa/spell.cc
    fsa/spell.h
    fsa/spell_main.cc
    fsa/tclmacq-help.txt
    fsa/tclmacq-lang.txt
    fsa/tclmacq.tcl
    fsa/tclmacq.tcl.in
    fsa/text_io.cc
    fsa/unode.cc
    fsa/unode.h
    fsa/visual_main.cc
    fsa/visualize.cc
    fsa/visualize.h

Property Changed:
----------------
    /


Property changes on: 
___________________________________________________________________
Modified: svn:ignore
   - fsa*

   + 


Added: fsa/CHANGES
===================================================================

--- fsa/CHANGES	                        (rev 0)
+++ fsa/CHANGES	2009-03-27 08:14:42 UTC (rev 154)
@@ -0,0 +1,339 @@
+Version 0.5:
+- Word length in programs using automata increased to 120.
+- Option `clean' provided in Makefile.
+- Option `-v' provided in all programs (gives version details).
+- Sorting of arcs on frequency in optimization phase of automaton creation.
+- Merging two nodes that share the same arc.
+- This file added.
+Version 0.6:
+- Option -v corrected.
+- fr.acc file added to distribution.
+- man pages provided.
+- Compilation options shown in -v in all programs.
+- Option -X provided in fsa_build (makes an index a tergo for word category
+  guessing).
+- New program - fsa_guess - added; it predicts word categories based on
+  word endings.
+- New program - fsa_hash - added; it is used for perfect hashing.
+- Option -i added to programs using automata; it specifies input files.
+- Option -l added to programs using automata; it provides information
+  on language specific features, such as which characters form words,
+  and on case conversions.
+- New module - text_io - provided that processes text files (many words
+  in line, punctuation, etc.), and gives grep-like output.
+Version 0.7:
+- In one_word_io, replacements are now separated by a comma and a space
+  (was: space only); this makes it possible to have a two-word
+  replacement for one word - in other words: now run-on words can be
+  corrected.
+- New compile option RUNON_WORDS added; if turned on, fsa_spell checks
+  for run-on words, i.e. it checks whether inserting a space somewhere
+  inside the word results in two correct words.
+- New compile option CHCLASS added; if turned on, a dedicated file
+  specifies equivalent sequences of characters, so that e.g. `rz' and
+  `z' with a dot above (\.z in TeX) may be only one edit distance unit
+  apart from each other.
+- Emacs interface for spelling correction added; it is an adaptation of
+  ispell.el.
+Version 0.8:
+- New program fsa_morph performs morphological analysis (but not generation).
+- Improved INSTALL guidelines.
+- README more up to date, obsolete data removed, better file list.
+- fsa_guess now guesses lexemes as well (with GUESS_LEXEMES).
+- awk scripts for data preparation.
+Version 0.9:
+- Corrected a bug that caused segment violations when using dictionaries
+  of different sizes, and thus preventing users from using personal
+  dictionaries.
+- fsa_guess now recognizes prefixes with GUESS_PREFIX option.
+- New options -g and -p for fsa_guess to simulate compile options.
+- Words and lines can now be of arbitrary length.
+- Binary search in leaf vectors of the register - this does speed up
+  processing considerably.
+- New compile option for creating an index a tergo: GENERALIZE; it gives
+  smallest automata sizes.
+- New compile option STATISTICS prints... wait for it... some statistics
+  in fsa_build.
+Version 0.10:
+- Corrected a bug in fsa_build that showed up when using PRUNE_ARCS options
+  while compiling an index a tergo.
+- Corrected a bug in fsa_guess that prevented the proper use of -g option.
+  Now -g and -p are independent.
+- Introduced a limit on the number of analyses in fsa_guess.
+- Introduced a limit on the depth of search for suffixes.
+- Corrected a bug in fsa_build man page.
+- Changed definitions of node and arc_node classes, so that the automaton
+  requires less memory than before (by a quarter).
+Version 0.11:
+- Corrected a bug in statistics.
+- Option -r added to the function usage() in fsa_spell.
+- Removed random inline in fsa.h.
+- Updated #ifdefs so that all #ifdef NUMBERS are enclosed in #ifdef FLEXIBLE.
+- Updated Makefile so that it contains description of NUMBERS
+- Corrected a bug in fsa_build that appeared while reading long input lines
+- Updated description of -v option for all programs
+- Corrected the effect of GENERALIZE option
+- Introduced -m option in fsa_guess (prediction of mmorph descriptions
+  of words based on inflected forms). mmorph is a morphology program
+  available from ISSCO, Geneva, http://www.issco.unige.ch/
+  or http://issco-www.unige.ch/
+- fsa_build is now faster.
+- Corrected a bug in PRUNE_ARCS option application.
+Version 0.12:
+- Added a new program: fsa_ubuild.
+Version 0.13:
+- Corrected a bug in fsa_ubuild that excluded some words from the automaton;
+  the bug was in the function already_there().
+- Added new program: fsa_visual.
+- Added an entry for version 0.12 in this file.
+Version 0.14:
+- Corrected a bug in Makefile (introduced in 0.12) - there was no rule
+  for making buildu_fsa.o.
+- Changed declarations in fsa.h to simplify their use.
+- Added perl scripts (awk scripts translated with a2p) for portability.
+- Corrected a bug in fsa_hash: -N did not work correctly.
+- fsa_visual uses manhattan edges.
+- Introduced a new compile option STOPBIT that changes the format
+  version, and makes automata smaller (by nearly 20% for large
+  automata).
+- Included more information on data preparation in README, and on
+  compile options in INSTALL.
+- Compiled the package on Solaris using g++ 2.6.0 to improve
+  portability (thanks Sabine).
+Version 0.15:
+- Corrected a bug in list.empty_list - a memory leak that could be a nuisance
+  with fsa_prefix operating on large data.
+- Corrected perl scripts.
+- Added new script: morph_infix.{awk,pl}. It prepares data for an automaton
+  to be used with fsa_morph for languages that have prefixes and infixes
+  (like German).
+- Added new compile option: MORPH_INFIX, and two new runtime options for
+  fsa_morph: -I and -P. They make it possible to use data prepared with
+  morph_infix.{awk,pl}.
+- Added new compile option: POOR_MORPH that enables -A option in
+  fsa_morph. That option enables morphological analysis giving only
+  categories, and no base form.
+- Added new script: morph_prefix.{awk,pl}. It prepares data for an automaton
+  to be used with fsa_morph for languages that have prefixes (like Polish).
+Version 0.16:
+- Corrected a memory leak bug in fsa_morph. Now fsa_morph works two orders 
+  of magnitude faster.
+- Corrected manual pages (format errors).
+Version 0.17:
+- Added new compile option for fsa_build and fsa_ubuild:
+  DESCENDING. If on, makes resulting automata smaller, but slower.
+- Improved morph_infix.{awk,pl}.
+- New option -F added to fsa_build and fsa_ubuild. It sets the filler
+  character.
+- New scripts added: prep_ati.{awk,pl}. They prepare data with coded infixes
+  and prefixes for guessing lexemes and categories using fsa_guess.
+- New scripts added: prep_atp.{awk,pl}. They prepare data with coded prefixes
+  for guessing lexemes and categories using fsa_guess.
+- fsa_hash now works correctly with the STOPBIT option.
+- corrected another bug in fsa_hash, which probably lingered there
+  from the beginning, and which made fsa_hash unusable for more than
+  256 words.
+Version 0.18:
+- Added new compile option MORE_COMPR that tries to get more compression
+  when using fsa_build or fsa_ubuild compiled with NEXTBIT.
+Version 0.19:
+- Added new compile option TAILS that enables compression of tails
+  (last transitions) of states.
+- Now MORE_COMPR also tries to squeeze some bytes without NEXTBIT.
+- Corrected a bug in Makefile introduced in 0.18 (one comment too
+  many).
+- Enriched documentation on options in INSTALL.
+- Corrected a bug in fsa_visual that showed up with variable size
+  arcs, i.e. NEXTBIT or TAILS.
+- Added a check on whether -O option should be used in fsa_visual and
+  supply it when necessary even when the user doesn't do that.
+- Added LOOSING_RPM compile option to circumvent a bug in g++ or
+  stdlibc++ found in new rpms (I have to use SuSE now, and I got
+  reports of the same bugs appearing on Red Hat, but no problems on
+  Debian). This does not solve all the problems - if they appear,
+  switch optimization off (remove -O2 from compile options).
+- Added a small program fsa_dump. It is not in Makefile, as it is not
+  tested yet. The source is in dump.cc. The program lists the contents
+  of an automaton as transitions.
+- Added scripts: de_morph_data.{awk,pl} and de_morph_infix.{awk,pl}
+  that produce the 3 column format out of data for fsa_build.
+- Added scripts: demorph.{awk,pl} that produce the 3 column format from
+  the output of fsa_morph.
+Version 0.20
+- Moved mark_inner() to nnode.cc, as it can be used without A_TERGO option.
+- Added an info on producing the contents of an automaton.
+- Fixed display of statistics for NEXTBIT and TAILS
+- Corrected placement of conditionals so that compilation without
+  FLEXIBLE is possible. But do use FLEXIBLE!
+- Added a Tcl script - an interface for fsa_guess as a tool for
+  acquisition of descriptions for a morphological dictionary.
+- Added additional information to -v option of all programs.
+- MORE_COMPR is now *much* faster; actually, it became usable.
+- Added a new perl script chkmorph.pl that removes those predictions
+  made by fsa_guess that cannot produce the required flectional form.
+- Added sortatt.pl perl script that sorts words on their
+  categories/features; it is used by the tcl/tk interface, and it is
+  specially useful when comparing output of two descriptions.
+- added gendata.pl - a perl script that generates data for guessing
+  morphological descriptions in mmorph format of unknown words.
+Version 0.21
+- Corrected some bugs in gendata.pl.
+- Added new compile option - WEIGHTED.
+- Corrected a bug in chkmorph.pl.
+- Corrected a bug in fsa_ubuild (thanks to Christen Blom - Dahl)
+- Totally rewritten GENERALIZE. I hope it provides better results.
+- Added new script sortondesc.pl that sorts morphological descriptions
+  of words so that the most probable come first. A description is
+  judged to be more probable when it appears in more words.
+- Corrected a horrible bug in fsa_spell that manifested itself when
+  the edit distance was set to 0. Program gave arbitrary results.
+- Tcl/Tk interface for lexical acquisition is now much more powerful.
+- Added a new script putinplace.pl that should put descriptions chosen
+  with the Tcl/Tk tool in their appropriate places.
+Version 0.22
+- Corrected conditional compilation so that it is now possible to
+  compile without MORE_COMPR.
+- Added guided correction (right mouse button on description) to the
+  dictionary acquisition tool. The interface is improved.
+- Added statistics to the dictionary acquisition interface.
+Version 0.23
+- In the Tcl/Tk tool, corrected output from mmorph matching so that if
+  all values of a feature are generated, nothing comes out, and when
+  no features are generated, the feature name is deleted from the
+  output.
+- In the Tcl/Tk tool, corrected deleting features using the right mouse
+  button menu.
+- Corrected the script chkmorph.pl so that no phony item appears at
+  the end (there is no dangling comma at the end).
+- Added a new option to ignore the filler character in morphology.
+- Corrected building a weighted guessing automaton. It still needs my
+  attention.
+Version 0.24
+- Corrected dropping one hypothesis in sortondesc.pl script.
+- Corrected a bug in fsa_build that made pointer size calculation invalid
+  (thanks to Gertjan van Noord).
+- Corrected a bug in fsa_spell for distances greater than 1 (thanks to
+  Jiri Andel).
+Version 0.25
+- Included perl and tcl scripts in installation in Makefile.
+- Corrected a bug in fsa_hash: null pointers were followed in word->number
+  conversion (thanks to Martin Povolny).
+Version 0.26
+- Included perl and tcl scripts deleted by mistake from 0.25.
+- Corrected Makefile so that it does not delete perl and tcl scripts
+  in make realclean.
+Version 0.27
+- Corrected a bug in Undo operation in Tcl/Tk interface.
+- Moved customization of tclmacq to Makefile.
+- Adapted tclmacq to new version of Tcl/Tk.
+Version 0.28
+- Corrected a bug in tclmacq (Tcl/Tk interface for dictionary
+  acquisition). Sorting was done before (and not after) expansion of
+  alternatives, which resulted in apparently random order.
+- Added some include directives needed in the most recent compilers
+  (thanks to Dawid Weiss).
+- Corrected setting the FILLER character in builds_fsa.cc (thanks to
+  Dawid Weiss).
+- Corrected usage info for dump.cc (thanks to Dawid Weiss).
+Version 0.29
+- Corrected a bug in simplify.pl (it produced duplicates).
+Version 0.30
+- Corrected a bug in fsa_morph. When one entry was a prefix of another
+  entry, the words were the same, but one annotation was shorter than
+  the other one, the longer entry was not printed (thanks to Gertjan
+  van Noord).
+Version 0.31
+- Corrected the use of one variable so that the package compiles with
+  the old set of options (thanks to Michael Daum).
+Version 0.32
+- The package now compiles under g++ 3.1.1.
+Version 0.33
+- jguess is again produced (thanks to Leonoor van der Beek)
+- Corrected fsa_hash so that words not in the dictionary return -1
+  and not a slash (thanks to Vinay Middha).
+- Added a file TROUBLESHOOTING describing the most common problems people have
+  while trying to install and use the package. As a bonus, I included some
+  solutions as well.
+- Added possibility of morphological analysis of words without tags, i.e.
+  stemming or lemmatization (thanks to Gertjan van Noord). Just remove
+  the last annotation separator (+) and anything that follows it from
+  the output of a script preparing morphological data.
+Version 0.34
+- States can have up to 255 (was: 127) outgoing transitions when
+  compiled with STOPBIT (thanks to Gertjan van Noord).
+- Closed memory leaks in handling of lists (thanks to Martin Povolny).
+Version 0.35
+- Corrected a bug introduced in the previous version (deleting the
+  wrong thing).
+Version 0.36
+- Corrected a bug in dynamic growth of strings read from input in
+  programs that use automata, i.e. not fsa_build nor fsa_ubuild
+  (thanks to Gertjan van Noord).
+Version 0.37
+- Replaced recursion with iteration in some programs, e.g. fsa_hash.
+  fsa_hash is now about 3.5 percent faster.
+Version 0.38
+- Introduced a "-a" runtime option to list the contents of the whole
+  dictionary. The updated glibc++ version I have now treats reading an
+  empty line as an error, so there is no way to learn if an empty line
+  was indeed read.
+- Introduced a new compile option DUMP_ALL to suppress printing the
+  leading space in fsa_prefix.
+- Corrected some type errors and vestiges of previous versions when
+  using DEBUG compile option (thanks to Nikolay Ketsaris).
+- Corrected dump.cc to print non-ASCII characters.
+Version 0.39
+- fsa_spell now compiles also without CHCLASS (thanks to Nikolay Ketsaris).
+- Added ios::binary in 3 places for the benefit of those who have the
+  misfortune of being forced to use the virus distribution system from
+  M$.
+- Corrected exit code in fsa_prefix when -a is used (thanks to Marcin
+  Miłkowski)
+- Corrected a bug in fsa_build and fsa_ubuild when -O was
+  used. Certain states were compressed "too much", i.e. comparison of
+  transitions did not work in part_cmp_nodes due to a modification
+  introduced several versions ago.
+- Changed the script ie1 to make it immediately useful for debugging
+  should anything unpredictable happen.
+- Removed the outrageously outdated file ToDo.
+Version 0.40
+- Corrected a bug in the initialization of the H_matrix in fsa_spell
+  (thanks to Guillaume Rousse).
+- Corrected a bug in ie1 (return value for fgrep)
+- Changed the way parameters are passed to most functions in programs
+  that use automata (passing first arc instead of the parent arc).
+  This might have introduced some new errors...
+- Added a parameter to fsa_spell to force the search for replacements
+  (thanks to Guillaume Rousse).
+- Added two new compile options. The first one -- SPARSE -- changes
+  the way the automaton is represented. If the option is used, then
+  most of transitions of the automaton are stored as a sparse
+  matrix. Only annotations (e.g. in morphological dictionaries) are
+  still stored as lists of transitions. The new representation is
+  faster for most tasks, but it takes longer to produce, and it is
+  larger. The option SLOW_SPARSE makes sure that we try to fill in
+  every hole in the sparse matrix, but it results in *VERY* slow
+  construction, and the results are practically the same.
+Version 0.41
+- Corrected a bug in fsa_ubuild that caused the FILLER not to be set (thanks
+  to Marco Baroni)
+Version 0.42
+- Corrected a compile error in mmorph.cc when MORPH_INFIX was undefined.
+- Corrected an error in fsa_prefix that gave infinite loops while
+  listing words with certain prefixes.
+- Corrected a bug that resulted from new glibc++ I/O behaviour (thanks
+  to Gertjan van Noord)
+- Changed the licence so that the package is freer than it used to be.
+Version 0.43
+- Corrected a bug in fsa_morph that never got a chance to manifest
+  itself there because of the way C++ initializes variables, but it
+  was a bug anyway (thanks to Jirka Mikulasek).
+Version 0.44
+- Corrected a bug in fsa_morph that was introduced in version 0.40 and
+  resulted in inability to process infixes (thanks to Marcin Milkowski).
+- Corrected a bug in fsa_guess that was introduced in version 0.40 and
+  resulted in inability to process infixes (thanks to Marcin Milkowski).
+Version 0.45
+- Corrected a bug in counting transitions with the next flag set that
+  resulted in incorrect pointer size in fsa_build/fsa_ubuild (thanks
+  to Marcin Milkowski and Dawid Weiss).

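Note on the RUNON_WORDS entry (Version 0.7) in the CHANGES file above: the
check it describes, whether inserting a space somewhere inside a word yields
two correct words, can be sketched as follows. This is only a minimal
illustration and not the code from spell.cc. It assumes a plain std::set of
words in place of the dictionary automaton, and the names Dictionary and
runon_splits are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the dictionary automaton: a plain set of words.
using Dictionary = std::set<std::string>;

// Return every way of splitting `word` into two dictionary words by
// inserting a single space, ordered by the position of the split point.
std::vector<std::pair<std::string, std::string>>
runon_splits(const Dictionary& dict, const std::string& word) {
    std::vector<std::pair<std::string, std::string>> result;
    // Try each interior position; both halves must be non-empty.
    for (std::size_t i = 1; i < word.size(); ++i) {
        const std::string left = word.substr(0, i);
        const std::string right = word.substr(i);
        if (dict.count(left) != 0 && dict.count(right) != 0) {
            result.emplace_back(left, right);
        }
    }
    return result;
}
```

With a dictionary containing "a" and "lot", runon_splits(dict, "alot") returns
the single pair ("a", "lot"). A real implementation would presumably walk the
automaton once rather than look up each half separately.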
Added: fsa/INSTALL
===================================================================
--- fsa/INSTALL	                        (rev 0)
+++ fsa/INSTALL	2009-03-27 08:14:42 UTC (rev 154)
@@ -0,0 +1,810 @@
+1. COMPILATION
+
+1.1. General
+
+  All programs are written in C++. You need a C++ compiler to compile them.
+  I have used GNU g++ 2.6.0 under SunOS 4.1.4, and later under
+  Solaris. This version was compiled with g++ 2.7.2.1. Previous versions
+  may have problems with templates. I had problems compiling this
+  version with Solaris CC - again, templates were to blame.
+
+  If you work under Unix, and you have a g++ compiler, a simple command:
+
+  make
+
+  should work. If you use a different compiler, append CXX=that_compiler to
+  the command line, e.g.:
+
+  make CXX=CC
+
+  If you use another operating system, and a different compiler, you should
+  have manuals for them. Consult them. Under the infamous so called
+  operating system from Microsoft, you should consider adding
+  ios::binary to the declaration of fstream dict(...) in file common.cc.
+
+  Note that the Emacs Lisp package works with Emacs 19.34, and it will
+  almost certainly not work with Emacs 20.
+
+1.2. Compile options
+
+  Before you jump to experiment with various options, or jump out
+  of the window on seeing how many options there are, please note
+  that a default set of them is provided in the Makefile, so do not worry.
+
+  Please note that you can see what options were used for compiling
+  a particular program by invoking it with -v option. When you change
+  compile options and recompile the programs, please do "make clean" first
+  - it may save you a lot of trouble.
+
+  There are some compile options that may be worth trying. First,
+  normal optimization for speed (done by the compiler):
+
+  CPPFLAGS=-O
+  or
+  CPPFLAGS=-O2
+
+  Then there are options used for conditional compilation of the source
+  code. They are specified in CPPFLAGS with -D, e.g. use
+
+  make CPPFLAGS=-DJOIN_PAIRS
+
+  to compile the programs with JOIN_PAIRS option on. To specify more
+  options, put them into quotes, e.g.:
+
+  make CPPFLAGS='-DA_TERGO -DSORT_ON_FREQ'
+
+  Be careful when specifying MORE_COMPR. The construction time may
+  rise dramatically when you use -O run-time option of fsa_build or
+  fsa_ubuild. That time is spent not on construction itself, but
+  rather on reordering the arcs, and trying to match them.
+
+  In the following descriptions, the following fields are used:
+  Assumes: those options must be defined.
+  Excludes: those options cannot be defined.
+  Used in: this option is given to programs in the list.
+  Affects: this option changes the output of programs in the list.
+
+1.2.1. Options changing format version number
+
+  In the present version, there are 6 numbered format versions: 0, 1, 2,
+  4, 5, and 128 (or -127, or 0x80). For differences between these formats
+  see file fsa.h. The formats correspond to different settings of the
+  following compile options: LARGE_DICTIONARIES, FLEXIBLE, STOPBIT, NEXTBIT,
+  TAILS, SPARSE. In the following table, LARGE_DICTIONARIES appears as L_D.
+
+  	L_D	FLEXIBLE	STOPBIT	NEXTBIT	    TAILS   WEIGHTED	SPARSE
+  0:	-	-		-	-	    -	    -		-
+  1:	-	+		-	-	    -	    -		-
+  2:	-	+		-	+	    -	    -		-
+  4:	-	+		+	-	    -	    -		-
+  5:	-	+		+	+	    -	    -		-
+  6:	-	+		+	-	    +	    -		-
+  7:	-	+		+	+	    +	    -		-
+  8:	-	+		+	+	    -	    +		-
+  9:	-	+		+	-	    -	    -		+
+  10:	-	+		+	+	    -	    -		+
+  11:	-	+		+	-	    +	    -		+
+  12:	-	+		+	+	    +	    -		+
+  128:	+	-		-	-	    -	    -		-
+
+  Note that in order to produce an automaton in format version 8, -W
+  runtime option must be given to fsa_build or fsa_ubuild. Otherwise
+  version 5 will be produced.
+
+
+  FLEXIBLE
+  makes it possible to produce dictionaries (automata) tailored to
+  particular needs. The size of arcs is determined dynamically. This
+  should be on, as the old way gives (usually) bigger
+  dictionaries. This option also makes the automata portable - another
+  reason for using it. I may remove inflexible code from the future
+  versions of this package.
+  Assumes: no options.
+  Excludes: LARGE_DICTIONARIES.
+  Used in: all programs.
+  Affects: fsa_build, fsa_ubuild.
+  When to use: always.
+
+  LARGE_DICTIONARIES
+  This is an old option to be used without FLEXIBLE when the automaton
+  gets too big. Note that FLEXIBLE makes it possible to produce
+  dictionaries of any size while making them as small as possible, so
+  you do not need this LARGE_DICTIONARIES. I am not sure whether it
+  still works.
+  Assumes: no options.
+  Excludes: FLEXIBLE, STOPBIT, NEXTBIT, NUMBERS.
+  Used in: all programs.
+  Affects: fsa_build, fsa_ubuild.
+  When to use: never.
+
+  NEXTBIT
+  introduces a 1b flag that is set when the target of the arc
+  is placed right after the current one in the automaton, and cleared
+  otherwise. When the flag is set,
+  the go_to field, i.e. the address of the node to which this arc
+  points, is dropped - only the (1 byte) part that contains
+  the flag is kept. This usually produces smaller automata, as there
+  are frequently chains of nodes one following another, and for the
+  arcs of those nodes it is not necessary to store the whole addresses
+  of the next nodes in those chains. However, since the nodes are no
+  longer fixed size, and we have additional 1b flag that takes place
+  in the go_to field, the size of the resulting automaton may actually
+  be higher when the additional 2-3 bytes cross the byte boundary in
+  the go_to field. Also note that in order to increase the
+  compression, the numbering scheme is different from the usual one in
+  that it starts numbering the children from the last arc. This is
+  done in order to have more nodes lying just after the arc that
+  points to them.
+  Assumes: FLEXIBLE.
+  Excludes: LARGE_DICTIONARIES, JOIN_PAIRS.
+  Used in: all programs.
+  Affects: fsa_build, fsa_ubuild, fsa_prefix.
+  When to use: always.
+
+  STOPBIT
+  replaces counters that hold the number of arcs for each node with
+  one bit for each arc that says whether it is the last one in the
+  node. This gives smaller automata, although maybe a fraction of a
+  percent slower. Note that while automata produced with this option
+  are never larger than those produced without it, for some automata,
+  the size does not change. The reason is that 1-bit markers have to
+  find room in the goto bytes, and they may provoke crossing the byte
+  barrier.
+  Assumes: FLEXIBLE.
+  Excludes: LARGE_DICTIONARIES, JOIN_PAIRS.
+  Used in: all programs.
+  Affects: fsa_build, fsa_ubuild, fsa_prefix.
+  When to use: always.
+
+  TAILS
+  introduces a 1b flag that is set for a particular node when the tail
+  of that node (i.e. a number of arcs that are the last arcs of the
+  node) matches the tail of another node somewhere else in the
+  automaton. If the flag is set, then the present arc is followed by
+  the address of the isomorphic tail in another node in the
+  automaton. For example, if we have node A with arcs (a, c, d) (we
+  skip the addresses, and markers of finality for brevity), and a node
+  B with arcs (b, c, d), then in node B, we can have only an arc with
+  b, and a pointer to (c, d) from A. The arc b in B has the flag
+  set. Or we can do that the other way round, i.e. node A may contain
+  only arc a with the flag set, and the node B is written in
+  whole. Note that the flag takes space in the goto field, so it may
+  lead to an increase in space. However, it should normally produce
+  smaller automata. It always leads to longer construction times.
+  Assumes: FLEXIBLE, STOPBIT.
+  Excludes: LARGE_DICTIONARIES, JOIN_PAIRS.
+  Used in: all programs.
+  Affects: fsa_build, fsa_ubuild, fsa_prefix.
+  When to use: for static dictionaries after testing (for certain
+               sizes the automata can actually be bigger).
+
+  WEIGHTED
+  introduces weights in every arc. The weights are proportional to the
+  number of strings recognized in the part of the automaton reachable
+  via that arc. Weights take only one byte, so if the number of
+  strings is too large to fit into one byte, the weights on all arcs
+  of the parent node are decreased proportionally. This option
+  requires more memory during construction process, and automata are
+  larger (they may even contain multiple copies of isomorphic nodes,
+  but with different weights). However, this option makes it possible
+  to introduce probabilities to fsa_guess.
+  Assumes: FLEXIBLE, STOPBIT, NEXTBIT, A_TERGO.
+  Excludes: LARGE_DICTIONARIES, JOIN_PAIRS.
+  Used in: all programs.
+  Affects: fsa_build, fsa_ubuild, fsa_guess.
+  When to use: for adding new words to a morphological dictionary, for tagging.
+
+  SPARSE
+  introduces sparse matrix representation. If there is no annotation
+  separator in the strings, the entire automaton is stored using it
+  (except for some dummy data). If there are annotations, they are
+  stored in the traditional format (list of transitions). This option
+  gives fast recognition times, fast word to number
+  conversion (perfect hashing), but larger dictionaries, slow listing
+  of contents, slow number to word conversion, and slow search for
+  candidates in spelling correction, slow guessing, slow construction.
+  Assumes: FLEXIBLE, STOPBIT
+  Excludes: LARGE_DICTIONARIES, JOIN_PAIRS, WEIGHTED
+  Used in: all programs.
+  Affects: fsa_build, fsa_ubuild, fsa_prefix.
+  When to use: When the programs should be optimized for speed rather
+               than for size, number to word conversion speed is not
+	       critical to the system, and spelling correction is
+	       called mostly on correct words. See file `Times' for
+	       results of my experiments.
+
+1.2.2. Options changing format without changing format version number
+
+  NUMBERS
+  makes it possible to build automata that have word numbering
+  information in them, and to use them. That information is used by
+  fsa_hash. To build automata that have the numbering information in
+  them, use -N option of fsa_build. Note that when using -N, it is
+  not arcs, but bytes that are addressable, so we need 2 or usually 3
+  bits more for the goto field. This in turn may be translated into
+  increasing the arc size by one byte. Even when we have room for
+  those additional bits in the current byte frame, note that the
+  numbering information also takes space (as many bytes as it takes to
+  number all words stored in the automaton). You cannot use
+  compression (runtime option -O) with -N.
+  Assumes: FLEXIBLE.
+  Excludes: LARGE_DICTIONARIES.
+  Used in: all programs.
+  Affects: fsa_build, fsa_ubuild, fsa_hash, fsa_prefix.
+  When to use: if you use perfect hashing.
+
+1.2.3. Options changing the size of the automaton without changing the format
+
+  DESCENDING
+  makes the resulting automaton built with -O a bit smaller, but much slower.
+  Assumes: SORT_ON_FREQ.
+  Excludes: No options.
+  Used in: fsa_build, fsa_ubuild.
+  Affects: fsa_build, fsa_ubuild, fsa_prefix, fsa_hash.
+  When to use: if you want automata that are a bit smaller but slower to use.
+
+  JOIN_PAIRS
+  makes the resulting automaton smaller if you use fsa_build with "-O"
+  (the option of fsa_build, or fsa_ubuild, not the compiler). It works
+  by sharing one arc between two two-arc nodes, where possible.
+  Assumes: No options.
+  Excludes: STOPBIT, NEXTBIT.
+  Used in: fsa_build, fsa_ubuild.
+  Affects: fsa_build, fsa_ubuild, fsa_prefix, fsa_hash.
+  When to use: never.
+
+  MORE_COMPR
+  changes the order of arcs to get more compression. Requires more
+  memory. With -O, the execution time is much, much longer.
+  Assumes: NEXTBIT or STOPBIT.
+  Excludes: No options.
+  Used in: fsa_build, fsa_ubuild.
+  Affects: fsa_build, fsa_ubuild, fsa_prefix, fsa_hash.
+  When to use: for static dictionaries.
+
+  SORT_ON_FREQ
+  makes the automaton smaller (independently of JOIN_PAIRS). It
+  works by sorting the arcs on frequency. Note that this changes the
+  order of words in the automaton. If DESCENDING is not set, it can make
+  the resulting automaton built with -O faster.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_build, fsa_ubuild.
+  Affects: fsa_build, fsa_ubuild, fsa_prefix, fsa_hash.
+  When to use: always except for cases when you build something huge in
+               real time.
+
+1.2.4. Options affecting the way guessing automata (index a tergo) are built.
+
+  A_TERGO
+  enables -X option in fsa_build. This creates an index a tergo (a
+  guessing automaton).
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_build, fsa_ubuild.
+  Affects: fsa_build, fsa_ubuild, fsa_guess.
+  When to use: if you use fsa_guess.
+
+  GENERALIZE
+  In fsa_build called with -X option, reduces the size of the automaton
+  while losing the advantage of always annotating correctly words that
+  are already in the dictionary. This option makes the automaton
+  smaller than PRUNE_ARCS.
+  Assumes: A_TERGO.
+  Excludes: PRUNE_ARCS.
+  Used in: fsa_build, fsa_ubuild.
+  Affects: fsa_build, fsa_ubuild, fsa_guess.
+  When to use: if you use fsa_guess for adding new words to a dictionary.
+
+  PRUNE_ARCS
+  launches additional pruning during guessing automaton (index a
+  tergo) creation. The resulting automaton will be smaller, and
+  predictions narrower (maybe more precise, but those less probable
+  may be missing). Automata produced with this option are larger than
+  with GENERALIZE.
+  Assumes: A_TERGO.
+  Excludes: GENERALIZE.
+  Used in: fsa_build, fsa_ubuild.
+  Affects: fsa_build, fsa_ubuild, fsa_guess.
+  When to use: if you use fsa_guess for tagging.
+
+1.2.5. Options affecting the way guessing automata are interpreted
+
+  GUESS_LEXEMES
+  makes fsa_guess try to guess not only categories, but lexemes as
+  well. The data must be prepared differently (see man pages for
+  fsa_build and fsa_guess). Run-time option -g switches off guessing
+  lexemes.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_guess.
+  Affects: fsa_guess.
+  When to use: if you use fsa_guess for more tasks than tagging.
+
+  GUESS_MMORPH
+  makes it possible to use -m option in fsa_guess, i.e. prediction of
+  mmorph descriptions. mmorph is a morphology program developed at
+  ISSCO, Geneva.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_guess.
+  Affects: fsa_guess.
+  When to use: for using fsa_guess in acquisition of new words for a
+	       morphological dictionary.
+
+  GUESS_PREFIX
+  makes fsa_guess use information about prefixes to disambiguate
+  morphological parses. Requires GUESS_LEXEMES. Data must be prepared
+  differently (see man pages for fsa_build and fsa_guess). Reduces the
+  size of the a tergo dictionary compared with that created to be used
+  with GUESS_LEXEMES only. Run-time option -p switches off the use of
+  prefixes in guessing.
+  Assumes: GUESS_LEXEMES.
+  Excludes: no options.
+  Used in: fsa_guess.
+  Affects: fsa_guess.
+  When to use: when you use fsa_guess, and the language you are
+	       working on has prefixes or infixes.
+
+1.2.6. Options changing the way morphological automata are interpreted
+
+  MORPH_INFIX
+  makes it possible to use -P and -I options that interpret coded
+  prefixes (-P), and coded prefixes and infixes (-I) in fsa_morph.
+  For more details, see README file, and the man page for fsa_morph(5).
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_morph.
+  Affects: fsa_morph.
+  When to use: when you use fsa_morph, and the language you are
+	       working on has prefixes or infixes.
+
+  POOR_MORPH
+  makes it possible to use -A option, so that the automata can contain
+  only information about categories, and no information about the base
+  form of an inflected form.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_morph.
+  Affects: fsa_morph.
+  When to use: if you use fsa_morph only for tagging.
+
+1.2.7. Various options.
+
+  CASECONV
+  works with fsa_spell. It makes it possible to check capitalized words
+  as if they were all lowercase.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_accent, fsa_morph, fsa_spell.
+  Affects: fsa_accent, fsa_morph, fsa_spell.
+  When to use: when case conversion is needed.
+
+  CHCLASS
+  makes it possible to treat certain two-letter sequences in certain
+  contexts as if they were single letters. This is useful in
+  spelling. E.g. in Polish, `rz' and `z' with a dot above (\.z in TeX)
+  are pronounced in exactly the same way, so they may be confused. This
+  option makes it possible to treat such replacements as if they were
+  one edit distance unit apart from each other. This option is used in
+  fsa_spell.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_spell.
+  Affects: fsa_spell.
+  When to use: for spelling correction in languages for which edit
+	       distance one is not sufficient.
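The idea of counting a two-letter sequence and its sound-alike single letter as one edit can be sketched as an extra case in the usual edit distance table. This is an illustration only, not the code of fsa_spell: CharClass and class_edit_distance are hypothetical names, the context restrictions mentioned above are ignored, and UTF-32 strings are assumed so that z with a dot above is a single character.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One hypothetical character class: a two-letter sequence and the single
// letter that may be confused with it (for example "rz" and z with a dot).
struct CharClass {
    std::u32string pair;
    char32_t single;
};

// Edit distance in which replacing a class pair by its single letter
// (or the other way round) costs one unit instead of two.
std::size_t class_edit_distance(const std::u32string& a,
                                const std::u32string& b,
                                const std::vector<CharClass>& classes) {
    const std::size_t n = a.size(), m = b.size();
    std::vector<std::vector<std::size_t>> d(n + 1, std::vector<std::size_t>(m + 1));
    for (std::size_t i = 0; i <= n; ++i) d[i][0] = i;
    for (std::size_t j = 0; j <= m; ++j) d[0][j] = j;
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            // Ordinary deletion, insertion and substitution.
            std::size_t best = std::min({d[i - 1][j] + 1, d[i][j - 1] + 1,
                                         d[i - 1][j - 1] + (a[i - 1] == b[j - 1] ? 0u : 1u)});
            for (const CharClass& c : classes) {
                assert(c.pair.size() == 2);
                // Two letters in a correspond to one letter in b.
                if (i >= 2 && a[i - 2] == c.pair[0] && a[i - 1] == c.pair[1] &&
                    b[j - 1] == c.single)
                    best = std::min(best, d[i - 2][j - 1] + 1);
                // One letter in a corresponds to two letters in b.
                if (j >= 2 && b[j - 2] == c.pair[0] && b[j - 1] == c.pair[1] &&
                    a[i - 1] == c.single)
                    best = std::min(best, d[i - 1][j - 2] + 1);
            }
            d[i][j] = best;
        }
    }
    return d[n][m];
}
```

With the single class { U"rz", U'\u017C' }, the words "morze" and "mo\u017Ce" are one unit apart; with an empty class list the same pair is two units apart, so a search limited to edit distance one would miss it.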
+
+  DEBUG
+  If you have a few spare months, you can compile the programs with
+  CFLAGS=-DDEBUG. That will give huge amounts of information about program
+  internals during execution time. It may also give compile errors. In
+  debugging the program, I just comment out particular ifdefs.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: all programs.
+  Affects: all programs.
+  When to use: never.
+
+  DUMP_ALL
+  works with fsa_prefix. If you compile the program with this option,
+  no space will be prepended to listed entries. In particular, this
+  can list the contents of the dictionary without the need to remove
+  the leading space. Use -a run-time option to list the contents.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_prefix.
+  Affects: fsa_prefix.
+  When to use: to list the contents of a dictionary.
+
+  LOOSING_RPM
+  makes it possible to use the programs even on linux distributions
+  using rpms. The libstdc++ distributed with RedHat and SuSE has
+  broken I/O. You probably do not need to use that option with more stable
+  distributions. This option does not fix the -O2 problem,
+  however. You will still have to use -O only.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: all programs.
+  Affects: all programs.
+  When to use: with corrupted versions of libg++, e.g. Red Hat and SuSE.
+
+  PROGRESS
+  In fsa_build, shows how many lines have been read so far, and what is
+  being done at the moment, i.e. what phase the processing is in.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_build, fsa_ubuild.
+  Affects: fsa_build, fsa_ubuild.
+  When to use: when you build something huge and you are not sure if
+	       it works.
+
+  RUNON_WORDS
+  makes it possible to check whether inserting a space inside the
+  checked word produces two correct words. This works with fsa_spell.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_spell.
+  Affects: fsa_spell.
+  When to use: for spellchecking.
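The run-on check described above amounts to trying every split point of the word and looking both halves up. The sketch below assumes a plain std::set as the dictionary, which is only a stand-in for the automaton lookup that fsa_spell actually performs; runon_splits is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Returns every way of splitting `word` into two non-empty parts that
// are both present in `dict`.
std::vector<std::pair<std::string, std::string>>
runon_splits(const std::string& word, const std::set<std::string>& dict) {
    std::vector<std::pair<std::string, std::string>> result;
    for (std::size_t i = 1; i < word.size(); ++i) {
        const std::string left = word.substr(0, i);
        const std::string right = word.substr(i);
        // The two halves always cover the whole word.
        assert(left.size() + right.size() == word.size());
        if (dict.count(left) && dict.count(right))
            result.emplace_back(left, right);
    }
    return result;
}
```

For a dictionary holding "in", "to", "into", "side" and "inside", the word "inside" yields the single split "in" + "side".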
+
+  SHOW_FILLERS
+  enables printing of filler characters by fsa_prefix (they are normally
+  not printed).
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_prefix.
+  Affects: fsa_prefix.
+  When to use: for diagnostics.
+
+  SLOW_SPARSE
+  checks for every hole in a sparse matrix whether it can still be filled,
+  which could lead to smaller automata. This slows down construction
+  process for large automata by orders of magnitude.
+  Assumes: FLEXIBLE, STOPBIT, SPARSE.
+  Excludes: WEIGHTED, LARGE_DICTIONARIES.
+  Used in: fsa_build, fsa_ubuild.
+  Affects: fsa_build, fsa_ubuild.
+  When to use: If you think you waste too many transitions in a sparse
+	       matrix in small automata.
+
+  STATISTICS
+  In fsa_build, shows some statistics on the resulting automaton: the
+  number of states, transitions, etc.
+  Assumes: no options.
+  Excludes: no options.
+  Used in: fsa_build, fsa_ubuild.
+  Affects: fsa_build, fsa_ubuild.
+  When to use: when you are interested in properties of automata.
+
+2. CONSTANTS
+
+  Max_word_len
+  Defined in: common.h
+  Default value: 120.
+  Affects: All programs except fsa_build and fsa_ubuild.
+  Description:
+  Restrictions: Must be positive.
+
+  LIST_INIT_SIZE
+  Defined in: common.h
+  Default value: 16.
+  Affects: All programs except fsa_build and fsa_ubuild.
+  Description: Initial size of a list, e.g. list of replacements, list
+	       of dictionary names etc. The bigger, the faster.
+  Restrictions: Must be positive.
+
+  LIST_STEP_SIZE
+  Defined in: common.h
+  Default value: 8.
+  Affects: All programs except fsa_build and fsa_ubuild.
+  Description: If a list grows beyond LIST_INIT_SIZE, its size is
+	       increased by this value. The bigger, the faster.
+  Restrictions: Must be positive.
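Why "the bigger, the faster" holds for LIST_INIT_SIZE and LIST_STEP_SIZE can be seen by counting how often a list that grows by a fixed step has to be enlarged. The functions below are a hypothetical sketch of that growth rule, not the list code from common.h.

```cpp
#include <cassert>
#include <cstddef>

// Capacity a list ends up with after growing from `init` in steps of
// `step` until it can hold `items` elements.
std::size_t capacity_for(std::size_t items, std::size_t init, std::size_t step) {
    assert(init > 0 && step > 0);  // both constants must be positive
    std::size_t capacity = init;
    while (capacity < items) capacity += step;
    return capacity;
}

// Number of times the list had to be enlarged on the way there.
std::size_t reallocations_for(std::size_t items, std::size_t init, std::size_t step) {
    assert(init > 0 && step > 0);
    if (items <= init) return 0;
    return (items - init + step - 1) / step;
}
```

With the default values 16 and 8, a list of 100 items is enlarged 11 times; with the assumed values 64 and 32 it is enlarged only twice, at the cost of more unused space in short lists.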
+
+  MAX_ARCS_PER_NODE
+  Defined in: fsa.h
+  Default value: 255 or 128, depending on compile options.
+  Affects: All programs.
+  Description: Maximal number of outgoing transitions per state. Do
+	       not change.
+  Restrictions: Depends on the structure of states and transitions. Do
+		not change.
+
+  MAX_NOT_CYCLE
+  Defined in: common.h
+  Default value: 1024.
+  Affects:
+  Description: Maximal length of a string in the automaton. It is used
+	       to detect errors.
+  Restrictions: Must be positive.
+
+  MAX_VANITY_LEVEL
+  Defined in: guess.h
+  Default value: 5.
+  Affects: fsa_guess.
+  Description:
+  Restrictions:
+
+  PAIR_REG_LEN
+  Defined in: nindex.h
+  Default value: 32.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  MAX_SPARSE_WAIT
+  Defined in: nnode.h
+  Default value: 3.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  Max_edit_distance
+  Defined in: spell.h
+  Default value: 3.
+  Affects: fsa_spell.
+  Description:
+  Restrictions:
+
+  WORD_BUFFER_LENGTH
+  Defined in: build_fsa.cc
+  Default value: 128.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  UNREDUCIBLE
+  Defined in: mkindex.cc
+  Default value: 4.
+  Affects:
+  Description:
+  Restrictions:
+
+  WITH_ANNOT
+  Defined in: mkindex.cc
+  Default value: 2.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions: Do not change.
+
+  NO_ANNOT
+  Defined in: mkindex.cc
+  Default value: 1.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions: Do not change.
+
+  NODE_TO_BE_REDUCED
+  Defined in: mkindex.cc
+  Default value: -5.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions: Do not change.
+
+  NODE_UNREDUCIBLE
+  Defined in: mkindex.cc
+  Default value: -6.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions: Do not change.
+
+  NODE_IN_TAGS
+  Defined in: mkindex.cc
+  Default value: -7.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions: Do not change.
+
+  NODE_MERGED
+  Defined in: mkindex.cc
+  Default value: -8.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions: Do not change.
+
+  NODE_TO_BE_MERGED
+  Defined in: mkindex.cc
+  Default value: -9.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions: Do not change.
+
+  MIN_PRUNE
+  Defined in: mkindex.cc
+  Default value: 2.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  MAX_DESTS
+  Defined in: mkindex.cc
+  Default value: 32.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  MIN_DESTS_MEMBERS
+  Defined in: mkindex.cc
+  Default value: 0.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  MAX_ANNOTS
+  Defined in: mkindex.cc
+  Default value: 20.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  MAX_DIFF_ANNOTS
+  Defined in: mkindex.cc
+  Default value: 20.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  MIN_KIDS_TO_MERGE
+  Defined in: mkindex.cc
+  Default value: 2.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  MIN_ANNOTS
+  Defined in: mkindex.cc
+  Default value: 3.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  AN_NOM
+  Defined in: mkindex.cc
+  Default value: 1.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  AN_DENOM
+  Defined in: mkindex.cc
+  Default value: 2.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+  INDEX_SIZE_STEP
+  Defined in: nindex.cc and nnode.cc.
+  Default value: 16.
+  Affects: fsa_build and fsa_ubuild.
+  Description:
+  Restrictions:
+
+
+3. INSTALLATION
+
+  Copy all dictionaries you want to be installed into the source directory
+  of the package. Dictionaries are provided separately, so make sure you
+  have copied them (at least those you may need). Note that the
+  dictionaries on http://www.pg.gda.pl/~jandac/fsa.html were prepared
+  a long time ago using only those options that were available at
+  that time. If you want to use them, compile the program as you
+  like and try to use the dictionaries. You will probably get an
+  error message saying what compile options were used for compilation
+  of the fsa_build that constructed the dictionaries. In that case:
+  save Makefile, delete some options from it, make clean, make, use
+  fsa_prefix to get the contents, restore Makefile, make clean, make,
+  and build the automata again (they should be much smaller with the
+  default set of options for the current version).
+
+3.1. Admin part
+
+  For Polish users, you may look at pl.chcl file and uncomment some
+  lines, if too many users watch too much tv, and read too little.
+
+  There are a few variables in Makefile that you can change. These are
+  PREFIXDIR - parent dir of BINDIR, MANDIR, DICTDIR (default: /usr/local);
+  BINDIR    - where the programs should be placed (default: $PREFIXDIR/bin);
+  MANDIR    - where the man pages should be placed (default:
+	      $PREFIXDIR/man);
+  DICTDIR   - where dictionaries, accent files, language files, and
+	      character class files should be placed (default:
+	      $PREFIXDIR/lib);
+  LISPDIR   - where jspell.el should be placed. I think the site-lisp
+	      directory is better than lisp directory. Check your emacs
+	      version as it normally forms a part of that name.
+	      The directory specified in Makefile by default will
+	      probably not work for you;
+  TCLMACQDIR- where files supporting execution of the tcl/tk interface
+	      tclmacq should go (help file and language file);
+  TCLMACQBINDIR
+	    - where tcl scripts and perl scripts supporting tclmacq
+	      should go (it should be the same as BINDIR);
+  PREP_FCONF- It should be set to \# for Tcl versions prior to 8.2 (I
+	      think), and to nothing for 8.2 and higher. If man
+	      fconfigure shows -encoding option present, then the
+	      variable should be empty, otherwise it should be set to \#
+  MANSECT   - in which section of the manual the pages should be placed.
+
+  You can specify those variables on the command line, e.g.:
+
+  make installlisp LISPDIR=/utl/share/gnu/emacs/site-lisp
+
+  make install	       - installs everything,
+  make installbin      - installs the binaries without man pages,
+  make installman      - installs the manpages,
+  make installscripts  - installs interface scripts (jspell & jaccent),
+  make installlisp     - installs jspell.el (byte compile it afterward),
+  make installdicts    - installs dictionaries (if any), accent files,
+		         language files, character class files.
+
+  Note that with newer linux emacs distributions, the LISPDIR should
+  point to something like /etc/emacs/site-start.d, and the file names
+  should have a prefix `50'. If you put the jspell.el there, it will
+  be loaded automatically, so that you will not need (require 'jspell)
+  in your .emacs file.
+
+3.2. Admin or user part
+
+  The following commands can be put either in site-start.el file in
+  the site-lisp directory, or in users' .emacs files. The first method
+  makes the package functions available to all users, and it should be
+  done by the administrator. The second method enables the functions
+  on a per-user basis. Note that the emacs functions used to work with
+  emacs19; they will probably (almost certainly) not work with emacs20.
+
+  ;; make functions known to emacs
+  (require 'jspell)
+
+  ;; install menus
+  (define-key-after
+    (lookup-key global-map [menu-bar edit])
+    [jspell] '("Jspell" . jspell-menu-map) 'ispell)
+
+  You may want to specify the default dictionary with e.g.:
+
+  (setq jspell-dictionary "polski")
+
+  You may want to compile additional dictionaries. Read the README
+  file and the man page for fsa_build. Remember to sort the data and
+  exclude duplicates (use sort -u) for fsa_build.
+
+  New perl and tcl/tk scripts: sortatt.pl and tclmacq.tcl require
+  setting some variables located at the top of those files.
+
+3.3. Mostly user part
+
+  The following variables can be changed by the user in their .emacs
+  file:
+
+  jmorph-format		- defines the format of morphotactic annotations
+			  (tags). It is an argument to format function
+			  (resembles C format in printf). It contains 3
+			  %s, corresponding to the inflected word,
+			  lexeme, and tag. The correspondence between
+			  those items and particular %s is given by the
+			  variable jmorph-order. Morphotactic
+			  annotations are added by jmorph-* functions.
+			  Example: (setq jmorph-format "%s_%s+%s").
+
+  jmorph-order		- defines the correspondence between the
+			  inflected word, lexeme, and annotations and %s
+			  in jmorph-format; in other words, it defines
+			  the order in which they appear as %s in
+			  jmorph-format. Example: (setq jmorph-order '(1
+			  2 3)).
+			  
+  jspell-morph-sep	- defines a separator character that separates a
+			  lexeme from annotations in the output from
+			  jmorph script. Example: (setq jspell-morph-sep
+			  "&")
+
+  jaccent-automatically	- accents are restored without asking the user
+			  for permission if there is only one choice.
+			  Example: (setq jaccent-automatically t)
+
+  jmorph-automatically	- morphotactic annotations are added without
+			  asking the user for permission if there is
+			  only one choice. Example: (setq
+			  jmorph-automatically nil).
+

Added: fsa/Makefile
===================================================================
--- fsa/Makefile	                        (rev 0)
+++ fsa/Makefile	2009-03-27 08:14:42 UTC (rev 154)
@@ -0,0 +1,369 @@
+# Makefile for building a finite state automaton
+# Copyright (c) Jan Daciuk <ja...@pg...>, 1996, 1997, 1998, 1999
+#
+# The most difficult parts written by Dominique Petitpierre
+
+# These define i/o behaviour of programs
+TEXT_IO = one_word_io.o	# texts as input, grep -like output
+WORD_IO = one_word_io.o	# one word per line input
+
+
+# Installation program
+INSTALL = cp -i
+
+# C++ compiler
+CXX=g++
+
+# Compile options (see the file INSTALL for detail)
+# A_TERGO	- include code to build an index a tergo (recognizing word
+#		  categories)
+# CASECONV	- the first letter in spellchecking may be uppercase - check
+#		  both upper & lower
+# CHCLASS	- checks if a string is replaced with another string that
+#		  sounds similar; in the present form, this checks one-letter
+#		  strings against two-letter strings, and vice versa
+# DEBUG		- produces huge amounts of useless data
+# DESCENDING	- produces a bit smaller, but much slower automata
+# DUMP_ALL	- does not print the leading space in fsa_prefix
+# FLEXIBLE	- arc size should be adapted to automaton size; better
+#		  compression, (slightly) less speed, architecture independence
+# GENERALIZE	- used with A_TERGO to reduce the size of the guessing
+#		  automaton, and to increase recall
+# GUESS_LEXEMES	- tries to guess not only tags, but lexemes as well
+#		  in fsa_guess
+# GUESS_MMORPH	- makes it possible to use -m option in fsa_guess to predict
+#		  morphological descriptions of lexemes corresponding to
+#		  unknown inflected words; the descriptions are in the format
+#		  of mmorph - MULTEXT morphology tool developed at ISSCO.
+# GUESS_PREFIX	- tries to include information about prefixes to disambiguate
+#		  morphological parses in fsa_guess
+# JOIN_PAIRS	- used to prune the automaton (arcs share memory) with -X
+#		  option in fsa_build
+# LARGE_DICTIONARIES
+#		- to build big but a little bit faster automata (do not use it)
+# LOOSING_RPM	- to work around a bug in rpm libstdc++ library
+# MORE_COMPR	- to build smaller automata more slowly
+# MORPH_INFIX	- makes it possible to use -I and -P options in fsa_morph
+#		  for recognition of coded prefixes and infixes
+# NEXTBIT	- changes the format of the automaton, so that when there are
+#		  chains of nodes, one following another, one bit is set
+#		  in the goto field to indicate that fact, and only one byte
+#		  from the goto field is used; it usually gives smaller
+#		  automata
+# NUMBERS	- it is possible to use fsa_hash and build dictionaries for
+#		  perfect hashing
+# POOR_MORPH	- enables -A option in fsa_morph for morphological analysis
+#		  giving only categories, and no base forms.
+# PROGRESS	- shows how many lines were read, what fsa_build does
+# PRUNE_ARCS	- used with A_TERGO to reduce the size of the guessing
+#		  automaton, and to increase precision
+# RUNON_WORDS	- checks whether inserting a space inside the word results
+#		  in two correct words in fsa_spell
+# SHOW_FILLERS	- the filler character should be displayed in fsa_prefix
+# SORT_ON_FREQ	- arcs should be sorted on frequency (better compression)
+# SLOW_SPARSE	- try to fill every hole in sparse matrix representation
+# SPARSE	- use sparse matrix representation
+# STATISTICS	- shows some statistics after having built an automaton
+# STOPBIT	- changes the format of the automaton, so that there are
+#		  no counters, but for each arc there is a bit that says
+#		  whether it is the last one in the node; this gives smaller
+#		  automata
+# TAILS		- changes the format of the automaton, allowing for more
+#		  arc sharing, so more compression at the cost of construction
+#		  time
+# WEIGHTED	- introduces weights on arcs for guessing automata.
+#
+# PRUNE_ARCS works only with A_TERGO
+# GUESS_LEXEMES works only with A_TERGO
+# LARGE_DICTIONARIES and FLEXIBLE cannot be specified together
+# NUMBERS works only with FLEXIBLE
+# STOPBIT works only with FLEXIBLE
+# (use FLEXIBLE)
+#
+# See INSTALL file for info on compile options.
+#
+# Some versions of g++ (or libstdc++) are broken - if so, don't use -O2!
+# !!! If you change these, please do make clean first before each make
+CPPFLAGS=-O2 --pedantic -Wall \
+  -DFLEXIBLE  \
+  -DNUMBERS \
+  -DA_TERGO \
+  -DGENERALIZE \
+  -DSORT_ON_FREQ \
+  -DSHOW_FILLERS  \
+  -DSTOPBIT \
+  -DNEXTBIT \
+  -DMORE_COMPR \
+  -DCASECONV \
+  -DRUNON_WORDS \
+  -DMORPH_INFIX \
+  -DPOOR_MORPH \
+  -DCHCLASS \
+  -DGUESS_LEXEMES -DGUESS_PREFIX \
+  -DGUESS_MMORPH \
+  -DDUMP_ALL \
+  -DSTATISTICS \
+  -DPROGRESS \
+  -DLOOSING_RPM #-DDMALLOC
+
+
+
+
+# -pg
+
+#  -DTAILS \
+#  -DJOIN_PAIRS \
+#  -DPRUNE_ARCS \
+#  -DPROGRESS \
+#  -DWEIGHTED \
+#  -DSTATISTICS \
+#  -DSPARSE \
+
+# Normally empty
+#LDFLAGS=-L/usr/local/lib -ldmallocxx
+LDFLAGS=
+
+# Install directories
+PREFIXDIR = /usr/local
+
+# this is where fsa_build, fsa_spell, etc. should go
+BINDIR = ${PREFIXDIR}/bin
+# this is where the manuals should be kept
+MANDIR = ${PREFIXDIR}/man
+# this is where the dictionaries should go; also accent and language files
+DICTDIR = ${PREFIXDIR}/lib
+# this is where emacs lisp files go
+LISPDIR = /usr/lib/emacs/site-lisp
+# this is where tcl scripts go (also perl scripts used in tclmacq)
+TCLMACQBINDIR = ${BINDIR}
+# this is where tclmacq support files (help, language) go
+TCLMACQDIR = ${PREFIXDIR}/lib
+# The following should be empty if man fconfigure shows -encoding option,
+# and set to \# otherwise. In other words, if your Tcl version is 8.0,
+# you should set it to \#, and if it is 8.2 or higher -- leave it empty.
+PREP_FCONF = \#
+#PREP_FCONF
+# to which man section man pages for fsa belong
+MANSECT1 = 1
+MANSECT5 = 5
+
+########################################################################
+
+# Objects that make particular programs
+SPELL_OBJECTS = common.o spell.o nstr.o ${TEXT_IO} spell_main.o
+ACCENT_OBJECTS = common.o nstr.o ${TEXT_IO} accent_main.o accent.o
+FSA_B_OBJECTS = build_fsa.o nnode.o nindex.o nstr.o
+FSA_S_OBJECTS = builds_fsa.o snode.o
+FSA_U_OBJECTS = buildu_fsa.o unode.o
+PREFIX_OBJECTS = common.o nstr.o one_word_io.o prefix.o prefix_main.o
+GUESS_OBJECTS = common.o nstr.o ${TEXT_IO} guess.o guess_main.o
+HASH_OBJECTS =  common.o nstr.o ${TEXT_IO} hash.o hash_main.o
+MORPH_OBJECTS = common.o nstr.o ${TEXT_IO} morph.o morph_main.o
+VISUAL_OBJECTS = common.o nstr.o ${TEXT_IO} visualize.o visual_main.o
+ALL_PROGS = fsa_spell fsa_build fsa_accent fsa_prefix fsa_guess fsa_hash \
+ fsa_morph fsa_ubuild fsa_visual
+SKL_SCRIPTS = jspell jaccent jmorph jguess
+TCL_SCRIPTS = tclmacq.tcl filesel.tcl
+ALL_SCRIPTS = ${SKL_SCRIPTS} chkmorph.pl deguess.pl demorph.pl \
+ find_irregular.pl gendata.pl mmorph23c.pl morph_data.pl morph_infix.pl \
+ morph_prefix.pl prep_atg.pl prep_ati.pl prep_atl.pl prep_atp.pl \
+ putinplace.pl simplify.pl sortatt.pl sortondesc.pl tclmacq.tcl filesel.tcl
+# Note that awk scripts are not portable
+AWK_SCRIPTS = de_morph_data.awk de_morph_infix.awk deguess.awk demorph.awk \
+ find_irregular.awk mmorph23c.awk morph_data.awk morph_infix.awk \
+ morph_prefix.awk prep_atg.awk prep_ati.awk prep_atl.awk prep_atp.awk
+TCL_SUPP_FILES = tclmacq-help.txt tclmacq-lang.txt
+
+ALL_OBJ = common.o spell.o nstr.o spell_main.o \
+ accent_main.o accent.o build_fsa.o nnode.o nindex.o prefix.o prefix_main.o \
+ guess.o guess_main.o hash.o hash_main.o morph.o morph_main.o builds_fsa.o \
+ buildu_fsa.o unode.o snode.o visualize.o visual_main.o
+
+
+all: ${ALL_PROGS}
+
+
+fsa_spell: ${SPELL_OBJECTS}
+	${CXX} ${CPPFLAGS} ${SPELL_OBJECTS} ${LDFLAGS} -o fsa_spell
+
+fsa_accent: ${ACCENT_OBJECTS}
+	${CXX} ${CPPFLAGS} ${ACCENT_OBJECTS} ${LDFLAGS} -o fsa_accent
+
+fsa_build: ${FSA_B_OBJECTS} ${FSA_S_OBJECTS}
+	${CXX} ${CPPFLAGS} ${FSA_B_OBJECTS} ${FSA_S_OBJECTS} ${LDFLAGS} -o fsa_build
+
+fsa_ubuild: ${FSA_B_OBJECTS} ${FSA_U_OBJECTS}
+	${CXX} ${CPPFLAGS} ${FSA_B_OBJECTS} ${FSA_U_OBJECTS} ${LDFLAGS} -o fsa_ubuild
+
+
+fsa_prefix: ${PREFIX_OBJECTS}
+	${CXX} ${CPPFLAGS} ${PREFIX_OBJECTS} ${LDFLAGS} -o fsa_prefix
+
+fsa_guess: ${GUESS_OBJECTS}
+	${CXX} ${CPPFLAGS} ${GUESS_OBJECTS} ${LDFLAGS} -o fsa_guess
+
+fsa_hash: ${HASH_OBJECTS}
+	${CXX} ${CPPFLAGS} ${HASH_OBJECTS} ${LDFLAGS} -o fsa_hash
+
+fsa_morph: ${MORPH_OBJECTS}
+	${CXX} ${CPPFLAGS} ${MORPH_OBJECTS} ${LDFLAGS} -o fsa_morph
+
+fsa_visual: ${VISUAL_OBJECTS}
+	${CXX} ${CPPFLAGS} ${VISUAL_OBJECTS} ${LDFLAGS} -o fsa_visual
+
+fsa_dump: dump.cc
+	${CXX} ${CPPFLAGS} dump.cc ${LDFLAGS} -o fsa_dump
+
+common.o: common.cc fsa.h nstr.h common.h
+	${CXX} ${CPPFLAGS} -c common.cc
+
+spell.o: spell.cc fsa.h nstr.h spell.h common.h
+	${CXX} ${CPPFLAGS} -c spell.cc
+
+nstr.o:	nstr.cc nstr.h
+	${CXX} ${CPPFLAGS} -c nstr.cc
+
+build_fsa.o: build_fsa.cc nnode.h nindex.h nstr.h fsa.h fsa_version.h mkindex.cc
+	${CXX} ${CPPFLAGS} -c build_fsa.cc
+
+builds_fsa.o: builds_fsa.cc nnode.h nindex.h nstr.h fsa.h fsa_version.h mkindex.cc compile_options.h
+	${CXX} ${CPPFLAGS} -c builds_fsa.cc
+
+buildu_fsa.o: buildu_fsa.cc nnode.h unode.h nindex.h nstr.h fsa.h fsa_version.h mkindex.cc compile_options.h
+	${CXX} ${CPPFLAGS} -c buildu_fsa.cc
+
+nnode.o: nnode.cc nnode.h nstr.h fsa.h nindex.h
+	${CXX} ${CPPFLAGS} -c nnode.cc
+
+unode.o: unode.cc unode.h nnode.h nstr.h fsa.h nindex.h
+	${CXX} ${CPPFLAGS} -c unode.cc
+
+snode.o: snode.cc nnode.h nstr.h fsa.h nindex.h
+	${CXX} ${CPPFLAGS} -c snode.cc
+
+nindex.o: nindex.cc nindex.h nnode.h
+	${CXX} ${CPPFLAGS} -c nindex.cc
+
+one_word_io.o: one_word_io.cc fsa.h common.h
+	${CXX} ${CPPFLAGS} -c one_word_io.cc
+
+text_io.o: text_io.cc common.h fsa.h
+	${CXX} ${CPPFLAGS} -c text_io.cc
+
+spell_main.o: spell_main.cc common.h spell.h fsa_version.h compile_options.h
+	${CXX} ${CPPFLAGS} -c spell_main.cc
+
+accent_main.o: accent_main.cc comm...
 
[truncated message content]
