Tree [219251] master /

File                 Date        Author          Commit
libstemmer_c         2010-01-18  Jan Wielemaker  [113190] Merge branch 'master' of ec:/home/pl/git/pl-devel
.cvsignore           2005-11-26  Jan Wielemaker  [36f920] * Added first version of porter stem
.gitignore           2010-12-15  Jan Wielemaker  [219251] Updated .gitignore
ChangeLog            2009-07-23  Jan Wielemaker  [b2e43f] Preparing version 5.7.12
Makefile.in          2010-02-21  Jan Wielemaker  [3c02af] Deleted DESTDIR= from all package Makefiles
Makefile.mak         2010-01-16  Jan Wielemaker  [a7a402] Added clean for snowball
README               2010-01-22  Jan Wielemaker  [839b68] Work around Debian/Ubuntu issue in handling run...
configure.in         2010-01-31  Jan Wielemaker  [794537] Share configure skeleton for nlp
double_metaphone.c   2009-03-19  Jan Wielemaker  [a8be17] CLEANUP: Removed all trailing whitespace from a...
double_metaphone.h   2009-03-19  Jan Wielemaker  [a8be17] CLEANUP: Removed all trailing whitespace from a...
double_metaphone.pl  2010-10-12  Jan Wielemaker  [68132a] Add documentation
install-sh           2005-11-25  Jan Wielemaker  [90e42a] * Started NLP support routines package
nlp.doc              2010-12-08  Jan Wielemaker  [533604] ADDED: tokenize_atom/2 now supports wide-charac...
pltotex.pl           2010-01-16  Jan Wielemaker  [5bd2ac] Add Snowball documentation
porter_stem.c        2010-12-08  Jan Wielemaker  [1808e8] Note on how to enhance unaccent
porter_stem.pl       2009-07-21  Jan Wielemaker  [9d71cc] MODIFIED: Make initialization/1 ISO compliant
snowball.c           2010-01-16  Jan Wielemaker  [cd7282] Avoid broken pthread_once on Windows
snowball.pl          2010-01-16  Jan Wielemaker  [826b19] Fix various details to make the snowball stemme...
test.pl              2007-01-16  Jan Wielemaker  [f77c88] * Added test suite

Read Me

Other resources that are worthwhile to add to this package:

    * http://www.ling.helsinki.fi/kieliteknologia/tutkimus/hfst/
    * Full Unicode diacritics removal
      This is more complicated.  We need the four normalization forms
      (NFC, NFD, NFKC and NFKD) as described in the Wikipedia article
      on Unicode equivalence.  References:

	- http://en.wikipedia.org/wiki/Unicode_equivalence
	- http://www.opensource.apple.com/source/gcc/gcc-5646/libcpp/makeucnid.c
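
To illustrate what "full Unicode diacritics removal" could look like, here
is a minimal sketch in Python.  It is not code from this package, and the
function name strip_diacritics and its compat parameter are hypothetical.
It relies on the standard unicodedata module: decompose the text (canonical
or compatibility decomposition), drop the combining marks, then recompose.

```python
import unicodedata


def strip_diacritics(text, compat=False):
    """Remove combining marks from text (illustrative sketch only).

    compat=False uses canonical decomposition (NFD) and recomposes with NFC.
    compat=True uses compatibility decomposition (NFKD) and recomposes with
    NFKC, which also folds ligatures and similar compatibility characters.
    """
    decomposed = unicodedata.normalize("NFKD" if compat else "NFD", text)
    # Combining marks have a non-zero canonical combining class.
    stripped = "".join(ch for ch in decomposed
                       if unicodedata.combining(ch) == 0)
    return unicodedata.normalize("NFKC" if compat else "NFC", stripped)


if __name__ == "__main__":
    print(strip_diacritics("caf\u00e9"))
    print(strip_diacritics("\ufb01n", compat=True))
```

Letters that Unicode does not decompose, such as U+00F8 (o with stroke),
pass through unchanged, so a complete solution would still need an extra
mapping table for those.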