From: Carlos R. <car...@gm...> - 2015-10-08 16:40:57
Dear all,

We are very happy to announce a fresh new release of the *mwetoolkit - version 1.1*

http://mwetoolkit.sourceforge.net/

Release 1.1, now hosted on Gitlab, has many improvements:

* Important bug fix in the C indexer and giga-word corpus support
* New amazing tools (grep.py, view.py, split.py,...)
* New supported corpus formats (RASP, PALAVRAS,...)
* More complete and thorough unit tests (patterns,...)
* Improved handling of extra word attributes
* Improved support for compressed files (.gz,...)
* Experimental: TextualPattern format for patterns (grep.py -e "...")
* General bug fixes, refactoring and improvements

Many thanks to all the users, anonymous and known, who reported bugs and suggested improvements.

To keep up to date, subscribe to the mailing list:

https://groups.google.com/d/forum/mwetoolkit

Enjoy mwetoolkit 1.1 :-)

Carlos and Silvio

----------------------------------

*The mwetoolkit is a set of Python scripts for processing corpora and automatically extracting multiword expressions. Although it focuses on multiword expressions, the toolkit is quite complete and can also be useful in any corpus-based study in computational linguistics. It is particularly useful for advanced searches, lexicon extraction and filtering on POS-tagged and/or dependency-parsed corpora, independently of language, domain, MWE type, etc.*