This project has migrated to https://gitlab.com/mwetoolkit/mwetoolkit3/

The Multiword Expressions toolkit aids in the automatic identification and extraction of multiword units in running text. These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc.

Although it focuses on multiword expressions, the framework is quite complete and can also be useful in any corpus-based study in computational linguistics.

The mwetoolkit can be applied to virtually any text collection, language, and MWE type. It is a command-line tool written mostly in Python. Its development started in 2010 as part of a PhD thesis, and the project remains active (see the SVN logs).

Up-to-date documentation and details about the tool can be found on the mwetoolkit website: http://mwetoolkit.sourceforge.net/

Features

  • Multi-level RegEx patterns
  • Large corpora support
  • Association measures
  • Token-based annotation
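
As an illustration of what an association measure computes, the sketch below scores adjacent word pairs with pointwise mutual information (PMI), one common measure of this kind. It is a minimal standalone example and not the mwetoolkit's own code or interface: the function name bigram_pmi and the toy sentence are assumptions made for illustration, and the measures the toolkit actually provides are described in its documentation.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI in bits} for every adjacent word pair.

    Hypothetical helper for illustration only; not part of mwetoolkit.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi          # observed probability of the pair
        p_w1 = unigrams[w1] / n_uni    # probability of the first word
        p_w2 = unigrams[w2] / n_uni    # probability of the second word
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy corpus (an assumption, not data from the project).
tokens = "the cable car stopped and the cable car left".split()
scores = bigram_pmi(tokens)

# List candidate pairs from the highest to the lowest score.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

A higher score means the two words co-occur more often than their individual frequencies would predict. PMI tends to overrate rare pairs, so in practice it is usually combined with a minimum frequency threshold.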

License

GNU General Public License version 3.0 (GPLv3)

User Reviews

  • Works fine, easy to use, and the documentation is clear.

Additional Project Details

Operating Systems

Cygwin, Linux, BSD, Mac

Languages

English

Intended Audience

Science/Research

User Interface

Command-line

Programming Language

Unix Shell, Python, C

Database Environment

XML-based, Flat-file

Registered

2010-04-08