THIS PROJECT HAS MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/

The Multiword Expressions toolkit aids in the automatic identification and extraction of multiword units in running text. These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc.

Although it focuses on multiword expressions, the framework is fairly comprehensive and can also be useful in any corpus-based study in computational linguistics.

The mwetoolkit can be applied to virtually any text collection, language, and MWE type. It is a command-line tool written mostly in Python. Development started in 2010 as part of a PhD thesis, and the project remains active (see the SVN logs).

Up-to-date documentation and details about the tool can be found on the mwetoolkit website: http://mwetoolkit.sourceforge.net/

Features

  • Multi-level RegEx patterns
  • Large corpora support
  • Association measures
  • Token-based annotation

License

GNU General Public License version 3.0 (GPLv3)

User Ratings

5 stars: 1
4 stars: 0
3 stars: 0
2 stars: 0
1 star: 0

User Reviews

  • Works fine, easy to use, and the documentation is clear.

Additional Project Details

Operating Systems

Cygwin, Linux, BSD, Mac

Languages

English

Intended Audience

Science/Research

User Interface

Command-line

Programming Language

Unix Shell, Python, C

Database Environment

XML-based, Flat-file

Related Categories

Artificial Intelligence Software, Linguistics Software, Command Line Tools

Registered

2010-04-08