This project has migrated to https://gitlab.com/mwetoolkit/mwetoolkit3/

The Multiword Expressions toolkit aids in the automatic identification and extraction of multiword units in running text. These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc.

Although it focuses on multiword expressions, the framework is fairly complete and can also be useful in any corpus-based study in computational linguistics.

The mwetoolkit can be applied to virtually any text collection, language, and MWE type. It is a command-line tool written mostly in Python. Its development started in 2010 as part of a PhD thesis, and the project remains active (see the SVN logs).

Up-to-date documentation and details about the tool can be found on the mwetoolkit website: http://mwetoolkit.sourceforge.net/

Features

  • Multi-level RegEx patterns
  • Large corpora support
  • Association measures
  • Token-based annotation
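
To illustrate what an association measure does, here is a minimal sketch that scores adjacent word pairs by pointwise mutual information (PMI), one common measure for ranking MWE candidates. It is not the mwetoolkit's own code or API: the function name `pmi_scores` and the `min_count` parameter are hypothetical, and the probability estimates are simple relative frequencies chosen for brevity.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only; not part of the mwetoolkit.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)  # avoid division by zero on tiny inputs

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Rare pairs give unreliable PMI estimates, so skip them.
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores
```

Calling `pmi_scores("the cable car stopped and the cable car left".split())` returns scores only for the pairs that occur at least twice. In a real corpus, a high PMI suggests that two words co-occur more often than chance, which is one signal that they may form a multiword unit such as "cable car".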

License

GNU General Public License version 3.0 (GPLv3)

User Reviews

  • Works fine, easy to use, and the documentation is clear.

Additional Project Details

Operating Systems

BSD, Cygwin, Linux, Mac

Languages

English

Intended Audience

Science/Research

User Interface

Command-line

Programming Language

C, Python, Unix Shell

Database Environment

Flat-file, XML-based

Registered

2010-04-08