mwetoolkit
THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/
...These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc.
Although it focuses on multiword expressions, the framework is quite complete and can also be useful in any corpus-based study in computational linguistics.
The mwetoolkit can be applied to virtually any text collection, language, and MWE type. It is a command-line tool written mostly in Python. Development started in 2010 as part of a PhD thesis, and the project remains active (see the SVN logs).
...