wordseg is a word segmentation module implemented in C#. It segments text into tokens and labels each token's attributes according to its context and semantics, using forward maximum matching and CRF (conditional random field) algorithms.
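
As an illustration of the dictionary-based half of this approach, here is a minimal Python sketch of forward maximum matching. It is not wordseg's actual C# code: the function name `forward_max_match`, the toy `lexicon`, and the single-character fallback are all assumptions made for this example, and the CRF step is not shown.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation (illustrative sketch only).

    At each position, take the longest lexicon entry that starts there;
    if nothing matches, emit a single character and move on.
    """
    if max_len is None:
        # Longest entry in the lexicon bounds the candidate window.
        max_len = max((len(word) for word in lexicon), default=1)

    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy lexicon; a real segmenter would load a large dictionary.
lexicon = {"百度公司", "百度", "公司", "名字", "源于"}
tokens = forward_max_match("百度公司的名字源于", lexicon)
print(" ".join(tokens))
```

Because the match is greedy, "百度公司" is chosen over the shorter entry "百度". A purely greedy matcher cannot resolve ambiguities that depend on context, which is presumably where the statistical model comes in.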

The following are some sentences to be segmented:
张晓晨和付仲恺一起坐在家(西坝河东里社区)里的沙发上看非诚勿扰。
百度公司的名字源于“众里寻他千百度”这诗句。

After wordseg segments the sentences above, the result for each sentence is as follows:
张晓晨[PER] 和 付仲恺[PER] 一起 坐 在 家 ( 西坝河东里社区[LOC] ) 里 的 沙发[PDT] 上 看 非 诚 勿扰 。
百度公司[ORG] 的 名字 源于 “ 众 里 寻 他 千百度 ” 这 诗句 。

In the output above, if a token has attributes, they are appended to the corresponding token within "[]".
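
For readers who want to consume this output programmatically, the following Python sketch splits a segmented line into (token, attribute) pairs. It assumes tokens are separated by whitespace and that an attribute is a run of uppercase letters in a trailing "[]", as in the sample output; the helper name `parse_segmented` is hypothetical and not part of wordseg.

```python
import re

# A token optionally followed by an uppercase attribute label in brackets,
# e.g. "百度公司[ORG]". This format is inferred from the sample output only.
TOKEN_RE = re.compile(r"^(?P<text>.+?)(?:\[(?P<attr>[A-Z]+)\])?$")


def parse_segmented(line):
    """Return a list of (token, attribute-or-None) pairs for one output line."""
    pairs = []
    for item in line.split():
        match = TOKEN_RE.match(item)
        pairs.append((match.group("text"), match.group("attr")))
    return pairs


pairs = parse_segmented("百度公司[ORG] 的 名字 源于")
for token, attr in pairs:
    print(token, attr)
```

Tokens without an attribute are returned with `None` in the second position.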

Since wordseg uses a statistical model to segment text by context, for the same substring in different contexts, dif

License

BSD License

Additional Project Details

Operating Systems

Windows

Languages

English, Chinese (Simplified)

Intended Audience

Information Technology, Science/Research, Education, Advanced End Users, Developers, Engineering

User Interface

Console/Terminal, Command-line

Programming Language

C#

Related Categories

C# Search Engines, C# Education Software, C# Linguistics Software

Registered

2011-05-14