Zhongkai Fu

The wordseg project is a word segmentation module implemented in C#. It segments text into tokens and labels each token's attributes according to its context and semantics, using forward maximum matching and CRF (conditional random field) algorithms.
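As a rough illustration of the forward maximum matching step, here is a minimal Python sketch. It is not wordseg's actual C# code: the function name `forward_max_match`, the `lexicon` set, and the `max_len` parameter are all assumptions made for this example, and the CRF stage is not shown. At each position the sketch takes the longest lexicon entry that matches and falls back to a single character when nothing matches.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation against a word list.

    `lexicon` is a hypothetical set of known words; wordseg's real
    dictionary format and lookup structure are not described in the source.
    """
    if max_len is None:
        # Longest lexicon entry bounds the window; never smaller than 1.
        max_len = max(1, max((len(w) for w in lexicon), default=1))

    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


if __name__ == "__main__":
    words = {"百度公司", "名字", "源于"}
    print(" ".join(forward_max_match("百度公司的名字源于", words)))
```

Pure dictionary matching cannot resolve ambiguous spans by itself, which is presumably why the project combines it with a statistical model.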

The following are some sentences to be segmented:

After these sentences are segmented by wordseg, the result for each sentence is as follows:
张晓晨[PER] 和 付仲恺[PER] 一起 坐 在 家 ( 西坝河东里社区[LOC] ) 里 的 沙发[PDT] 上 看 非 诚 勿扰 。
百度公司[ORG] 的 名字 源于 “ 众 里 寻 他 千百度 ” 这 诗句 。

In the results above, if a token has attributes, they are appended to the corresponding token inside "[]".
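The output format described above can be read back into (token, attribute) pairs with a short Python sketch. The helper name `parse_result_line` is hypothetical and not part of wordseg. The sketch assumes, based only on the sample output, that tokens are separated by whitespace and that an attribute is a run of uppercase letters in a trailing "[]".

```python
import re

# Token text followed by an optional trailing "[ATTR]" label.
# Assumption: labels are uppercase letters, as in the sample output.
TOKEN_RE = re.compile(r"^(?P<text>.+?)(?:\[(?P<attr>[A-Z]+)\])?$")


def parse_result_line(line):
    """Split one result line into (token, attribute-or-None) pairs."""
    pairs = []
    for item in line.split():
        match = TOKEN_RE.match(item)
        pairs.append((match.group("text"), match.group("attr")))
    return pairs


if __name__ == "__main__":
    for token, attr in parse_result_line("百度公司[ORG] 的 名字 源于"):
        print(token, attr)
```

Tokens without a label come back with `None` as the attribute, so callers can filter for labeled tokens only.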

Since wordseg uses a statistical model to segment text according to context, for the same substring in different contexts, dif
