The wordseg project is a word segmentation module implemented in C#. It segments text into tokens and labels each token's attributes according to its context and semantics, using forward maximum matching and CRF (conditional random field) algorithms.
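
As a rough illustration of the forward maximum matching step, here is a minimal sketch in Python (not the project's C# code). The function name `forward_maximum_match` and the toy lexicon are hypothetical, and the CRF stage is not covered. At each position the sketch takes the longest lexicon entry that starts there and falls back to a single character when nothing matches.

```python
def forward_maximum_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation using the longest lexicon match."""
    if max_len is None:
        # Longest entry in the lexicon bounds the candidate window.
        max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a lexicon hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy lexicon, for illustration only.
lexicon = {"百度公司", "名字", "源于"}
tokens = forward_maximum_match("百度公司的名字源于", lexicon)
print(" ".join(tokens))  # 百度公司 的 名字 源于
```

Dictionary matching alone cannot resolve ambiguous substrings, which is where a statistical model such as a CRF can help.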

The following sentences are examples of text to be segmented:
张晓晨和付仲恺一起坐在家(西坝河东里社区)里的沙发上看非诚勿扰。
百度公司的名字源于“众里寻他千百度”这诗句。

After wordseg segments the sentences above, the result for each sentence is as follows:
张晓晨[PER] 和 付仲恺[PER] 一起 坐 在 家 ( 西坝河东里社区[LOC] ) 里 的 沙发[PDT] 上 看 非 诚 勿扰 。
百度公司[ORG] 的 名字 源于 “ 众 里 寻 他 千百度 ” 这 诗句 。

In the output above, when a token has attributes, they are appended to that token inside "[]".
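
The output format described above can be read back into (token, attribute) pairs. The following is a minimal Python sketch that assumes tokens are separated by whitespace and that an attribute is a run of uppercase letters in a trailing "[]", as in the sample output. The names `TOKEN_PATTERN` and `parse_segmented_line` are hypothetical and not part of wordseg.

```python
import re

# Token text, optionally followed by an attribute label such as [PER] or [ORG].
TOKEN_PATTERN = re.compile(r"^(?P<text>.+?)(?:\[(?P<attr>[A-Z]+)\])?$")


def parse_segmented_line(line):
    """Split one line of segmented output into (token, attribute) pairs."""
    pairs = []
    for item in line.split():
        match = TOKEN_PATTERN.match(item)
        # The attribute is None when the token carries no label.
        pairs.append((match.group("text"), match.group("attr")))
    return pairs


for token, attr in parse_segmented_line("百度公司[ORG] 的 名字 源于"):
    print(token, attr)
```

Tokens without a label, such as punctuation, come back with `None` as the attribute.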

Because wordseg uses a statistical model to segment text according to context, for the same substring in different contexts, dif

License

BSD License

Additional Project Details

Operating Systems

Windows

Languages

English, Chinese (Simplified)

Intended Audience

Information Technology, Science/Research, Education, Advanced End Users, Developers, Engineering

User Interface

Console/Terminal, Command-line

Programming Language

C#

Related Categories

C# Search Engines, C# Education Software, C# Linguistics Software

Registered

2011-05-14