Showing 18 open source projects for "segmentation"

View related business solutions
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • $300 Free Credits to Build on Google Cloud Icon
    $300 Free Credits to Build on Google Cloud

    New to Google Cloud? Get $300 in credits to explore Compute Engine, BigQuery, Cloud Run, Gemini Enterprise Agent Platform, and more.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query petabytes in BigQuery, or build agents with Gemini Enterprise Agent Platform. Once your credits are used, keep building with 20+ always-free tier products including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. No commitment required—just sign up and start building.
    Claim $300 Free
  • 1
    gse

    gse

    Go efficient multilingual NLP and text segmentation

    Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others. Gse is implements jieba by golang, and try add NLP support and more feature. Support common, search engine, full mode, precise mode and HMM mode multiple word segmentation modes. Support user and embed dictionary, Part-of-speech/POS tagging, analyze segment info, stop and trim words.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Underthesea

    Underthesea

    Underthesea - Vietnamese NLP Toolkit

    Underthesea is a Vietnamese NLP toolkit providing various text processing capabilities, including word segmentation, part-of-speech tagging, and named entity recognition.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    HanLP

    HanLP

    Han Language Processing

    ...Built on TensorFlow 2.0, it was designed to advance state-of-the-art deep learning techniques and popularize the application of natural language processing in both academia and industry. HanLP is capable of lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition), syntax analysis, text classification, and sentiment analysis. It comes with pretrained models for numerous languages including Chinese and English. It offers efficient performance, clear structure and customizable features, with plenty more amazing features to look forward to on the roadmap.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    Wink-NLP

    Wink-NLP

    Developer friendly Natural Language Processing

    Wink-NLP is a lightweight and fast natural language processing library for JavaScript, optimized for browser and Node.js environments.
    Downloads: 0 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    Hazm

    Hazm

    Persian NLP Toolkit

    Hazm is a natural language processing (NLP) library for Persian text, offering various tools for text preprocessing, tokenization, part-of-speech tagging, and more.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Apache OpenNLP

    Apache OpenNLP

    Apache OpenNLP

    Apache OpenNLP is a machine learning-based NLP library that provides tools for text-processing tasks such as tokenization, sentence segmentation, and named entity recognition.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    spaCy

    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it was designed to be used for real world applications-- for building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    OpenNN - Open Neural Networks Library

    OpenNN - Open Neural Networks Library

    Machine learning algorithms for advanced analytics

    OpenNN is a software library written in C++ for advanced analytics. It implements neural networks, the most successful machine learning method. Some typical applications of OpenNN are business intelligence (customer segmentation, churn prevention…), health care (early diagnosis, microarray analysis…) and engineering (performance optimization, predictive maitenance…). OpenNN does not deal with computer vision or natural language processing. The main advantage of OpenNN is its high performance. This library outstands in terms of execution speed and memory allocation. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    funNLP

    funNLP

    Resources, corpora, and tools for Chinese natural language processing

    ...The repository is organized into categories such as sentiment analysis, text classification, named entity recognition, knowledge graphs, and various lexicons (e.g. sensitive words, emotion dictionaries, stopwords). It also includes links to academic papers, open-source model implementations, and practical utilities like word segmentation or text cleaning scripts. The project is highly community-oriented, frequently updated with contributions and new resources, and it’s widely used in both academic and applied NLP research. Its value lies in providing not just tools but also curated, domain-specific data, which can be hard to find elsewhere.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 10
    Synonyms

    Synonyms

    Chinese synonyms, chat robot, intelligent question and answer toolkit

    Chinese Synonyms for natural language processing and understanding. Better Chinese synonyms, chatbot, intelligent question and answer toolkit. synonymsCan be used for many tasks in natural language understanding, text alignment, recommendation algorithms, similarity calculation, semantic shifting, keyword extraction, concept extraction, automatic summarization, search engines, etc. Print synonyms in a friendly way for easy debugging. "Synonyms Cilin" was compiled by Mei Jiaju and others in...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    fastNLP

    fastNLP

    fastNLP: A Modularized and Extensible NLP Framework

    ...Various convenient NLP tools, such as Embedding loading (including ELMo and BERT), intermediate data cache, etc.. Provide a variety of neural network components and recurrence models (covering tasks such as Chinese word segmentation, named entity recognition, syntactic analysis, text classification, text matching, metaphor resolution, summarization, etc.). Trainer provides a variety of built-in Callback functions to facilitate experiment recording, exception capture, etc. Automatic download of some datasets and pre-trained models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    CC-Net

    CC-Net

    Tools to download and cleanup Common Crawl data

    cc_net provides tools to download, segment, clean, and filter Common Crawl to build large-scale text corpora, including monolingual datasets and the multilingual CC-100 collection introduced in the associated paper. It includes pipelines to fetch snapshots, extract text, de-duplicate, identify language, and apply quality filtering based on heuristics and language models. The outputs are intended for pretraining language models and for creating standardized corpora that can be reproduced or...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Welsh Natural Language Toolkit
    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules; a) Word Segmentation for separating text into words b) Sentence Boundary Disambiguation for finding sentence boundaries c) Part of Speech Tagger for determining the part of speech of each word d) Morphological Analyser for identifying the root form (lemma) of words. The modules are written in JAVA and ‘wrapped’ for execution under the General Architecture for Text Engineering (GATE) framework. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    ZPar statistical parser. Universal language support (depending on the availability of training data), with language-specific features for Chinese and English. Currently support word segmentation, POS tagging, dependency and phrase-structure parsing.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 15
    Ansj Chinese word segmentation

    Ansj Chinese word segmentation

    Ansj word segmentation

    The real java implementation of ict. The word segmentation effect is faster than the open source version of ict. Chinese word segmentation, name recognition, part-of-speech tagging, user-defined dictionary. This is a java implementation of Chinese word segmentation based on n-Gram+CRF+HMM. The word segmentation speed reaches about 2 million words per second (tested under mac air), and the accuracy rate can reach more than 96%.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Welsh Natural Language Toolkit

    Welsh Natural Language Toolkit

    WNLT is a suite of open source natural language modules for the Welsh

    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules; a) Word Segmentation for separating text into words b) Sentence Boundary Disambiguation for finding sentence boundaries c) Part of Speech Tagger for determining the part of speech of each word d) Morphological Analyser for identifying the root form (lemma) of words. The modules are written in JAVA and ‘wrapped’ for execution under the General Architecture for Text Engineering (GATE) framework. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    D.U.C.K (Determine segmentation of Unknown words by using Context Knowledge)is an NLP tool, which aims to find the correct segmentation for unknown words in written Hebrew. Statistics from different scopes will be used to determine the segmentation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    JVnSegmenter is a Java-based and open-source Vietnamese word segmentation tool. The segmentation model was trained on about 8,000 sentences using Conditional Random Fields (FlexCRFs). This tool would be useful for Vietnamese NLP community.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo