Showing 30 open source projects for "stemming"

  • 1
    natural

    General natural language facilities for node

    ...While most of the algorithms are English-specific, contributors have implemented support for other languages. Russian and Spanish stemming have been added, as have stemming and tokenizing for more languages. If you’re just looking to use natural without your own node application, you can install via NPM.
    Downloads: 1 This Week
  • 2
    Natural Language Toolkit
    ...It provides a comprehensive suite of modules, datasets, and tutorials that support both symbolic and statistical approaches to language processing. The toolkit includes implementations of many foundational NLP algorithms and utilities, enabling developers to perform tasks such as tokenization, stemming, parsing, classification, and semantic reasoning. NLTK was originally developed to support research and teaching in computational linguistics and artificial intelligence, and it has become one of the most influential educational platforms for learning NLP in Python. The project also includes access to numerous linguistic corpora and lexical resources that can be downloaded and used directly in experiments and applications.
    Downloads: 2 This Week
  • 3
    Hazm

    Persian NLP Toolkit

    Hazm is a natural language processing (NLP) library for Persian text, offering various tools for text preprocessing, tokenization, part-of-speech tagging, and more.
    Downloads: 3 This Week
  • 4
    Dawarich

    Self-hostable alternative to Google Timeline

    Dawarich is a self-hostable web application for recording and visualizing your location history, offered as an alternative to Google Timeline.
    Downloads: 5 This Week
  • 5
    Searchkick

    Intelligent search made easy

    Searchkick brings powerful, production-ready search to Rails by mapping Active Record models into Elasticsearch with sensible defaults and easy customization. It supports language analyzers, stemming, synonyms, misspelling tolerance, and highlighting so search results feel natural to end users. Indexing is model-centric: you declare what fields to index, add computed fields, and trigger reindexing via callbacks or background jobs, with options for zero-downtime rolling reindexes. On the query side, a simple API covers relevance tuning, boosting, filtering, faceting/aggregations, and pagination, while still allowing direct access to advanced Elasticsearch features when needed. ...
    Downloads: 0 This Week
  • 6
    Wink-NLP

    Developer friendly Natural Language Processing

    Wink-NLP is a lightweight and fast natural language processing library for JavaScript, optimized for browser and Node.js environments.
    Downloads: 0 This Week
  • 7
    TNTSearch

    A fully featured full text search engine written in PHP

    TNTSearch is a full-text search engine written in PHP, designed to be integrated into Laravel and other PHP applications. It offers real-time, efficient indexing and searching of textual data using SQLite as its storage backend. TNTSearch is highly configurable and supports features like fuzzy searching, customizable ranking algorithms, and boolean search, making it a powerful tool for adding search functionality to websites and applications.
    Downloads: 0 This Week
  • 8
    Vespa

    The open big data serving engine

    Make AI-driven decisions using your data, in real-time. At any scale, with unbeatable performance. Vespa is a full-featured text search engine and supports both regular text search and fast approximate vector search (ANN). This makes it easy to create high-performing search applications at any scale, whether you want to use traditional techniques or a modern vector-based approach. You can even combine both approaches efficiently in the same query, something no other engine can do....
    Downloads: 14 This Week
  • 9
    Smile

    Statistical machine intelligence and learning engine

    Smile is a fast and comprehensive machine learning engine. With advanced data structures and algorithms, Smile delivers state-of-the-art performance. Compared to this third-party benchmark, Smile significantly outperforms R, Python, Spark, H2O, and xgboost. Smile is a couple of times faster than the closest competitor. The memory usage is also very efficient. If we can train advanced machine learning models on a PC, why buy a cluster? Write applications quickly in Java, Scala, or any JVM...
    Downloads: 7 This Week
  • 10
    Code Quality and Security for Java

    SonarSource Static Analyzer for Java Code Quality and Security

    ...Allow you to effortlessly repair your Java coding issues with just a click. Dozens of rules to ensure your tests are always as clean as your code! Dedicated rules to detect vulnerabilities including ones stemming from OWASP & CWE Top 25 guidelines. It all comes from a powerful analysis engine that we constantly refine. Sonar employs advanced rules along with smart, exclusive analysis techniques to find the trickiest, most elusive issues.
    Downloads: 5 This Week
  • 11
    GoldenDict
    A feature-rich dictionary lookup program that supports multiple dictionary formats, renders articles with the complete markup, illustrations, and other content retained, and lets you type in words without accents or correct case.
    Downloads: 982 This Week
  • 12
    Anti-Spam SMTP Proxy Server

    Anti-Spam SMTP Proxy Server implements multiple spam filters

    The Anti-Spam SMTP Proxy (ASSP) Server project aims to create an open source, platform-independent SMTP proxy server that implements auto-whitelists, self-learning Hidden-Markov-Model and/or Bayesian, Greylisting, DNSBL, DNSWL, URIBL, SPF, SRS, Backscatter, virus scanning, attachment blocking, Senderbase, and multiple other filter methods. Click 'Files' to download the professional version 2.8.1 build 24261. A Linux (Ubuntu 20.04 LTS) and a FreeBSD 12.2 based ready-to-run OVA of ASSP V2 are...
    Downloads: 39,067 This Week
  • 13
    mbFXWords

    Analyze text. Diagonal read subject, predicate, obj. Search other pdf.

    ...JavaFX application that runs on Oracle Java Runtime Environment version 8, which includes JavaFX. NLP extensions: - Divide sentences into subclauses: segmentation. - Divide plain text: subject, predicate, object. - Count words: stemming. - Search for similar content: PDFs. Outputs the subject, predicate, and object of sentences in PDF and plain text files. Provides a comfortable GUI. Automatic language detection.
    Downloads: 0 This Week
  • 14
    The corpus contains 81,000 tagged words of Arabic text drawn from Contemporary Arabic (CCA) [1] and Arabic Wikipedia [2], annotated with the basic tags (verb, noun, adjective). [1] http://www.comp.leeds.ac.uk/eric/latifa/research.htm. [2] http://ar.wikipedia.org.
    Downloads: 0 This Week
  • 15

    ooPorter

    A Porter stemming or stemmer algorithm coded in ooRexx

    This is a line-by-line port from ANSI C to ooRexx of the stemming routine published by Martin Porter in 1980. The original source code from Porter has been commented out and emulated by the corresponding (oo)Rexx code as far as possible. This is not an example of good or fast (oo)Rexx programming; it is merely a demonstration of the Porter stemming routine ported to ooRexx. Use and modify as necessary.
    Downloads: 0 This Week
  • 16
    The Apelon DTS (Distributed Terminology System) is an integrated set of open source components that provides comprehensive terminology services in distributed application environments.
    Downloads: 3 This Week
  • 17

    TextProcessor

    A Java package to preprocess text datasets for subsequent text analysis

    The TextProcessor Java package is a text processing toolkit that provides frequently used text processing functions such as stemming, removing stop-words, generating a term vocabulary, and calculating the term-doc frequency matrix. Basic topic mining models such as LDA and sparse NMF are also supported. The package can also generate feature files from a given text dataset in LDA and LIBSVM formats for subsequent procedures such as classification or clustering.
    Downloads: 0 This Week
  • 18
    Arabic Computational Linguistics resources and Tools, Arabic Text Mining Tools, Arabic Language tools, Arabic Morphological Analysis (Stemming / Light Stemming), Arabic text preprocessing, Arabic Corpora, Open Source Arabic Corpora OSAC, Comparable Corpora. For more information: http://sites.google.com/site/motazsite
    Downloads: 17 This Week
  • 19

    JAVA Arabic Stemmer

    A small Java class for stemming Arabic words

    A Java Arabic stemmer based on Shereen Khoja's algorithm. This Java class offers a function called stemWrod, which takes an Arabic word and returns its stem.
    Downloads: 0 This Week
  • 20
    MongoLantern - MongoDB Fulltext Search

    Open Source MongoDB Fulltext Search Server

    MongoLantern is an open source full-text search server that uses MongoDB as index storage, which allows MongoLantern to take any changes into account very easily using the MongoDB API. It was originally written in PHP and can be migrated to any desired language as required using its future APIs. MongoLantern 0.7 - Stable/Production Release: 1. MongoLantern API support enabled. 2. CSV indexer added as a plugin. 3. node.js API client added.
    Downloads: 0 This Week
  • 21
    Guia Brasil: a guide to economical tourist resources in Brazil with ethnic and behavioral mapping, used to define a parameter of tastes and trends; incorporates the Porter stemming algorithm for data mining.
    Downloads: 0 This Week
  • 22
    The Neurpheus Morphological Analyser performs morphological analysis, stemming or word form generation tasks using sophisticated classification methods for an analysis of words unseen in a training dictionary.
    Downloads: 0 This Week
  • 23
    Based on the Buckwalter Morphological Analyzer (Version 1.0) for doing Arabic stemming and POS tagging. Includes a rewrite of the original Perl script, with better documentation and more flexible options, and a C++ interface (usable as a library or app).
    Downloads: 0 This Week
  • 24
    PHP-based Japanese verb stemmer using a dictionary-supported suffix stemming model
    Downloads: 0 This Week
  • 25
    Cubit is an Azureus plugin that enables decentralized, approximate keyword search of torrents within the Azureus client. It provides accurate and useful results even with errors in the search terms, stemming from typos and common spelling variations.
    Downloads: 0 This Week