Open Source Windows Natural Language Processing (NLP) Tools

Natural Language Processing (NLP) Tools for Windows

View 40 business solutions

Browse free open source Natural Language Processing (NLP) tools and projects for Windows below. Use the toggles on the left to filter open source Natural Language Processing (NLP) tools by OS, license, language, programming language, and project status.

  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    Build gen AI apps with an all-in-one modern database: MongoDB Atlas

    MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.
    Start Free
  • AestheticsPro Medical Spa Software Icon
    AestheticsPro Medical Spa Software

    Our new software release will dramatically improve your medspa business performance while enhancing the customer experience

    AestheticsPro is the most complete Aesthetics Software on the market today. HIPAA Cloud Compliant with electronic charting, integrated POS, targeted marketing and results driven reporting; AestheticsPro delivers the tools you need to manage your medical spa business. It is our mission To Provide an All-in-One Cutting Edge Software to the Aesthetics Industry.
    Learn More
  • 1
    Virastyar

    Virastyar

    Virastyar is an spell checker for low-resource languages

    Virastyar is a free and open-source (FOSS) spell checker. It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for processing low-resource languages, especially Persian and RTL languages Publications: Kashefi, O., Nasri, M., & Kanani, K. (2010). Towards Automatic Persian Spell Checking. SCICT. Kashefi, O., Sharifi, M., & Minaie, B. (2013). A novel string distance metric for ranking Persian respelling suggestions. Natural Language Engineering, 19(2), 259-284. Rasooli, M. S., Kahefi, O., & Minaei-Bidgoli, B. (2011). Effect of adaptive spell checking in Persian. In NLP-KE Contributors: Omid Kashefi Azadeh Zamanifar Masoumeh Mashaiekhi Meisam Pourafzal Reza Refaei Mohammad Hedayati Kamiar Kanani Mehrdad Senobari Sina Iravanin Mohammad Sadegh Rasooli Mohsen Hoseinalizadeh Mitra Nasri Alireza Dehlaghi Fatemeh Ahmadi Neda PourMorteza
    Leader badge
    Downloads: 539 This Week
    Last Update:
    See Project
  • 2
    MeCab is a fast and customizable Japanese morphological analyzer. MeCab is designed for generic purpose and applied to variety of NLP tasks, such as Kana-Kanji conversion. MeCab provides parameter estimation functionalities based on CRFs and HMM
    Leader badge
    Downloads: 1,554 This Week
    Last Update:
    See Project
  • 3
    OpenVINO

    OpenVINO

    OpenVINO™ Toolkit repository

    OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime, Post-Training Optimization Tool, as well as CPU, GPU, MYRIAD, multi device and heterogeneous plugins to accelerate deep learning inferencing on Intel® CPUs and Intel® Processor Graphics. It supports pre-trained models from the Open Model Zoo, along with 100+ open source and public models in popular formats such as TensorFlow, ONNX, PaddlePaddle, MXNet, Caffe, Kaldi.
    Downloads: 38 This Week
    Last Update:
    See Project
  • 4
    spaCy

    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it was designed to be used for real world applications-- for building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with an accuracy within 1% of the best available. It's blazing fast, easy to install and comes with a simple and productive API.
    Downloads: 10 This Week
    Last Update:
    See Project
  • Field Sales+ for MS Dynamics 365 and Salesforce Icon
    Field Sales+ for MS Dynamics 365 and Salesforce

    Maximize your sales performance on the go.

    Bring Dynamics 365 and Salesforce wherever you go with Resco’s solution. With powerful offline features and reliable data syncing, your team can access CRM data on mobile devices anytime, anywhere. This saves time, cuts errors, and speeds up customer visits.
    Learn More
  • 5
    torchtext

    torchtext

    Data loaders and abstractions for text and NLP

    We recommend Anaconda as a Python package management system. Please refer to pytorch.org for the details of PyTorch installation. LTS versions are distributed through a different channel than the other versioned releases. Alternatively, you might want to use the Moses tokenizer port in SacreMoses (split from NLTK). You have to install SacreMoses. To build torchtext from source, you need git, CMake and C++11 compiler such as g++. When building from source, make sure that you have the same C++ compiler as the one used to build PyTorch. A simple way is to build PyTorch from source and use the same environment to build torchtext. If you are using the nightly build of PyTorch, check out the environment it was built with conda (here) and pip (here). Text classification: SST2, AG_NEWS, SogouNews, DBpedia, YelpReviewPolarity, YelpReviewFull, YahooAnswers, AmazonReviewPolarity, AmazonReviewFull, IMDB, etc.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 6
    VADER

    VADER

    Lexicon and rule-based sentiment analysis tool

    VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool designed for analyzing the sentiment of text, particularly in social media and short text formats. It is optimized for quick and accurate analysis of positive, negative, and neutral sentiments.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 7
    AWS Toolkit for Visual Studio Code

    AWS Toolkit for Visual Studio Code

    Local Lambda debug, CodeWhisperer, SAM/CFN syntax, etc.

    The AWS Toolkit extension for Visual Studio Code enables you to interact with Amazon Web Services (AWS). Try the AWS Code Sample Catalog to start coding with the AWS SDK. The AWS Explorer provides access to the AWS services that you can work with when using the Toolkit. To see the AWS Explorer, choose the AWS icon in the Activity bar. The Developer Tools panel is a section for developer-focused tooling curated for working in an IDE. The Developer Tools panel can be found underneath the AWS Explorer when the AWS icon is selected in the Activity bar. The AWS CDK Explorer enables you to work with AWS Cloud Development Kit (CDK) applications. It shows a top-level view of your CDK applications that have been synthesized in your workspace. Amazon CodeWhisperer provides inline code suggestions using machine learning and natural language processing on the contents of your current file. Supported languages include Java, Python and Javascript.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 8
    Haystack

    Haystack

    Haystack is an open source NLP framework to interact with your data

    Apply the latest NLP technology to your own data with the use of Haystack's pipeline architecture. Implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications. Evaluate components and fine-tune models. Ask questions in natural language and find granular answers in your documents using the latest QA models with the help of Haystack pipelines. Perform semantic search and retrieve ranked documents according to meaning, not just keywords! Make use of and compare the latest pre-trained transformer-based languages models like OpenAI’s GPT-3, BERT, RoBERTa, DPR, and more. Pick any Transformer model from Hugging Face's Model Hub, experiment, find the one that works. Use Haystack NLP components on top of Elasticsearch, OpenSearch, or plain SQL. Boost search performance with Pinecone, Milvus, FAISS, or Weaviate vector databases, and dense passage retrieval.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 9
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI architectures are typically large and require a lot of data and compute for training. NeMo uses PyTorch Lightning for easy and performant multi-GPU/multi-node mixed-precision training. Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC. NGC collection of pre-trained speech processing models.
    Downloads: 6 This Week
    Last Update:
    See Project
  • Trumba is an All-in-one Calendar Management and Event Registration platform Icon
    Trumba is an All-in-one Calendar Management and Event Registration platform

    Great for live, virtual and hybrid events

    Publish, promote and track your events more affordably and effectively—all in one place.
    Learn More
  • 10
    Ciphey

    Ciphey

    Decrypt encryptions without knowing the key or cipher

    Fully automated decryption/decoding/cracking tool using natural language processing & artificial intelligence, along with some common sense. You don't know, you just know it's possibly encrypted. Ciphey will figure it out for you. Ciphey can solve most things in 3 seconds or less. Ciphey aims to be a tool to automate a lot of decryptions & decodings such as multiple base encodings, classical ciphers, hashes or more advanced cryptography. If you don't know much about cryptography, or you want to quickly check the ciphertext before working on it yourself, Ciphey is for you. The technical part. Ciphey uses a custom-built artificial intelligence module (AuSearch) with a Cipher Detection Interface to approximate what something is encrypted with. And then a custom-built, customizable natural language processing Language Checker Interface, which can detect when the given text becomes plaintext.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 11
    Machine Learning PyTorch Scikit-Learn

    Machine Learning PyTorch Scikit-Learn

    Code Repository for Machine Learning with PyTorch and Scikit-Learn

    Initially, this project started as the 4th edition of Python Machine Learning. However, after putting so much passion and hard work into the changes and new topics, we thought it deserved a new title. So, what’s new? There are many contents and additions, including the switch from TensorFlow to PyTorch, new chapters on graph neural networks and transformers, a new section on gradient boosting, and many more that I will detail in a separate blog post. For those who are interested in knowing what this book covers in general, I’d describe it as a comprehensive resource on the fundamental concepts of machine learning and deep learning. The first half of the book introduces readers to machine learning using scikit-learn, the defacto approach for working with tabular datasets. Then, the second half of this book focuses on deep learning, including applications to natural language processing and computer vision.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 12
    Search-Index

    Search-Index

    A persistent, network resilient, full text search library

    Search-Index is a lightweight and fast JavaScript-based search engine that enables full-text search indexing and retrieval for web applications.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 13
    gse

    gse

    Go efficient multilingual NLP and text segmentation

    Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others. Gse is implements jieba by golang, and try add NLP support and more feature. Support common, search engine, full mode, precise mode and HMM mode multiple word segmentation modes. Support user and embed dictionary, Part-of-speech/POS tagging, analyze segment info, stop and trim words. Support multilingual: English, Chinese, Japanese and others. Support Traditional Chinese. Support HMM cut text use Viterbi algorithm. Support NLP by TensorFlow (in work). Named Entity Recognition (in work). Supports with elastic search and bleve. run JSON RPC service.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 14
    Diffgram

    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    From ingesting data to exploring it, annotating it, and managing workflows. Diffgram is a single application that will improve your data labeling and bring all aspects of training data under a single roof. Diffgram is world’s first truly open source training data platform that focuses on giving its users an unlimited experience. This is aimed to reduce your data labeling bills and increase your Training Data Quality. Training Data is the art of supervising machines through data. This includes the activities of annotation, which produces structured data; ready to be consumed by a machine learning model. Annotation is required because raw media is considered to be unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases including computer vision, natural language processing and speech recognition.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    ModelScope

    ModelScope

    Bring the notion of Model-as-a-Service to life

    ModelScope is built upon the notion of “Model-as-a-Service” (MaaS). It seeks to bring together most advanced machine learning models from the AI community, and streamlines the process of leveraging AI models in real-world applications. The core ModelScope library open-sourced in this repository provides the interfaces and implementations that allow developers to perform model inference, training and evaluation. In particular, with rich layers of API abstraction, the ModelScope library offers unified experience to explore state-of-the-art models spanning across domains such as CV, NLP, Speech, Multi-Modality, and Scientific-computation. Model contributors of different areas can integrate models into the ModelScope ecosystem through the layered APIs, allowing easy and unified access to their models. Once integrated, model inference, fine-tuning, and evaluations can be done with only a few lines of code.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    AI learning

    AI learning

    AiLearning, data analysis plus machine learning practice

    We actively respond to the Research Open Source Initiative (DOCX) . Open source today is not just open source, but datasets, models, tutorials, and experimental records. We are also exploring other categories of open source solutions and protocols. I hope you will understand this initiative, combine this initiative with your own interests, and do what you can. Everyone's tiny contributions, together, are the entire open source ecosystem. We are iBooker, a large open-source community, we-media, and online earning community, with a QQ group of more than 10,000 people and at least 10,000 subscribers. The number of Github Stars exceeds 60k, and it ranks in the top 100 of all Github organizations. The daily up of all its websites exceeds 4k, and the peak of Alexa ranking is 20k. Our core members are certified as CSDN blog experts and short-book programmers as excellent authors. We have established ApacheCN, a non-profit document, and tutorial translation project.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 17
    Docspell

    Docspell

    Assist in organizing your piles of documents

    Docspell is a personal document organizer. Or sometimes called a "Document Management System" (DMS). You'll need a scanner to convert your papers into files. Docspell can then assist in organizing the resulting mess. It can unify your files from scanners, emails, and other sources. It is targeted for home use, i.e. families, households, and also for smaller groups/companies. You can associate tags, set correspondent,s and lots of other predefined and custom metadata. If your documents are associated with such metadata, you can quickly find them later using the search feature. However adding this manually is a tedious task. Docspell can help by suggesting correspondents, guessing tags or finding dates using machine learning. It can learn metadata from existing documents and find things using NLP. This makes adding metadata to your documents a lot easier. For machine learning, it relies on the free (GPL) Stanford Core NLP library.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18
    Spark NLP

    Spark NLP

    State of the Art Natural Language Processing

    Experience the power of large language models like never before, unleashing the full potential of Natural Language Processing (NLP) with Spark NLP, the open source library that delivers scalable LLMs. The full code base is open under the Apache 2.0 license, including pre-trained models and pipelines. The only NLP library built natively on Apache Spark. The most widely used NLP library in the enterprise. Spark ML provides a set of machine learning applications that can be built using two main components, estimators and transformers. The estimators have a method that secures and trains a piece of data to such an application. The transformer is generally the result of a fitting process and applies changes to the target dataset. These components have been embedded to be applicable to Spark NLP. Pipelines are a mechanism for combining multiple estimators and transformers in a single workflow. They allow multiple chained transformations along a machine-learning task.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 19
    Weaviate

    Weaviate

    Weaviate is a cloud-native, modular, real-time vector search engine

    Weaviate in a nutshell: Weaviate is a vector search engine and vector database. Weaviate uses machine learning to vectorize and store data, and to find answers to natural language queries. With Weaviate you can also bring your custom ML models to production scale. Weaviate in detail: Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer-Extraction, Classification, Customizable Models (PyTorch/TensorFlow/Keras), and more. Built from scratch in Go, Weaviate stores both objects and vectors, allowing for combining vector search with structured filtering with the fault-tolerance of a cloud-native database, all accessible through GraphQL, REST, and various language clients.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    Subliminal Blaster 4

    Subliminal Blaster 4

    Subliminal Blaster Powered 4 - Mude seus Hábitos! Change your habits

    Subliminal Blaster is a NLP software that shows text subliminal messages in your computer screen while you use it normaly for your activities. It re-programs your mind in a subconscious level while you exercite your conscious with your activities like browsing, working, watching video and others. Subliminal Blaster é um software de PNL que exibe mensagens subliminares na tela do PC enquanto você utiliza normalmente para suas atividades. Ele reprograma sua mente a nível subconsciente enquanto você exercita seu consciente em suas atividades. WE ARE NOW ON VERSION 4! Please support the project by donating bitcoins 1GRYGnSmpuU1ZuXodn2H9UVEpVRBx5CTL2 Or dogecoins! DBfkGrdLvmpbYQzcRCm9KLUuPk9Zigjjod Would you like to contribute? Go to our Facebook page! https://www.facebook.com/SubliminalBlasterIntl/
    Leader badge
    Downloads: 25 This Week
    Last Update:
    See Project
  • 21
    OpenNLP provides the organizational structure for coordinating several different projects which approach some aspect of Natural Language Processing. OpenNLP also defines a set of Java interfaces and implements some basic infrastructure for NLP compon
    Leader badge
    Downloads: 23 This Week
    Last Update:
    See Project
  • 22
    Awesome Fraud Detection Research Papers

    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    Datasets

    Datasets

    Hub of ready-to-use datasets for ML models

    Datasets is a library for easily accessing and sharing datasets, and evaluation metrics for Natural Language Processing (NLP), computer vision, and audio tasks. Load a dataset in a single line of code, and use our powerful data processing methods to quickly get your dataset ready for training in a deep learning model. Backed by the Apache Arrow format, process large datasets with zero-copy reads without any memory constraints for optimal speed and efficiency. We also feature a deep integration with the Hugging Face Hub, allowing you to easily load and share a dataset with the wider NLP community. There are currently over 2658 datasets, and more than 34 metrics available. Datasets naturally frees the user from RAM memory limitation, all datasets are memory-mapped using an efficient zero-serialization cost backend (Apache Arrow). Smart caching: never wait for your data to process several times.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    GT NLP Class

    GT NLP Class

    Course materials for Georgia Tech CS 4650 and 7650

    This repository contains lecture notes, slides, assignments, and code for a university-level Natural Language Processing course. It spans core NLP topics such as language modeling, sequence tagging, parsing, semantics, and discourse, alongside modern machine learning methods used to solve them. Students work through programming exercises and problem sets that build intuition for both classical algorithms (like HMMs and CRFs) and neural approaches (like word embeddings and sequence models). The materials emphasize theory grounded in practical experimentation, often via Python notebooks or scripts that visualize results and encourage ablation studies. Clear organization and self-contained examples make it possible to follow along outside the classroom, using the repo as a self-study resource. For learners and instructors alike, the course provides a coherent path from foundational linguistics to current techniques, with reproducible code that makes concepts concrete.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    HanLP

    HanLP

    Han Language Processing

    HanLP is a multilingual Natural Language Processing (NLP) library composed of a series of models and algorithms. Built on TensorFlow 2.0, it was designed to advance state-of-the-art deep learning techniques and popularize the application of natural language processing in both academia and industry. HanLP is capable of lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition), syntax analysis, text classification, and sentiment analysis. It comes with pretrained models for numerous languages including Chinese and English. It offers efficient performance, clear structure and customizable features, with plenty more amazing features to look forward to on the roadmap.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next