Showing 15 open source projects for "text analysis"

View related business solutions
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 1
    Spoon

    Spoon

    Metaprogramming library to analyze and transform Java source code

    Spoon is an open-source library to analyze, rewrite, transform, transpile Java source code. It parses source files to build a well-designed AST with powerful analysis and transformation API. It supports modern Java versions up to Java 20. Spoon is an official Inria open-source project, and member of the OW2 open-source consortium.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 2
    Stanza

    Stanza

    Stanford NLP Python library for many human languages

    Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of speech and morphological features, to give a syntactic structure dependency parse, and to recognize named entities. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    Article Extractor

    Article Extractor

    To extract main article from given URL with Node.js

    A Node.js library for extracting main content from web articles, removing unnecessary clutter like ads and navigation elements.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    LangExtract

    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material. Each extracted entity is precisely grounded in its original context, allowing visual inspection and...
    Downloads: 2 This Week
    Last Update:
    See Project
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • 5
    DocWire SDK

    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20AI driven data processing tool, has received award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to expand its capabilities, focusing on versatile data extraction, platform support, and seamless integration with various systems. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    Interpret-Text

    Interpret-Text

    State-of-the-art explainers for text-based machine learning models

    A library that incorporates state-of-the-art explainers for text-based machine learning models and visualizes the result with a built-in dashboard. Interpret-Text builds on Interpret, an open source python package for training interpretable models and helping to explain blackbox machine learning systems. We have added extensions to support text models. Interpret-Text incorporates community-developed interpretability techniques for NLP models and a visualization dashboard to view the results. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    CodeQL

    CodeQL

    Libraries and queries that power security researchers

    CodeQL is a semantic code analysis engine that treats programs as queryable databases, enabling users to write expressive queries that identify security vulnerabilities, logic bugs, and code quality issues across large codebases. Instead of just pattern matching text, CodeQL ingests source code, builds rich representations of structure and data flow, and allows queries that reason about control flow, type systems, and interprocedural relationships.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    fastNLP

    fastNLP

    fastNLP: A Modularized and Extensible NLP Framework

    ...Various convenient NLP tools, such as Embedding loading (including ELMo and BERT), intermediate data cache, etc.. Provide a variety of neural network components and recurrence models (covering tasks such as Chinese word segmentation, named entity recognition, syntactic analysis, text classification, text matching, metaphor resolution, summarization, etc.). Trainer provides a variety of built-in Callback functions to facilitate experiment recording, exception capture, etc. Automatic download of some datasets and pre-trained models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    PyCBC

    PyCBC

    Learn how to use PyCBC to analyze gravitational-wave data

    PyCBC is a software developed by a collaboration of LIGO, Virgo, and independent scientists. It is open source and freely available. We use PyCBC in the detection of gravitational waves from binary mergers such as GW150914. These examples explore how to analyze gravitational wave data, how we find potential signals and learn about them. Many of these tutorials will require you to make edits to config files as part of their exercises. At the moment this isn't easy to do on services like...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Go from Code to Production URL in Seconds Icon
    Go from Code to Production URL in Seconds

    Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

    Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
    Try it free
  • 10
    Deeplearning-papernotes

    Deeplearning-papernotes

    Summaries and notes on Deep Learning research papers

    Deeplearning-papernotes is an implementation of Convolutional Neural Networks for sentence and text classification in TensorFlow, based on a well-known research paper that applies CNN architectures to natural language processing tasks with strong performance in sentiment analysis and similar classification problems. The repository provides the complete network definition, including an embedding layer to convert words into dense representations, convolution and max-pooling layers to extract informative features, and a final softmax classifier to distinguish between target classes. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Gumbo

    Gumbo

    An HTML5 parsing library in pure C99

    Gumbo is an implementation of the HTML5 parsing algorithm implemented as a pure C99 library with no outside dependencies. It's designed to serve as a building block for other tools and libraries such as linters, validators, templating languages, and refactoring and analysis tools. Gumbo gains some of this by virtue of being written in C, but it is not an important consideration for the intended use-case, and was not a major design factor. Gumbo is intentionally designed to turn an HTML...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    TextBlob

    TextBlob

    TextBlob is a Python library for processing textual data

    Simple, Pythonic, text processing, Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13

    Fast Matrix for Java

    General purpose matrix utilities for Java in Parallel Computing

    Fast Matrix for Java (fm4j) is a general-purpose matrix utility library for computing with dense matrices. fm4j encapsulated different underlying implementations and select the optimal one in run-time depending on the size of the input matrix. Moreover, fm4j employs Java (Tm) Concurrency to take advantage of the computation power of multi-cor processors.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Voikko

    Voikko

    Library of linguistic tools

    Voikko is a spell checking, grammar checking, morphological analysis and hyphenation system. Spell checkers are available for multiple languages, other features for Finnish only.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 15
    The Scheme Natural Language Toolkit (S-NLTK) is a Scheme R6RS library for language and text processing, and various tasks related to symbolic and statistical analysis of language data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB