Showing 125 open source projects for "analyze text"

  • 1
    Spoon

    Metaprogramming library to analyze and transform Java source code

    Spoon is an open-source library to analyze, rewrite, transform, and transpile Java source code. It parses source files to build a well-designed AST with a powerful analysis and transformation API. It supports modern Java versions up to Java 20. Spoon is an official Inria open-source project and a member of the OW2 open-source consortium.
    Downloads: 12 This Week
  • 2
    TeXtidote

    Spelling, grammar and style checking on LaTeX documents

    If so, you probably know that the process is far from simple. Since LaTeX documents contain special commands and keywords (the so-called "markup") that are not part of the "real" text, you cannot run a grammar checker directly on these files: it cannot tell the difference between markup and text. The other option is to remove all this markup, leaving only the "clear" text; however, when a grammar tool points to a problem at a specific line in this clear text, it becomes hard to retrace that...
    Downloads: 128 This Week
  • 3
    Zed

    High-performance, multiplayer code editor from the creators of Atom

    ...Fast native terminal tightly integrates with Zed's language-aware task runner and AI capabilities. First-class modal editing via Vim bindings, including features like text objects and marks. Zed is built by a global community of thousands of developers. Boost your Zed experience by choosing from hundreds of extensions that broaden language support, offer different themes, and more.
    Downloads: 17 This Week
  • 4
    TRIBE v2

    A multimodal model for brain response prediction

    ...TRIBE v2 allows researchers to simulate and analyze brain activity without requiring direct human experiments. Overall, it provides a powerful tool for studying perception, cognition, and multimodal processing in the brain.
    Downloads: 11 This Week
  • 5
    Open Notebook

    An Open Source implementation of Notebook LM with more flexibility

    ...The platform supports 16+ AI providers—including OpenAI, Anthropic, Ollama, Google, and LM Studio—allowing flexible model choice and cost optimization. Open Notebook enables users to organize and analyze multi-modal content such as PDFs, videos, audio files, web pages, and Office documents. It combines full-text and vector search with context-aware AI chat to deliver insights grounded in your own research materials. With advanced features like multi-speaker podcast generation, customizable content transformations, and a comprehensive REST API, Open Notebook provides a powerful and extensible research environment.
    Downloads: 25 This Week
  • 6
    gse

    Efficient multilingual NLP and text segmentation in Go

    Efficient multilingual NLP and text segmentation in Go, with support for English, Chinese, Japanese, and other languages. Gse is a Go implementation of jieba that aims to add NLP support and more features. It supports multiple word segmentation modes: common, search engine, full, precise, and HMM. It also supports user and embedded dictionaries, part-of-speech (POS) tagging, segment info analysis, and stop-word and trim-word handling.
    Downloads: 0 This Week
  • 7
    Logseq

    A privacy-first, open-source platform for knowledge management

    Logseq is a privacy-first, open-source knowledge base that works on top of local plain-text Markdown and Org-mode files. Use it to write, organize and share your thoughts, keep your to-do list, and build your own digital garden. Logseq is a platform for knowledge management and collaboration. It focuses on privacy, longevity, and user control. The server will never store or analyze your private notes. Your data are plain text files and we currently support both Markdown and Emacs Org-mode (more to be added soon). ...
    Downloads: 13 This Week
  • 8
    ARIS

    Lightweight Markdown-only skills for autonomous ML research

    ARIS is an experimental automation framework that leverages AI coding agents to perform continuous research and development tasks autonomously, even without active user supervision. The system is designed to run iterative cycles of research, coding, testing, and refinement, effectively simulating a “sleep mode” where productive work continues in the background. It integrates with AI tools such as Claude Code to generate solutions, analyze results, and improve outputs over time. The project...
    Downloads: 1 This Week
  • 9
    Everywhere

    Context-aware desktop AI assistant that understands screen content

    Everywhere is a context-aware desktop AI assistant designed to interact directly with the content displayed on a user’s screen. It distinguishes itself from traditional AI tools by eliminating the need for manual input methods such as copying text or taking screenshots, instead allowing users to invoke assistance instantly through a shortcut. It can analyze on-screen information in real time and provide contextual responses, making it useful for tasks like troubleshooting errors, summarizing articles, translating text, and refining written content. It integrates with multiple large language model providers and supports various tools, enabling flexible and extensible AI-powered workflows. ...
    Downloads: 9 This Week
  • 10
    OpenAI Privacy Filter

    Bidirectional token-classification model for identifiable info

    OpenAI Privacy Filter is an open-weight machine learning model designed to detect and mask personally identifiable information in text with high efficiency and contextual awareness. It operates as a bidirectional token classification system that labels sensitive data in a single forward pass rather than generating text sequentially, enabling fast processing for large datasets. The model supports long-context inputs, allowing it to analyze extensive documents without chunking, which improves consistency in redaction tasks. ...
    Downloads: 8 This Week
  • 11
    Telegram-OSINT

    https://github.com/The-Osint-Toolbox/Telegram-OSINT

    Telegram-OSINT is an extensive open source repository that compiles tools, techniques, and resources for conducting open source intelligence investigations on the Telegram platform. It serves as a central reference for analysts, researchers, and investigators who want to discover, analyze, and collect publicly available information from Telegram channels, groups, and bots. It organizes a wide variety of utilities that interact with Telegram’s API to gather data such as channel details,...
    Downloads: 36 This Week
  • 12
    LLM-Aided OCR Project

    Enhances Tesseract OCR output using LLMs (local or API)

    LLM Aided OCR is an open-source system designed to improve optical character recognition accuracy by combining traditional OCR tools with large language models. The project addresses common OCR challenges such as distorted text, unusual fonts, historical documents, and complex layouts that often produce inaccurate results with standard OCR pipelines. The system first extracts raw text using OCR engines and then applies language models to analyze and correct recognition errors based on context. This AI-assisted correction process helps reconstruct missing characters, fix formatting mistakes, and produce more coherent text outputs. ...
    Downloads: 0 This Week
  • 13
    Instagram OSINT Tool

    Instagram OSINT tool for gathering profile data and public posts

    ...In addition to profile information, it can also retrieve post-related data and download publicly available images associated with an account. The results are saved locally in structured formats such as JSON-style data inside text files, making them easy to analyze or integrate into other applications. InstagramOSINT also exposes a Python API so developers can import the functionality.
    Downloads: 56 This Week
  • 14
    deepdoctection

    A Repo For Document AI

    ...For more specific text processing tasks use one of the many other great NLP libraries.
    Downloads: 0 This Week
  • 15
    X-osint

    Open source OSINT tool for gathering data on emails, phones, and IPs

    X-osint is an open source intelligence framework designed to collect and analyze publicly available information from multiple sources. It focuses on gathering useful and credible data about entities such as phone numbers, email addresses, and IP addresses using a range of automated OSINT techniques. It provides investigators and researchers with a centralized interface for running information-gathering tasks that would normally require multiple separate tools.
    Downloads: 44 This Week
  • 16
    rollama

    Wrap the Ollama API, which allows you to run different LLMs

    rollama is an R package that provides a convenient interface for interacting with local large language models through the Ollama API, bringing modern AI capabilities into the R ecosystem. It is designed to make LLM usage accessible to data scientists and researchers who work primarily in R, allowing them to generate text, analyze data, and create embeddings without relying on external cloud services. The package emphasizes reproducibility and privacy by enabling local execution of models, which is especially valuable for sensitive or research-oriented workflows. It supports common LLM tasks such as text generation, annotation, and embedding creation, making it useful for tasks like document analysis and data labeling. ...
    Downloads: 0 This Week
  • 17
    AI-Researcher

    AI-Researcher: Autonomous Scientific Innovation

    ...It lets users input high-level research goals or questions in natural language and then automatically plans, decomposes, and executes tasks such as literature surveying, summarization, synthesis, experiment design, and draft generation. The system integrates retrieval mechanisms to pull in external knowledge sources, contextually analyze documents and papers, and build structured representations of ideas and arguments that can later be turned into coherent reports or drafts. Rather than simply generating text from prompts, AI-Researcher orchestrates sequences of subtasks — such as extracting definitions, identifying key experiments, and tracking citations — and uses self-refinement loops to iteratively improve outputs.
    Downloads: 5 This Week
  • 18
    Matter AI

    Matter AI is an open-source AI code reviewer agent

    Matter AI is an AI-powered platform designed to enhance productivity through automated content generation, data analysis, and decision support. It leverages machine learning models to process text, analyze patterns, and generate insights, making it suitable for businesses looking to optimize data-driven decision-making. Matter AI integrates with various data sources and provides customizable AI workflows tailored to different industries.
    Downloads: 1 This Week
  • 19
    VOIP-VOICE-TO-TEXT&ANALYS

    Convert VoIP calls to text and analyze them with AI

    The VoIP voice-to-text software for Issabel is an intelligent, AI-based solution that converts calls into accurate Persian text. After each call, the audio file is sent to the GPT-4o AI engine, producing editable transcripts. The software also provides AI-powered call analysis, extracting key points, customer requests, satisfaction levels, and sensitive topics, all stored in the database. This helps sales and support teams make faster decisions, improve response quality, and enhance customer...
    Downloads: 0 This Week
  • 20
    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 11 This Week
  • 21
    Windrecorder

    Windrecorder is a memory search app that records everything

    ...It captures screen content locally and builds a searchable database using OCR and image understanding, allowing users to rewind and rediscover anything they have previously seen. The system indexes only meaningful visual changes, extracting text, browser data, and contextual information to improve search accuracy and reduce storage overhead. It includes a web-based interface where users can browse timelines, analyze activity, and perform semantic queries on recorded content. The tool emphasizes privacy by running entirely offline, ensuring that all captured data remains on the user’s device without external transmission. ...
    Downloads: 5 This Week
  • 22
    Learning Interpretability Tool

    Interactively analyze ML models to understand their behavior

    The Learning Interpretability Tool (LIT, formerly known as the Language Interpretability Tool) is a visual, interactive ML model-understanding tool that supports text, image, and tabular data. It can be run as a standalone server, or inside of notebook environments such as Colab, Jupyter, and Google Cloud Vertex AI notebooks.
    Downloads: 0 This Week
  • 23
    Tookie-OSINT

    Username OSINT tool for discovering accounts across many websites

    Tookie-OSINT is an open source intelligence tool designed to help security researchers, ethical hackers, and investigators discover online accounts associated with a specific username. It automates the process of searching for usernames across multiple websites, making it easier to identify a person's presence on different platforms. By entering a target username, Tookie-OSINT scans a list of supported sites and checks whether the username exists on those platforms. This approach removes the...
    Downloads: 8 This Week
  • 24
    InternLM-XComposer-2.5

    InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System

    InternLM-XComposer is an open-source multimodal AI system designed to generate long-form content that combines text with visual elements such as images and diagrams. The model is built on top of the InternLM language model architecture and extends its capabilities to handle multimodal inputs and outputs. Instead of producing only textual responses, the system can generate visually enriched documents such as illustrated articles, presentations, and educational materials.
    Downloads: 0 This Week
  • 25
    Argilla

    The open-source data curation platform for LLMs

    Argilla is a production-ready framework for building and improving datasets for NLP projects. Deploy your own Argilla Server on Spaces with a few clicks. Use embeddings to find the most similar records with the UI. This feature uses vector search combined with traditional search (keyword and filter based). Argilla is free, open-source, and 100% compatible with major NLP libraries (Hugging Face transformers, spaCy, Stanford Stanza, Flair, etc.). In fact, you can use and combine your preferred...
    Downloads: 0 This Week