Showing 2772 open source projects for "text based"

  • 1
    htmly

    Simple and fast databaseless PHP blogging platform, and Flat-File CMS

    HTMLy is an open source databaseless PHP blogging platform and flat-file CMS that allows you to create a fast, secure, and powerful website or blog in seconds. HTMLy uses a unique algorithm to find or list any content based on date, type, category, tag, or author, and its performance remains fast even with tens of thousands of posts and hundreds of tags. As a flat-file CMS, HTMLy is designed to run smoothly despite using minimal server specs. With 512MB of RAM or even in shared...
    Downloads: 0 This Week
  • 2
    Decap

    A Git-based CMS for Static Site Generators

    Open source content management for your Git workflow. Use Decap CMS with any static site generator for a faster and more flexible web project. Get the speed, security, and scalability of a static site, while still providing a convenient editing interface for content. Content is stored in your Git repository alongside your code for easier versioning, multi-channel publishing, and the option to handle content updates directly in Git. Decap CMS is built as a single-page React app. Create...
    Downloads: 8 This Week
  • 3
    Advanced NLP with spaCy

    Advanced NLP with spaCy: A free online course

    Advanced NLP with spaCy is an open-source educational repository that provides the materials for an interactive course on advanced natural language processing using the spaCy library. The course is designed to teach developers how to build real-world NLP systems by combining rule-based techniques with machine learning models. The repository includes lessons, exercises, and examples that guide learners through tasks such as tokenization, named entity recognition, text classification, and training custom NLP models. It also demonstrates how spaCy pipelines work and how developers can extend them with custom components and training data. ...
    Downloads: 0 This Week
  • 4
    Pixeltable

    Data Infrastructure providing an approach to multimodal AI workloads

    ...Unlike traditional architectures that require multiple tools such as databases, vector stores, and workflow orchestrators, Pixeltable unifies these functions within a table-based abstraction. Developers define data transformations and AI operations using computed columns on tables, allowing pipelines to evolve incrementally as new data or models are added. The framework supports multimodal content including images, video, text, and audio, enabling applications such as retrieval-augmented generation systems, semantic search, and multimedia analytics.
    Downloads: 0 This Week
  • 5
    OmAgent

    Build multimodal language agents for fast prototype and production

    OmAgent is an open-source Python framework designed to simplify the development of multimodal language agents that can reason, plan, and interact with different types of data sources. The framework provides abstractions and infrastructure for building AI agents that operate on text, images, video, and audio while maintaining a relatively simple interface for developers. Instead of forcing developers to implement complex orchestration logic manually, the system manages task scheduling, worker coordination, and node optimization behind the scenes. Its architecture uses a graph-based workflow engine where tasks are represented as nodes in a directed workflow, enabling modular composition of complex reasoning pipelines. ...
    Downloads: 0 This Week
  • 6
    OCRBase

    MD/.JSON Document OCR and structured data extraction API

    OCRBase is a self-hostable document OCR and structured extraction system built to turn PDFs into machine-usable outputs at scale, aiming to bridge the gap between raw text extraction and production-ready pipelines. Instead of treating OCR as a one-off script, it presents an API-driven workflow where documents are submitted as jobs and processed through a queue-based architecture that can handle high throughput. The core output is designed for downstream automation, producing structured results like JSON according to user-defined schemas while also providing readable formats like Markdown for human review or indexing. ...
    Downloads: 0 This Week
  • 7
    Chinese-XLNet

    Chinese XLNet pre-trained model

    ...This model is trained on large-scale Chinese text datasets to learn linguistic patterns, long-range dependencies, and semantic nuance typical of Chinese writing, making it useful for tasks like text classification, question answering, named entity recognition, and language generation. Chinese-XLNet offers an alternative to models like BERT by emphasizing autoregressive and permutation-based learning, which can lead to performance improvements on certain benchmarks and tasks.
    Downloads: 0 This Week
  • 8
    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material. Each extracted entity is precisely grounded in its original context, allowing visual inspection and validation via automatically generated interactive HTML visualizations. ...
    Downloads: 0 This Week
  • 9
    Portkey AI Gateway

    A blazing fast AI Gateway with integrated guardrails

    ...It supports automatic retries, fallbacks, load balancing across providers or keys, and request timeouts to avoid latency spikes. The gateway is multimodal: it can handle text, vision, audio, and image models under a common interface. It also offers features for governance: role-based access, compliance with standards (SOC2, HIPAA, GDPR), secure key management, and logging/analytics of usage, latency, errors, and cost. The system integrates with agent frameworks like LangChain, Autogen, and others, enabling the building of more complex AI applications. ...
    Downloads: 0 This Week
  • 10
    asciinema

    Open source terminal session recorder

    ...Forget old screen recording methods and the blurry videos they produce. asciinema lets you record your terminal sessions the right way, which is right where you work, in the terminal. Recording is as easy as running one command, and since it’s purely text-based you can copy and paste any content you want; simply pause the recording! You can also easily share your recordings on the web, or embed an asciicast player in your blog post, project documentation page, or conference talk slides. See plenty of example sessions recorded with asciinema here: https://asciinema.org/
    Downloads: 0 This Week
  • 11
    Betterfox

    Firefox user.js for optimal privacy and security

    ...Betterfox recommends pairing these settings with essential extensions like ad blockers and DNS-level protections to achieve a well-rounded browsing experience. Because the preferences are text-based and version controlled, users can review and customize them to meet their own balance of privacy and convenience.
    Downloads: 11 This Week
  • 12
    NVIDIA NeMo Framework

    Scalable generative AI framework built for researchers and developers

    NVIDIA NeMo is a scalable, cloud-native generative AI framework aimed at researchers and PyTorch developers working on large language models, multimodal models, and speech AI (ASR and TTS), with growing support for computer vision. It provides collections of domain-specific modules and reference implementations that make it easier to pre-train, fine-tune, and deploy very large models on multi-GPU and multi-node infrastructure. NeMo 2.0 introduces a Python-based configuration system,...
    Downloads: 0 This Week
  • 13
    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac and Linux machines, and on GPU cards with as little as 4 GB of RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike.
    Downloads: 11 This Week
  • 14
    ZetaJS

    JS wrapper for ZetaOffice in the browser

    The zeta.js library provides the facilities to run an instance of ZetaOffice integrated into your web site, allowing you to control it with JavaScript code via the LibreOffice UNO technology. Use cases range from an in-browser office suite that looks and feels just like its desktop counterpart, to fine-tuned custom text editing and spreadsheet capabilities embedded in your website, to a headless zetajs instance that does document conversion in the background.
    Downloads: 2 This Week
  • 15
    windsurf.vim

    Free, ultrafast Copilot alternative for Vim and Neovim

    ...The aim is to provide a “free, ultrafast” alternative to other AI code assistants (such as GitHub Copilot) directly within Vim/Neovim. Once installed and configured, windsurf.vim can suggest code completions, generate multi-line snippets based on comments or invitation in code, and make the editing experience more predictive and context-aware. The plugin supports major programming languages and allows you to trigger completions as you type—especially after comments or partial code constructs. Because it is designed to integrate with Vim’s editing model, it offers suggestions in-line and leverages virtual text or inline indicators when supported. ...
    Downloads: 2 This Week
  • 16
    QMK

    Keyboard firmware for Atmel AVR and ARM controllers

    QMK (Quantum Mechanical Keyboard) is an open source community centered around developing computer input devices. The community encompasses all sorts of input devices, such as keyboards, mice, and MIDI devices. This is a keyboard firmware based on the tmk_keyboard firmware with some useful features for Atmel AVR and ARM controllers, and more specifically, the OLKB product line, the ErgoDox EZ keyboard, and the Clueboard product line. Keyboards powered by QMK are Planck, Preonic, ErgoDox EZ,...
    Downloads: 7 This Week
  • 17
    AI YouTube Shorts Generator

    A Python tool that uses GPT-4, FFmpeg, and OpenCV

    AI-YouTube-Shorts-Generator is a Python-based tool that automates the creation of short-form vertical video clips (“shorts”) from longer source videos — ideal for adapting content for platforms like YouTube Shorts, Instagram Reels, or TikTok. It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies subtitle overlays, producing a polished short video without manual editing. ...
    Downloads: 6 This Week
  • 18
    LongWriter

    Unleashing 10,000+ Word Generation from Long Context LLMs

    ...The system uses an agent-based pipeline called AgentWrite that decomposes large writing tasks into smaller subtasks, allowing the model to produce long documents section by section. Researchers also created the LongWriter-6k dataset containing thousands of examples with outputs ranging from a few thousand to tens of thousands of words.
    Downloads: 1 This Week
  • 19
    Ol.Text

    An implementation of the Rx text transformation scripting language.

    Rx is a simple scripting language, based on regular expressions, designed to transform text. The Ol.Text project is an Rx implementation for the .NET Framework (>= 4.5), .NET Standard (>= 2.0) and .NET (>= 6.0) platforms.
    Downloads: 0 This Week
  • 20
    NeMo Retriever Library

    Document content and metadata extraction microservice

    NeMo Retriever Library is a scalable microservice framework designed for extracting, structuring, and enriching content from documents to support downstream generative AI applications. It processes various document types by splitting them into components such as text, tables, charts, and images, and then applies OCR and contextual analysis to convert them into structured data formats. The system is built on NVIDIA NIM microservices, enabling high-performance parallel processing and efficient...
    Downloads: 0 This Week
  • 21
    Podcastfy.ai

    Transforming Multimodal Content into Captivating Multilingual Audio

    Podcastfy is an open-source Python package that transforms multimodal content (text, images) into engaging, multilingual audio conversations using GenAI. Input content includes websites, PDFs, and YouTube videos, as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multimodal sources, enabling customization and scale.
    Downloads: 0 This Week
  • 22
    Asciidoctor PDF

    Asciidoctor PDF: A native PDF converter for AsciiDoc

    A fast text processor & publishing toolchain for converting AsciiDoc to HTML5, DocBook & more. Asciidoctor is a fast, open source, Ruby-based text processor for parsing AsciiDoc® into a document model and converting it to output formats such as HTML 5, DocBook 5, manual pages, PDF, EPUB 3, and other formats. Asciidoctor also has an ecosystem of extensions, converters, build plugins, and tools to help you author and publish content written in AsciiDoc.
    Downloads: 0 This Week
  • 23
    wacli

    WhatsApp CLI

    wacli is a command-line interface for WhatsApp that focuses on syncing, searching, and sending messages through the WhatsApp Web protocol. It is designed as a third-party CLI built on top of whatsmeow, giving developers and power users a local-first way to work with WhatsApp data outside the standard app interface. The project supports interactive authentication through a QR-based login flow and then transitions into a non-interactive sync mode for ongoing message capture. It stores data...
    Downloads: 3 This Week
  • 24
    Difftastic

    A structural diff that understands syntax

    Difftastic is a structural diff tool written in Rust that parses source files using syntax trees (via tree‑sitter) and produces human‑readable diffs at the expression level. It works across 30+ languages and emphasizes readability by aligning code structure rather than lines. Ideal for code review and understanding semantic changes.
    Downloads: 0 This Week
  • 25
    AnySoftKeyboard

    Android (f/w 2.1+) on screen keyboard for multiple languages

    The only Android keyboard you'll ever need. Free as in speech and Free as in beer. Android (f/w 4.0.3+, API level 15+) on screen keyboard for multiple languages.
    Downloads: 4 This Week