Showing 4524 open source projects for "text based"

  • 1
    Browserless

    The headless Chrome/Chromium driver on top of Puppeteer

    Browserless is an open-source headless browser automation library and service built on top of Puppeteer that simplifies the process of running and scaling Chromium-based browser tasks in production environments. It provides a high-level API for interacting with headless Chrome, allowing developers to perform operations such as generating PDFs, capturing screenshots, extracting text or HTML, and automating web navigation. The project is designed to act as a production-ready abstraction layer over Puppeteer, offering improved reliability, error handling, and scalability for real-world applications. ...
    Downloads: 1 This Week
  • 2
    MLX Engine

    LM Studio Apple MLX engine

    MLX Engine is the Apple MLX-based inference backend used by LM Studio to run large language models efficiently on Apple Silicon hardware. Built on top of the mlx-lm and mlx-vlm ecosystems, the engine provides a unified architecture capable of supporting both text-only and multimodal models. Its design focuses on high-performance on-device inference, leveraging Apple’s MLX stack to accelerate computation on M-series chips.
    Downloads: 1 This Week
  • 3
    Kaleidoscope-SDK

    User toolkit for analyzing and interfacing with Large Language Models

    kaleidoscope-sdk is a Python module used to interact with large language models hosted via the Kaleidoscope service available at: https://github.com/VectorInstitute/kaleidoscope. It provides a simple interface for launching LLMs on an HPC cluster and asking them to perform basic tasks such as text generation, as well as for retrieving intermediate information from inside the model, such as log probabilities and activations. Users must authenticate using their Vector Institute cluster credentials. This can...
    Downloads: 1 This Week
  • 4
    WriteFreely

    A clean, Markdown-based publishing platform made for writers

    An open source platform for building a writing space on the web. Our fast, auto-saving editor is all you need to quickly get your thoughts down and published to your blog. WriteFreely sets your ideas and your server's resources free. Just run the binary to start your site up. Host your own community of writers. Interact with the decentralized social web via ActivityPub. WriteFreely has spent the past six years reliably powering more than 150,000 blogs on Write.as. WriteFreely is built around...
    Downloads: 1 This Week
  • 5
    Operit AI

    Powerful Android AI agent with tools, automation, and Linux shell

    Operit is a full-featured AI assistant and agent platform designed specifically for Android devices, aiming to go far beyond traditional chat-based interfaces. It integrates deep system-level capabilities with a wide range of tools, allowing the AI to perform real tasks such as file management, automation, and system control directly on the device. A standout aspect of the project is its built-in Ubuntu 24 environment, which enables users to run Linux commands, scripts, and development tools...
    Downloads: 24 This Week
  • 6
    QMK

    Keyboard firmware for Atmel AVR and ARM controllers

    QMK (Quantum Mechanical Keyboard) is an open source community centered around developing computer input devices. The community encompasses all sorts of input devices, such as keyboards, mice, and MIDI devices. This is a keyboard firmware based on the tmk_keyboard firmware with some useful features for Atmel AVR and ARM controllers, and more specifically, the OLKB product line, the ErgoDox EZ keyboard, and the Clueboard product line. Keyboards powered by QMK include Planck, Preonic, ErgoDox EZ,...
    Downloads: 9 This Week
  • 7
    DragSelect

    An easy JavaScript library for selecting and moving elements

    DragSelect is a JavaScript library that enables users to create a selection box for selecting multiple elements on a webpage. It is lightweight, highly customizable, and can be easily integrated into any front-end project requiring drag-based selection.
    Downloads: 0 This Week
  • 8
    s-tui

    Terminal-based CPU stress and monitoring utility

    s-tui (Stress Terminal UI) is a terminal-based performance monitoring and stress-testing tool focused specifically on CPU behavior analysis in Linux and other UNIX-like systems. It provides real-time graphical visualization of CPU temperature, frequency, power consumption, and utilization directly within a text-based interface, eliminating the need for a graphical desktop environment.
    Downloads: 2 This Week
  • 9
    Step-Audio-EditX

    LLM-based Reinforcement Learning audio edit model

    Step-Audio-EditX is an open-source, 3 billion-parameter audio model from StepFun AI designed to make expressive and precise editing of speech and audio as easy as text editing. Rather than treating audio editing as low-level waveform manipulation, this model converts speech into a sequence of discrete “audio tokens” (via a dual-codebook tokenizer) — combining a linguistic token stream and a semantic (prosody/emotion/style) token stream — thereby abstracting audio editing into high-level...
    Downloads: 0 This Week
  • 10
    Ol.Text

    This is an implementation of Rx text transformation script language.

    Rx is a simple scripting language based on regular expressions designed to transform text information. The Ol.Text project is a Rx implementation for .NET Framework (>= 4.5), .NET Standard (>= 2.0) and .NET (>= 6.0) platforms.
    Downloads: 0 This Week
  • 11
    Apache OpenNLP

    Apache OpenNLP is a machine learning-based NLP library that provides tools for text-processing tasks such as tokenization, sentence segmentation, and named entity recognition.
    Downloads: 1 This Week
  • 12
    Advanced NLP with spaCy

    Advanced NLP with spaCy: A free online course

    Advanced NLP with spaCy is an open-source educational repository that provides the materials for an interactive course on advanced natural language processing using the spaCy library. The course is designed to teach developers how to build real-world NLP systems by combining rule-based techniques with machine learning models. The repository includes lessons, exercises, and examples that guide learners through tasks such as tokenization, named entity recognition, text classification, and training custom NLP models. It also demonstrates how spaCy pipelines work and how developers can extend them with custom components and training data. ...
    Downloads: 0 This Week
  • 13
    Databend

    Cloud-native open source data warehouse for analytics and AI queries

    ...This architecture enables cost-efficient storage and elastic scaling for workloads that involve large datasets and complex queries. Databend provides a unified engine capable of handling analytics, vector search, and full-text search within a single platform. Databend supports SQL-based workflows and enables real-time data ingestion, transformation, and analysis through streaming and task orchestration features. With its cloud-native design and distributed architecture, Databend can run either as a self-hosted system or within managed environments to power data analytics, AI workloads, and large-scale data.
    Downloads: 0 This Week
  • 14
    Semantra

    Multi-tool for semantic search

    Semantra is an open-source semantic search tool designed to help users explore large collections of documents by meaning rather than simple keyword matching. The software analyzes text and PDF documents stored locally and creates embeddings that allow queries to retrieve results based on conceptual similarity. It is primarily intended for individuals who need to extract insights from large document collections, including researchers, journalists, students, and historians. The system runs from the command line and automatically launches a local web interface where users can perform interactive searches and examine document passages related to a query. ...
    Downloads: 0 This Week
  • 15
    Pixeltable

    Data Infrastructure providing an approach to multimodal AI workloads

    ...Unlike traditional architectures that require multiple tools such as databases, vector stores, and workflow orchestrators, Pixeltable unifies these functions within a table-based abstraction. Developers define data transformations and AI operations using computed columns on tables, allowing pipelines to evolve incrementally as new data or models are added. The framework supports multimodal content including images, video, text, and audio, enabling applications such as retrieval-augmented generation systems, semantic search, and multimedia analytics.
    Downloads: 0 This Week
  • 16
    OmAgent

    Build multimodal language agents for fast prototype and production

    OmAgent is an open-source Python framework designed to simplify the development of multimodal language agents that can reason, plan, and interact with different types of data sources. The framework provides abstractions and infrastructure for building AI agents that operate on text, images, video, and audio while maintaining a relatively simple interface for developers. Instead of forcing developers to implement complex orchestration logic manually, the system manages task scheduling, worker coordination, and node optimization behind the scenes. Its architecture uses a graph-based workflow engine where tasks are represented as nodes in a directed workflow, enabling modular composition of complex reasoning pipelines. ...
    Downloads: 0 This Week
  • 17
    OCRBase

    MD/.JSON Document OCR and structured data extraction API

    OCRBase is a self-hostable document OCR and structured extraction system built to turn PDFs into machine-usable outputs at scale, aiming to bridge the gap between raw text extraction and production-ready pipelines. Instead of treating OCR as a one-off script, it presents an API-driven workflow where documents are submitted as jobs and processed through a queue-based architecture that can handle high throughput. The core output is designed for downstream automation, producing structured results like JSON according to user-defined schemas while also providing readable formats like Markdown for human review or indexing. ...
    Downloads: 0 This Week
  • 18
    Chinese-XLNet

    Chinese XLNet pre-trained model

    ...This model is trained on large-scale Chinese text datasets to learn linguistic patterns, long-range dependencies, and semantic nuance typical of Chinese writing, making it useful for tasks like text classification, question answering, named entity recognition, and language generation. Chinese-XLNet offers an alternative to models like BERT by emphasizing autoregressive and permutation-based learning, which can lead to performance improvements on certain benchmarks and tasks.
    Downloads: 0 This Week
  • 19
    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material. Each extracted entity is precisely grounded in its original context, allowing visual inspection and validation via automatically generated interactive HTML visualizations. ...
    Downloads: 0 This Week
  • 20
    Portkey AI Gateway

    A blazing fast AI Gateway with integrated guardrails

    ...It supports automatic retries, fallbacks, load balancing across providers or keys, and request timeouts to avoid latency spikes. The gateway is multimodal: it can handle text, vision, audio, and image models under a common interface. It also offers features for governance: role-based access, compliance with standards (SOC2, HIPAA, GDPR), secure key management, and logging/analytics of usage, latency, errors, and cost. The system integrates with agent frameworks like LangChain, Autogen, and others, enabling the building of more complex AI applications. ...
    Downloads: 0 This Week
  • 21
    htmly

    Simple and fast databaseless PHP blogging platform, and Flat-File CMS

    HTMLy is an open source databaseless PHP blogging platform and flat-file CMS that allows you to create a fast, secure, and powerful website or blog in seconds. HTMLy uses a unique algorithm to find or list any content based on date, type, category, tag, or author, and its performance remains fast even with tens of thousands of posts and hundreds of tags. As a flat-file CMS, HTMLy is designed to run smoothly despite using minimal server specs. With 512MB of RAM or even in shared...
    Downloads: 0 This Week
  • 22
    asciinema

    Open source terminal session recorder

    ...Forget old screen recording methods and the blurry videos they produce. asciinema lets you record your terminal sessions the right way, which is right where you work, in the terminal. Recording is as easy as running one command, and since it’s purely text-based, you can copy and paste any content you want: simply pause the recording! You can also easily share your recordings on the web, or embed an asciicast player in your blog post, project documentation page, or conference talk slides. See plenty of example sessions recorded with asciinema here: https://asciinema.org/
    Downloads: 0 This Week
  • 23
    wacli

    WhatsApp CLI

    wacli is a command-line interface for WhatsApp that focuses on syncing, searching, and sending messages through the WhatsApp Web protocol. It is designed as a third-party CLI built on top of whatsmeow, giving developers and power users a local-first way to work with WhatsApp data outside the standard app interface. The project supports interactive authentication through a QR-based login flow and then transitions into a non-interactive sync mode for ongoing message capture. It stores data...
    Downloads: 4 This Week
  • 24
    arduinoWebSockets

    A WebSocket Server and Client for Arduino based on RFC6455.
    Downloads: 0 This Week
  • 25
    Evennia

    Python MUD/MUX/MUSH/MU* development system

    Evennia is a mature, open-source framework written in Python — specifically designed to build text-based, online multiplayer games such as MUDs, MUCKs, MUSHes, MUXes, and other “MU-style” virtual worlds. Rather than prescribing a rigid game structure, Evennia gives you a bare-bones but powerful foundation: default systems handle networking, database/storage, server management, user accounts, characters, rooms, items, chat channels, and basic commands — but you define the gameplay rules, content, and game logic yourself in pure Python modules. ...
    Downloads: 0 This Week