Showing 20 open source projects for "document"

  • 1
    SemTools

    Semantic search and document parsing tools for the command line

    SemTools is an open-source command-line toolkit designed for document parsing, semantic indexing, and semantic search workflows. The project focuses on enabling developers and AI agents to process large document collections and extract meaningful semantic representations that can be searched efficiently. Built with Rust for performance and reliability, the toolchain provides fast processing of text and structured documents while maintaining low system overhead.
    Downloads: 9 This Week
  • 2
    monolith

    CLI tool for saving complete web pages as a single HTML file

    ...You can finally replace that gazillion of open tabs with a gazillion .html files stored somewhere on your precious little drive. Unlike the conventional “Save page as”, monolith not only saves the target document but also embeds CSS, image, and JavaScript assets all at once, producing a single HTML5 document that is a joy to store and share. Compared to saving websites with wget -mpk, this tool embeds all assets as data URLs and therefore lets browsers render the saved page exactly the way it was on the Internet, even when no network connection is available.
    Downloads: 3 This Week
  • 3
    PoloDB

    PoloDB is an embedded document database

    PoloDB is an embedded document-oriented NoSQL database that provides MongoDB-like functionality in a lightweight package, ideal for local storage in applications.
    Downloads: 0 This Week
  • 4
    Bionic GPT

    Bionic is an on-premise replacement for ChatGPT

    ...Beyond chat, Bionic focuses heavily on enterprise RAG by letting users create AI assistants that work with their own documents, share those assistants across teams, and configure embeddings, chunking, and system prompts through the UI. The platform supports a wide variety of document types, includes data isolation features for teams, and layers in security measures such as RBAC, row-level security in Postgres, strong content security policy settings, and minimal container builds.
    Downloads: 1 This Week
  • 5
    graphql_client

    Typed, correct GraphQL requests and responses in Rust

    ...Copies documentation from the GraphQL schema to the generated Rust code. Arbitrary derives on the generated responses. Arbitrary custom scalars. Supports multiple operations per query document. Supports setting GraphQL fields as deprecated and having the Rust compiler check their use. Optional reqwest-based client for boilerplate-free API calls from browsers. Implicit and explicit null support.
    Downloads: 0 This Week
  • 6
    Extractous

    Fast and efficient unstructured data extraction

    ...For broader format support, the system combines its Rust core with ahead-of-time compiled Apache Tika shared libraries, which allows it to extend parsing coverage while still avoiding traditional server-based overhead. It also supports OCR for images and scanned documents through Tesseract, making it useful for document ingestion pipelines that include image-based or scanned inputs.
    Downloads: 0 This Week
  • 7
    SurrealDB

    A scalable, distributed, collaborative, document-graph database

    With an SQL-style query language, real-time queries with highly-efficient related data retrieval, advanced security permissions for multi-tenant access, and support for performant analytical workloads, SurrealDB is the next generation serverless database. SurrealDB is the ultimate cloud database for tomorrow's applications. SurrealDB is an innovative NewSQL cloud database, suitable for serverless applications, jamstack applications, single-page applications, and traditional applications. It...
    Downloads: 0 This Week
  • 8
    Kreuzberg

    Polyglot document intelligence framework

    Kreuzberg is a document intelligence framework for extracting text, metadata, and structured content from documents such as PDFs, office files, and images, with OCR support for scanned inputs. Its Rust core is exposed through bindings for several programming languages, so the same extraction engine can be used from different stacks.
    Downloads: 9 This Week
  • 9
    Sonic

    Fast, lightweight & schema-less search backend

    Sonic is a fast, lightweight, schema-less search backend that can be used in place of heavy, full-featured search backends like Elasticsearch. It can normalize natural-language search queries, auto-complete them, and return the most relevant results. Because it is an identifier index rather than a document index, a query returns IDs that refer to the matched documents in an external database.
    Downloads: 4 This Week
  • 10
    Note67

    A private, local meeting notes assistant

    Note67 is a private, local meeting notes assistant that combines audio capture, transcription, and AI-powered summarization to help users document conversations and meetings on their own devices without relying on cloud services. Built with a cross-platform architecture using Rust (via Tauri) for backend logic and a TypeScript/React frontend, it prioritizes privacy by transcribing audio locally with Whisper models and generating summaries with locally hosted AI, so sensitive meeting content never has to be sent to external servers. ...
    Downloads: 5 This Week
  • 11
    Scanopy

    Clean network diagrams. One-time setup, zero upkeep.

    Scanopy is a powerful multi-modal data capture and analysis toolkit that enables users to collect, process, and visualize structured and unstructured information from a variety of sources in a flexible pipeline. It is built to handle complex scanning tasks — such as OCR, document analysis, audio transcription, network data capture, and image extraction — while providing unified APIs and workflows that make managing heterogeneous data sources seamless. Developers can compose custom pipelines that chain together transforms, filters, and exporters, enabling automation of tedious data preparation steps and accelerating insights with minimal code. ...
    Downloads: 3 This Week
  • 12
    HelixDB

    Graph-vector database for building unified AI backends fast

    ...HelixDB is built from scratch in Rust and uses LMDB as its storage engine, enabling high performance and low-latency query execution. HelixDB also supports additional data formats such as key-value, document, and relational data, making it flexible for a wide range of backend architectures. A central feature of the project is its custom query language, HelixQL, which is fully type-safe and compiled to ensure reliability and correctness in production environments. HelixDB includes built-in capabilities for embeddings, vector search, keyword search, and graph traversal, which are particularly useful for retrieval-augmented generation and agent-based systems.
    Downloads: 0 This Week
  • 13
    Korvus

    Korvus is a search SDK that unifies the entire RAG pipeline

    Korvus is an open-source retrieval-augmented generation (RAG) pipeline designed to run entirely inside PostgreSQL, allowing developers to build AI search and knowledge systems directly within a database environment. The project consolidates the typical steps of a RAG pipeline—including embedding generation, document retrieval, reranking, and text generation—into a single query executed within the Postgres ecosystem. By leveraging PostgresML and vector extensions such as pgvector, Korvus eliminates the need for external microservices typically used for AI search architectures, reducing both system complexity and latency. The architecture enables machine learning operations to occur directly in the database, minimizing data transfer between services and improving overall performance for large datasets.
    Downloads: 0 This Week
  • 14
    MongoDB Rust Driver

    The official MongoDB Rust Driver

    ...Because it’s asynchronous by design, it works well with Rust async runtimes like Tokio and async-std, letting developers build highly concurrent networked services that efficiently use modern multicore hardware. The crate also includes BSON encoding and decoding that maps cleanly to Rust types, so developers can work with rich document structures while retaining Rust’s performance guarantees.
    Downloads: 0 This Week
  • 15
    gptcommit

    A git prepare-commit-msg hook for authoring commit messages with GPT-3

    ...Commit messages are a key channel for developers to communicate their work with others, especially in code reviews. When making complex code changes, it can be tedious to thoroughly document the contents of each change. I often felt the impulse to just title my commit “fix bug” and move on. Surfacing these changes with gptcommit helps the author and reviewer by bringing attention to these additional changes.
    Downloads: 0 This Week
  • 16
    sentinel

    Sentinel is a filesystem-backed document DBMS written in Rust.

    Sentinel is a filesystem-backed document DBMS built in Rust that prioritizes compliance, transparency, and auditability over raw performance. Unlike traditional databases, every document is a plain JSON file, making your data immediately forensic-friendly and Git-versionable. Perfect for regulated industries requiring GDPR, SOC 2, HIPAA, or PCI-DSS compliance. Sentinel provides async operations with automatic BLAKE3 hashing and optional Ed25519 signatures for cryptographic integrity. ...
    Downloads: 9 This Week
  • 17
    AppFlowy

    Bring projects, wikis, and teams together with AI.

    ...AppFlowy comes with a beautiful rich-text editor that goes beyond just text and bullet points, offering 20+ content types, easy-to-use customized themes, keyboard shortcuts, and color options. It supports real-time team collaboration, enabling you to work with your friends and teammates on the same document at once, similar to Google Docs. AppFlowy is powered by AppFlowy AI, which is accessible, collaborative, and contextual. Supercharge any type of work in a collaborative team workspace.
    Downloads: 63 This Week
  • 18
    authoscope

    Scriptable network authentication cracker (formerly `badtouch`)

    authoscope is a scriptable network authentication cracker. While the space for common service bruteforce is already very well saturated, you may still end up writing your own Python scripts when testing credentials for web applications. The scope of authoscope is specifically cracking custom services. This is done by writing scripts that are loaded into a Lua runtime. Each script represents a single service and provides a verify(user, password) function that returns either true or false....
    Downloads: 1 This Week
  • 19
    printpdf

    Rust / WASM library for reading, writing and rendering PDF

    ...It includes advanced typography capabilities such as character spacing, scaling, superscript, and subscript, as well as support for Unicode text. printpdf also offers optimization features like font subsetting to reduce file size, making generated PDFs more efficient. Experimental capabilities include rendering PDF pages to SVG and extracting text content, expanding its use cases beyond simple document generation.
    Downloads: 0 This Week
  • 20
    Mooneye GB

    A Game Boy research project and emulator written in Rust

    ...Some existing emulators are very accurate (Gambatte, BGB >= 1.5) but are not documented very clearly, so they are not very good references for emulator developers. I want this project to document as clearly as possible why certain behavior is emulated in a certain way. This also means writing a lot of test ROMs to figure out corner cases and precise behavior on real hardware. The emulator is lagging behind hardware research. I don't want to spend time making short-lived and probably incorrect fixes to the emulator if I'm not sure about the hardware behavior. ...
    Downloads: 0 This Week