Showing 20 open source projects for "documents"

  • 1
    RedisJSON

    RedisJSON - a JSON data type for Redis

    RedisJSON is a Redis module that implements ECMA-404, the JSON Data Interchange Standard, as a native data type. It allows storing, updating, and fetching JSON values from Redis keys (documents).
    Downloads: 3 This Week
  • 2
    Taplo

    A TOML toolkit written in Rust

    A versatile, feature-rich TOML toolkit. This is the repository for Taplo, a TOML v1.0.0 toolkit; more details are on the website. It validates TOML documents syntactically or against JSON schemas, provides a formatter with fine-grained options, and includes an embeddable language server with features based on JSON schemas. It is available wherever Rust compiles. The Taplo CLI aims to be a one-stop shop for working with TOML files from the command line; its features include validating, formatting, and querying TOML documents in a jq-like fashion.
    Downloads: 0 This Week
  • 3
    Extractous

    Fast and efficient unstructured data extraction

    ...For broader format support, the system combines its Rust core with ahead-of-time compiled Apache Tika shared libraries, which allows it to extend parsing coverage while still avoiding traditional server-based overhead. It also supports OCR for images and scanned documents through Tesseract, making it useful for document ingestion pipelines that include image-based or scanned inputs.
    Downloads: 0 This Week
  • 4
    Typst

    A new markup-based typesetting system that is powerful and easy

    ...Typst supercharges templates: they react to your content and format everything instantly while you type. Select from a wide range of community templates or create your own. Store shared documents in team workspaces to get everyone in your working group on the same page. Whether in the classroom, the faculty office, or at home, Typst runs in your browser, so everyone on the team can just start writing.
    Downloads: 21 This Week
  • 5
    Atuin

    Magical shell history

    Atuin is a modern shell history replacement tool and CLI utility that records all your shell commands in a SQLite database alongside contextual metadata. It offers encrypted sync across devices, full-text search, usage statistics, and a desktop application to run executable runbooks as native documents.
    Downloads: 6 This Week
  • 6
    rga

    rga: ripgrep, but also search in PDFs, E-Books, Office documents, etc.

    rga is a line-oriented search tool that allows you to look for a regex in a multitude of file types. rga wraps the awesome ripgrep and enables it to search in PDF, docx, sqlite, JPG, movie subtitles (mkv, mp4), etc.
    Downloads: 0 This Week
  • 7
    SemTools

    Semantic search and document parsing tools for the command line

    ...The project focuses on enabling developers and AI agents to process large document collections and extract meaningful semantic representations that can be searched efficiently. Built with Rust for performance and reliability, the toolchain provides fast processing of text and structured documents while maintaining low system overhead. SemTools can parse documents, build semantic embeddings, and perform similarity searches across datasets, making it useful for research, knowledge management, and AI-assisted coding workflows. The toolkit is designed to work well with modern AI pipelines, particularly those involving large language models that require structured knowledge retrieval.
    Downloads: 0 This Week
  • 8
    Ferrite

    A fast, lightweight text editor for Markdown, JSON, YAML, and TOML

    ...The editor is designed around responsiveness and low overhead, prioritizing quick startup, smooth scrolling, and predictable editing even when you are jumping between many small files. It also aims to reduce friction when reading and tweaking structured documents by offering format-aware conveniences and a UI that stays out of the way. Ferrite positions itself as a pragmatic daily driver for notes, documentation, and configuration edits, especially when you do not need a full language server stack.
    Downloads: 6 This Week
  • 9
    Sonic

    Fast, lightweight & schema-less search backend

    ...It is able to normalize language search queries, auto-complete search queries and offer the most relevant results. Being an identifier index rather than a document index, when queried it provides IDs that can be used to refer to matched documents in an external database.
    Downloads: 1 This Week
  • 10
    SurrealDB

    A scalable, distributed, collaborative, document-graph database

    With an SQL-style query language, real-time queries with highly efficient related data retrieval, advanced security permissions for multi-tenant access, and support for performant analytical workloads, SurrealDB is the next generation serverless database. SurrealDB is the ultimate cloud database for tomorrow's applications. SurrealDB is an innovative NewSQL cloud database, suitable for serverless applications, jamstack applications, single-page applications, and traditional applications. It...
    Downloads: 3 This Week
  • 11
    CocoIndex

    ETL framework to index data for AI, such as RAG

    CocoIndex is an open-source framework designed for building powerful, local-first semantic search systems. It lets users index and retrieve content based on meaning rather than keywords, making it ideal for modern AI-based search applications. CocoIndex leverages vector embeddings and integrates with various models and frameworks, including OpenAI and Hugging Face, to provide high-quality semantic understanding. It’s built for transparency, ease of use, and local control over your search...
    Downloads: 2 This Week
  • 12
    LOL HTML

    Low output latency streaming HTML parser/rewriter with CSS API

    Low Output Latency streaming HTML rewriter/parser with CSS-selector based API. It is designed to modify HTML on the fly with minimal buffering. It can quickly handle very large documents, and operate in environments with limited memory resources. The crate serves as a back-end for the HTML rewriting functionality of Cloudflare Workers, but can be used as a standalone library with a convenient API for a wide variety of HTML rewriting/analysis tasks. The parser switches back to the tag scanner as soon as input leaves the scope of all selector matches. ...
    Downloads: 3 This Week
  • 13
    Bionic GPT

    Bionic is an on-premise replacement for ChatGPT

    ...The interface is intentionally familiar, offering a ChatGPT-like experience with customizable branding, fast Rust-based performance, and conversation history management. Beyond chat, Bionic focuses heavily on enterprise RAG by letting users create AI assistants that work with their own documents, share those assistants across teams, and configure embeddings, chunking, and system prompts through the UI. The platform supports a wide variety of document types, includes data isolation features for teams, and layers in security measures such as RBAC, row-level security in Postgres, strong content security policy settings, and minimal container builds.
    Downloads: 0 This Week
  • 14
    yek

    Serialize repositories into LLM-ready context w/ smart prioritization

    ...It can stream output when piped or save results to a temporary file, depending on usage. Configuration is handled through a yek.yaml file, allowing users to define ignore rules and priority settings. By consolidating code and documents into a single, ordered format, Yek simplifies preparing repositories for AI-driven analysis, debugging, or automation tasks.
    Downloads: 0 This Week
  • 15
    Spider

    High-performance Rust web crawler and scraper for large-scale data

    ...It focuses on speed, concurrency, and reliability by using asynchronous and multi-threaded processing to handle large volumes of web pages. It can rapidly crawl websites to collect links, retrieve page content, and extract structured information from HTML documents. Spider can operate concurrently across many pages, allowing it to gather large datasets in a short period of time. Spider also provides mechanisms for subscribing to crawl events so developers can process page data such as URLs, status codes, or HTML content as it is discovered. It supports advanced capabilities such as headless browser rendering, background crawling tasks, and configurable rules that control crawl depth or ignored paths. ...
    Downloads: 0 This Week
  • 16
    rucola

    Terminal-based markdown note manager

    Terminal-based markdown note manager to view statistics, explore connections and launch editing and viewing applications.
    Downloads: 1 This Week
  • 17
    Privaxy

    Privaxy is the next generation tracker and advertisement blocker

    ...By establishing a two-way tunnel between both ends, Privaxy is able to block network requests based on URL patterns and to inject scripts as well as styles into HTML documents. Operating at a lower level, Privaxy is both more efficient and more streamlined than browser add-on-based blockers. A single instance of Privaxy on a small virtual machine, a server, or even the same computer the traffic originates from can filter thousands of requests per second while requiring very little memory.
    Downloads: 2 This Week
  • 18
    printpdf

    Rust / WASM library for reading, writing and rendering PDF

    printpdf is a Rust library for creating, reading, writing, and rendering PDF documents, providing developers with fine-grained control over document generation and layout. It supports a wide range of PDF features, including pages, layers, annotations, vector graphics, images, and embedded fonts, allowing the creation of complex and professional documents. The library emphasizes manual positioning of elements, giving developers precise control over layout and rendering rather than relying on high-level abstractions. ...
    Downloads: 0 This Week
  • 19
    Alabaster Theme

    A light theme for Visual Studio Code

    ...Standard language keywords are deliberately left uncolored under the philosophy that they are obvious and draw unnecessary attention. This restraint produces a clean, low-noise editor surface that emphasizes what changes most during editing: names, literals, and commentary. The repository documents the rationale in detail and has inspired ports to other editors, showing its appeal to developers who prefer quiet aesthetics. Community discussions in the issue tracker revolve around targeted tweaks rather than expanding the palette, consistent with the theme’s minimalist vision.
    Downloads: 1 This Week
  • 20
    json-rust

    JSON implementation in Rust

    Parse and serialize JSON with ease. JSON is a very loose format where anything goes: arrays can hold mixed types, and object values can change type between API calls or be missing entirely under some conditions. Mapping that to idiomatic Rust structs introduces friction.
    Downloads: 0 This Week