Showing 2374 open source projects for "open document"

  • 1
    Cherry Studio

    Cherry Studio is a desktop client that supports multiple LLMs

    Cherry Studio is a cross-platform desktop client that integrates multiple large language model providers into a unified interface for creating and using AI assistants, supporting customization and multi-model conversations. Features include a Selection Assistant with smart content selection enhancement, Deep Research with advanced research capabilities, a Memory System with global context awareness, Document Preprocessing with improved document handling, and an MCP Marketplace for the Model Context Protocol ecosystem.
    Downloads: 42 This Week
  • 2
    Collabora Online

    Collabora Online is a collaborative online office suite

    Collabora Online is a powerful online office suite that you can integrate into your own infrastructure or access via one of our trusted hosting Partners. Your digital sovereignty is our priority. We provide you with all the tools to keep your data secure, without compromising on features. Collabora Online’s text document editor provides a true WYSIWYG editing experience, making visualizing your document layout incredibly easy. Open any document, add comments and track changes from anywhere, with anyone. Format and style your pages with endless options. From simple spreadsheets and calculations to advanced formulas, Calc can do it all. Create giant spreadsheets with up to 16k columns, and add charts, sparklines, and hyperlinks. ...
    Downloads: 4 This Week
  • 3
    Kivik

    Common interface to CouchDB or CouchDB-like databases for Go

    Kivik is a Go client library for interacting with CouchDB and PouchDB databases, providing an abstraction layer for NoSQL document storage and retrieval. It simplifies database operations for Go developers.
    Downloads: 3 This Week
  • 4
    PDFIO.jl

    PDF Reader Library for Native Julia.

    PDFIO is a native Julia implementation for reading PDF files. It's a 100% Julia implementation of the PDF specification. Other than a few well-established algorithms like flate decode (zlib library) or cryptographic operations (OpenSSL library), almost all of the APIs are written in native Julia. PDF files have been in existence for over three decades. PDF writer implementations do not always follow the specification, and they may vary significantly from vendor to vendor. Every time, you...
    Downloads: 5 This Week
  • 5
    OpenDataLoader PDF

    PDF Parser for AI-ready data. Automate PDF accessibility

    OpenDataLoader PDF is an open-source document processing system designed to convert complex PDF files into structured, AI-ready formats such as Markdown, JSON, and HTML while preserving layout, hierarchy, and semantic meaning. It focuses on enabling downstream use cases like retrieval-augmented generation (RAG), knowledge extraction, and document intelligence pipelines by maintaining accurate reading order and spatial metadata through bounding boxes.
    Downloads: 10 This Week
  • 6
    RavenDB

    ACID Document Database

    A NoSQL document database designed for high-performance, real-time applications with built-in distributed capabilities.
    Downloads: 1 This Week
  • 7
    monolith

    CLI tool for saving complete web pages as a single HTML file

    A data hoarder’s dream come true, bundle any web page into a single HTML file. You can finally replace that gazillion of open tabs with a gazillion of .html files stored somewhere on your precious little drive. Unlike the conventional “Save page as”, monolith not only saves the target document, it embeds CSS, image, and JavaScript assets all at once, producing a single HTML5 document that is a joy to store and share. If compared to saving websites with wget -mpk, this tool embeds all assets as data URLs and therefore lets browsers render the saved page exactly the way it was on the Internet, even when no network connection is available.
    Downloads: 4 This Week
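
The data-URL embedding described in this entry can be illustrated with a short Python sketch. This is not monolith's own code: the function names `to_data_url` and `inline_image`, the sample HTML, and the placeholder bytes are all hypothetical, and only the standard library is used.

```python
import base64


def to_data_url(data: bytes, mime_type: str) -> str:
    """Encode raw asset bytes as a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_image(html: str, src: str, data: bytes, mime_type: str) -> str:
    """Replace one src="..." reference in the HTML with a data URL."""
    return html.replace(f'src="{src}"', f'src="{to_data_url(data, mime_type)}"')


page = '<img src="logo.png">'
asset = b"hello"  # placeholder bytes standing in for a real image file
print(inline_image(page, "logo.png", asset, "image/png"))
```

Because the asset travels inside the HTML itself, a browser can render the page without fetching anything else, which is the offline behaviour the description refers to.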
  • 8
    dots.ocr

    Multilingual Document Layout Parsing in a Single Vision-Language Model

    dots.ocr is a cutting-edge multilingual document parsing system built on a unified vision-language model that combines layout detection, text recognition, and structural understanding into a single architecture. Unlike traditional OCR pipelines that rely on multiple specialized components, dots.ocr integrates these processes end-to-end, reducing error propagation and improving consistency across tasks. The model is designed to recognize virtually any human script, making it highly effective...
    Downloads: 0 This Week
  • 9
    TinyDB

    Document oriented database optimized for you

    TinyDB is a lightweight document oriented database optimized for your happiness :) It's written in pure Python and has no external dependencies. The target is small apps that would be blown away by a SQL-DB or an external database server. The current source code has 1800 lines of code (with about 40% documentation) and 1600 lines of tests. Like MongoDB, you can store any document (represented as dict) in TinyDB. TinyDB is designed to be simple and fun to use by providing a simple and clean...
    Downloads: 4 This Week
  • 10
    Cloud Firestore

    Node.js client for Google Cloud Firestore: a NoSQL document database

    The official Firestore client for Node.js, enabling seamless interaction with Google Cloud Firestore, a NoSQL document database optimized for real-time applications.
    Downloads: 3 This Week
  • 11
    WordPerfect Document importer
    Library for reading Corel WordPerfect(tm) documents.
    Downloads: 373 This Week
  • 12
    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 29 This Week
  • 13
    Elasticsearch MCP Server

    A Model Context Protocol (MCP) server implementation

    This MCP server implementation provides interaction capabilities with Elasticsearch and OpenSearch, enabling functionalities such as document searching, index analysis, and cluster management through a set of tools.
    Downloads: 3 This Week
  • 14
    Nitrite Database

    NoSQL embedded document store for Java

    Nitrite is an embedded NoSQL database for Java applications, offering lightweight document storage with indexing and query capabilities.
    Downloads: 1 This Week
  • 15
    AnythingLLM

    The all-in-one Desktop & Docker AI application with full RAG and AI

    A full-stack application that lets you turn any document, resource, or piece of content into context that any LLM can use as a reference while chatting. You can choose which LLM or vector database to use, and the application supports multi-user management and permissions. With AnythingLLM you can combine commercial off-the-shelf LLMs or popular open-source LLMs with vector database solutions to build a private ChatGPT with no compromises, run it locally or host it remotely, and chat intelligently with any documents you provide. ...
    Downloads: 107 This Week
  • 16
    MarkPDFDown

    A high-quality PDF to Markdown tool based on large language models

    MarkPDFDown is an open-source document processing tool designed to convert PDF files into structured Markdown output that can be easily used for documentation, content pipelines, and AI processing workflows. The project focuses on extracting text, formatting, and structural information from complex PDF documents and transforming that information into clean Markdown that preserves the original hierarchy of headings, paragraphs, tables, and lists.
    Downloads: 4 This Week
  • 17
    JSONView

    A web extension that helps you view JSON documents in the browser

    A web extension that helps you view JSON documents in the browser. Normally when encountering a JSON document (content type application/json), Firefox simply prompts you to download the view. With the JSONView extension, JSON documents are shown in the browser similar to how XML documents are shown. The document is formatted, highlighted, and arrays and objects can be collapsed. Even if the JSON document contains errors, JSONView will still show the raw text. JSONView is a Web extension...
    Downloads: 9 This Week
  • 18
    Frappe

    Low code web framework for real world applications

    Frappe is a full-stack, low-code web framework written in Python and JavaScript, used to build scalable and modular enterprise applications. It powers ERPNext and includes tools for REST APIs, user management, document modeling, workflows, and real-time updates. Frappe uses a "model-view-controller" approach with its own ORM and frontend system, enabling rapid development without sacrificing control or performance.
    Downloads: 7 This Week
  • 19
    PHP7

    PHP7 / Laravel Multi-format Streaming Parser

    When it comes to parsing XML/CSV/JSON/... documents, there are two approaches to consider. DOM loading loads the whole document into memory, making it easy to navigate and parse, and as such provides maximum flexibility for developers. Streaming iterates through the document like a cursor, stopping at each element on its way, and thus avoids excessive memory use. With big files, callbacks are therefore executed while the file is still downloading, which is much more efficient as far as...
    Downloads: 16 This Week
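
The contrast between DOM loading and streaming can be sketched in a few lines. The example below uses Python's standard `xml.etree.ElementTree` module rather than this PHP library, whose API is not shown in the listing; the sample XML and the function names `dom_texts` and `stream_texts` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

XML = b"<items><item>a</item><item>b</item><item>c</item></items>"


def dom_texts(data: bytes) -> list[str]:
    """DOM style: parse the whole document into a tree, then navigate it."""
    root = ET.fromstring(data)
    return [item.text for item in root.findall("item")]


def stream_texts(source) -> list[str]:
    """Streaming style: handle each element as soon as its end tag is read."""
    texts = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            texts.append(elem.text)
            elem.clear()  # release the element's content once it is handled
    return texts


print(dom_texts(XML))
print(stream_texts(io.BytesIO(XML)))
```

Both functions return the same values here. The difference is that the streaming version never needs the full tree at once, so it can start work before the end of a large input has arrived.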
  • 20
    iText

    iText for Java represents the next level of SDKs for developers

    iText for Java represents the next level of SDKs for developers who want to take advantage of the benefits PDF can bring. Equipped with a better document engine, high and low-level programming capabilities and the ability to create, edit, and enhance PDF documents, iText can be a boon to nearly every workflow. iText Suite refers to the complete line of products comprising the open-source iText Core PDF library and its add-ons. The iText Suite is a fully-featured SDK for PDF development that allows you to seamlessly embed extensive PDF functionality into your software or workflows. ...
    Downloads: 24 This Week
  • 21
    NeMo Retriever Library

    Document content and metadata extraction microservice

    NeMo Retriever Library is a scalable microservice framework designed for extracting, structuring, and enriching content from documents to support downstream generative AI applications. It processes various document types by splitting them into components such as text, tables, charts, and images, and then applies OCR and contextual analysis to convert them into structured data formats. The system is built on NVIDIA NIM microservices, enabling high-performance parallel processing and efficient...
    Downloads: 0 This Week
  • 22
    OnlyOffice Web

    Perform common file preview and editing via the web

    OnlyOffice Web is a browser-based document editing platform built on top of OnlyOffice that allows users to view and edit files entirely on the client side without requiring a backend server. It is designed with a privacy-first approach, ensuring that all document processing occurs locally in the browser, which prevents sensitive data from being uploaded or stored externally. The application supports a wide range of file formats, including DOCX, XLSX, PPTX, and CSV, making it versatile for...
    Downloads: 5 This Week
  • 23
    AI-Media2Doc

    AI tool converting video/audio into structured documents instantly

    AI-Media2Doc is a web-based application that uses large language models to convert video and audio content into structured, readable documents in a single workflow. It is designed to transform multimedia inputs into formats such as knowledge notes, summaries, mind maps, and social-style articles, making content easier to review and reuse. AI-Media2Doc emphasizes privacy by processing media locally in the browser using WebAssembly-based ffmpeg, ensuring that original video files are not...
    Downloads: 6 This Week
  • 24
    DocETL

    A system for agentic LLM-powered data processing and ETL

    DocETL is an open-source system designed to build and execute data processing pipelines powered by large language models, particularly for analyzing complex collections of documents and unstructured datasets. The platform allows developers and researchers to construct structured workflows that extract, transform, and organize information from sources such as reports, transcripts, legal documents, and other text-heavy data.
    Downloads: 5 This Week
  • 25
    llmware

    Unified framework for building enterprise RAG pipelines

    llmware is an open source framework designed to simplify the creation of enterprise-grade applications powered by large language models. The platform focuses on building secure and private AI workflows that can run locally on laptops, edge devices, or self-hosted servers without relying exclusively on cloud APIs. It provides a unified interface for constructing retrieval-augmented generation pipelines, agent workflows, and document intelligence applications.
    Downloads: 3 This Week