Showing 238 open source projects for "open document"

  • 1
    Deepnote

    Deepnote is a drop-in replacement for Jupyter

    ...The system supports programming languages such as Python, R, and SQL and allows users to execute and analyze data directly within interactive notebooks. Deepnote emphasizes team-based data science by enabling real-time collaboration similar to shared document editors, allowing multiple users to work simultaneously on the same notebook environment.
    Downloads: 0 This Week
  • 2
    Pathway AI Pipelines

    Ready-to-run cloud templates for RAG

    Pathway AI Pipelines is a collection of ready-to-deploy AI pipeline templates designed to help developers rapidly build production-grade retrieval-augmented generation and enterprise search applications. The project provides end-to-end examples that connect live data sources to LLM workflows, enabling applications to stay synchronized with continuously changing information. It supports numerous connectors including local files, Google Drive, SharePoint, Kafka, PostgreSQL, and real-time APIs,...
    Downloads: 0 This Week
  • 3
    NeMo Curator

    Scalable data preprocessing and curation toolkit for LLMs

    NeMo Curator is a Python library specifically designed for fast and scalable dataset preparation and curation for large language model (LLM) use cases such as foundation model pretraining, domain-adaptive pretraining (DAPT), supervised fine-tuning (SFT) and parameter-efficient fine-tuning (PEFT). It greatly accelerates data curation by leveraging GPUs with Dask and RAPIDS, resulting in significant time savings. The library provides a customizable and modular interface, simplifying pipeline...
    Downloads: 0 This Week
  • 4
    Korvus

    Korvus is a search SDK that unifies the entire RAG pipeline

    Korvus is an open-source retrieval-augmented generation (RAG) pipeline designed to run entirely inside PostgreSQL, allowing developers to build AI search and knowledge systems directly within a database environment. The project consolidates the typical steps of a RAG pipeline—including embedding generation, document retrieval, reranking, and text generation—into a single query executed within the Postgres ecosystem.
    Downloads: 0 This Week
  • 5
    Dolphin

    Document Image Parsing via Heterogeneous Anchor Prompting

    Dolphin, maintained by ByteDance, is a document image parsing model. As its title indicates, it parses document images through heterogeneous anchor prompting: it first analyzes the page layout to produce a sequence of elements in reading order, then uses those elements as anchors, paired with task-specific prompts, to parse text, tables, and formulas in parallel.
    Downloads: 0 This Week
  • 6
    FlexLLMGen

    Running large language models on a single GPU

    FlexLLMGen is an open-source inference engine designed to run large language models efficiently on limited hardware resources such as a single GPU. The system focuses on high-throughput generation workloads where large batches of text must be processed quickly, such as large-scale data extraction or document analysis tasks. Instead of requiring expensive multi-GPU systems, the framework uses techniques such as memory offloading, compression, and optimized batching to run large models on commodity hardware. ...
    Downloads: 0 This Week
  • 7
    SAG

    SQL-Driven RAG Engine

    SAG is an open-source SQL-driven retrieval-augmented generation engine that dynamically constructs knowledge graphs during query processing. Instead of relying on a static knowledge graph prepared in advance, the system automatically builds relational structures between entities while processing user queries. Documents are first decomposed into atomic semantic events, which are then represented using multidimensional natural language vectors. These vectors allow the system to identify...
    Downloads: 0 This Week
  • 8
    ModernBERT

    Bringing BERT into modernity via both architecture changes and scaling

    ModernBERT is an open-source research project that modernizes the classic BERT encoder architecture by incorporating recent advances in transformer design, training techniques, and efficiency improvements. The goal of the project is to bring BERT-style models up to date with the capabilities of modern large language models while preserving the strengths of bidirectional encoder architectures used for tasks such as classification, retrieval, and semantic search. ModernBERT introduces...
    Downloads: 0 This Week
  • 9
    DevDocs by CyberAGI

    Completely free, private, UI based Tech Documentation MCP server

    DevDocs is an open-source documentation server designed to provide developers with a private, structured interface for browsing and interacting with technical documentation using AI tools. The system functions as a Model Context Protocol (MCP) server that allows large language models and developer assistants to access technical documentation in a structured and efficient way. Instead of sending entire documents to a language model, DevDocs organizes documentation into sections so that only...
    Downloads: 0 This Week
  • 10
    Superagent

    Superagent protects your AI applications

    Superagent is an open-source AI safety platform built to protect applications from prompt injections, data leaks, and harmful outputs. It embeds real-time safety directly into AI workflows, helping teams secure models before threats cause damage. Superagent provides guardrails that block jailbreaks, prompt manipulation, and sensitive data exfiltration. It includes redaction tools to remove PII, PHI, and secrets automatically from text. The platform also scans code repositories to detect...
    Downloads: 0 This Week
  • 11
    The Machine & Deep Learning Compendium

    List of references in my private & single document

    The Machine & Deep Learning Compendium is an open-source knowledge repository that compiles summaries, references, and learning materials related to machine learning and deep learning. The project functions as a comprehensive compendium that organizes hundreds of topics covering algorithms, frameworks, research areas, and practical machine learning workflows. Originally created as a personal knowledge base, the repository evolved into a public educational resource designed to help learners...
    Downloads: 0 This Week
  • 12
    marqo

    Tensor search for humans

    A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and...
    Downloads: 1 This Week
  • 13
    LLM TLDR

    95% token savings. 155x faster queries. 16 languages

    LLM TLDR is a tool that leverages large language models (LLMs) to generate concise, coherent summaries (TL;DRs) of long documents, articles, or text files, helping users quickly understand large amounts of content without reading every word. It integrates with LLM APIs to handle input texts of varying lengths and complexity, applying techniques like chunking, context management, and multi-pass summarization to preserve accuracy even when the source is very large. The system supports both...
    Downloads: 0 This Week
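
The chunking and multi-pass summarization mentioned in the description can be sketched roughly as follows. This is a minimal illustration, not LLM TLDR's actual code: `chunk_text`, `multi_pass_summary`, the word-based window sizes, and the `summarize` callback (which stands in for a call to an LLM API) are all hypothetical names and assumptions.

```python
from typing import Callable, List


def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> List[str]:
    """Split text into word windows that overlap, so context carries across chunks."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def multi_pass_summary(
    text: str,
    summarize: Callable[[str], str],  # hypothetical stand-in for an LLM call
    max_words: int = 200,
    overlap: int = 20,
) -> str:
    """Summarize each chunk, then re-summarize the joined results until one chunk remains."""
    current = text
    while True:
        chunks = chunk_text(current, max_words, overlap)
        if len(chunks) <= 1:
            return summarize(current)
        reduced = " ".join(summarize(chunk) for chunk in chunks)
        if len(reduced.split()) >= len(current.split()):
            # The summarizer is not shrinking the text; stop instead of looping forever.
            return summarize(reduced)
        current = reduced
```

The overlap is a simple way to keep sentences that straddle a chunk boundary visible to both neighbouring passes; a real tool would more likely count tokens than words.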
  • 14
    repo2txt

    Web-based tool converts GitHub repository contents

    repo2txt is an open-source developer tool that converts the contents of a code repository into a single structured text file that can be easily consumed by large language models. The tool is designed to address the challenge of analyzing entire codebases with AI assistants, where code is normally distributed across many files and directories. By collecting repository contents and formatting them into a single text document, repo2txt allows developers to feed complete projects into AI systems for analysis, documentation, or code explanation tasks. ...
    Downloads: 0 This Week
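
The idea of flattening a repository into one text document can be sketched as below. This is only an illustration under stated assumptions: it walks a local directory, whereas the listing describes repo2txt as a web-based tool for GitHub repositories, and the `repo_to_text` function, the `SKIP_DIRS` set, the extension filter, and the `=====` header format are all hypothetical choices rather than the tool's real behavior.

```python
import os
from pathlib import Path
from typing import Iterable

# Assumption: directories that are usually noise for an LLM prompt.
SKIP_DIRS = {".git", "node_modules", "__pycache__"}


def repo_to_text(root: str, extensions: Iterable[str] = (".py", ".md", ".txt")) -> str:
    """Concatenate matching files under root into one string, each under a path header."""
    root_path = Path(root)
    allowed = tuple(extensions)
    sections = []
    for dirpath, dirnames, filenames in os.walk(root_path):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            if not name.endswith(allowed):
                continue
            path = Path(dirpath) / name
            rel = path.relative_to(root_path).as_posix()
            # errors="replace" keeps the walk going if a file is not valid UTF-8.
            body = path.read_text(encoding="utf-8", errors="replace")
            sections.append(f"===== {rel} =====\n{body.rstrip()}\n")
    return "\n".join(sections)
```

Keeping the relative path in each header lets a model refer back to specific files when it answers questions about the combined text.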
  • 15
    ChatGPT Academic

    ChatGPT extension for scientific research work

    A ChatGPT extension for scientific research work, with an experience specially optimized for polishing academic papers. It supports custom shortcut buttons, custom function plug-ins, Markdown table display, dual display of TeX formulas, and complete code display. Newer features include project-tree analysis for local Python/C++/Go projects, self-translation of project source code, batch summarization of PDF and Word documents, and full-text translation of PDF papers. All buttons are...
    Downloads: 0 This Week
  • 16
    Node.js Telegram Bot API

    Telegram Bot API for NodeJS

    TelegramBot is an EventEmitter that emits several events. The message event fires when a new incoming Message of any kind is received. Depending on the properties of the Message, one of these events may also be emitted: text, audio, document, photo, sticker, video, voice, contact, location, new_chat_members, left_chat_member, new_chat_title, new_chat_photo, delete_chat_photo, group_chat_created, game, pinned_message, poll, dice, migrate_from_chat_id, migrate_to_chat_id, channel_chat_created,...
    Downloads: 0 This Week
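
The dispatch behavior described above (a generic message event, plus at most one more specific event chosen from the message's properties) can be illustrated with a small stand-alone sketch. It is written in Python for consistency with the other examples on this page and is not the Node.js library's code: `MiniEmitter`, `dispatch`, and the `CONTENT_EVENTS` subset are hypothetical names used only for illustration.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

# A subset of the event names listed in the description above.
CONTENT_EVENTS = ("text", "audio", "document", "photo", "sticker", "video", "voice")

Handler = Callable[[dict], Any]


class MiniEmitter:
    """Tiny event emitter: handlers are registered per event name and called in order."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, payload: dict) -> None:
        # Copy the list so a handler can register more handlers safely.
        for handler in list(self._handlers.get(event, [])):
            handler(payload)


def dispatch(emitter: MiniEmitter, message: dict) -> None:
    """Emit 'message' for every update, then one event named after its content field."""
    emitter.emit("message", message)
    for name in CONTENT_EVENTS:
        if name in message:
            emitter.emit(name, message)
            break  # at most one specific event per message


seen = []
bot = MiniEmitter()
bot.on("message", lambda m: seen.append(("message", m["message_id"])))
bot.on("document", lambda m: seen.append(("document", m["document"]["file_name"])))
dispatch(bot, {"message_id": 1, "document": {"file_name": "report.odt"}})
print(seen)
```

A handler registered for the generic event sees every update, while a handler for a specific event only sees messages carrying that field, which is why both can be registered side by side.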
  • 17
    GLM-4.1V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.1V — often referred to as a smaller / lighter version of the GLM-V family — offers a more resource-efficient option for users who want multimodal capabilities without requiring large compute resources. Though smaller in scale, GLM-4.1V maintains competitive performance, particularly impressive on many benchmarks for models of its size: in fact, on a number of multimodal reasoning and vision-language tasks it outperforms some much larger models from other families. It represents a...
    Downloads: 0 This Week
  • 18
    Mini Agent

    A minimal yet professional single agent demo project

    Mini-Agent is a minimal yet production-minded demo project that shows how to build a serious command-line AI agent around the MiniMax-M2 model. It is designed both as a reference implementation and as a usable agent, demonstrating a full execution loop that includes planning, tool calls, and iterative refinement. The project exposes an Anthropic-compatible API interface and fully supports interleaved thinking, letting the agent alternate between reasoning steps and tool invocations during...
    Downloads: 0 This Week
  • 19
    MedicalGPT

    MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

    MedicalGPT trains large medical GPT models with the ChatGPT training pipeline, implementing secondary pre-training, supervised fine-tuning, reward modeling, and reinforcement learning.
    Downloads: 0 This Week
  • 20
    LangChain Extract

    Did you say you like data?

    LangChain Extract is an open-source reference application designed to demonstrate how large language models can be used to extract structured data from unstructured text and document files. The project implements a lightweight web service that allows developers to define extraction schemas and apply them to various sources such as plain text, HTML, or PDF documents.
    Downloads: 2 This Week
  • 21
    ChatGPT Retrieval Plugin

    The ChatGPT Retrieval Plugin lets you easily find personal documents

    The chatgpt-retrieval-plugin repository implements a semantic retrieval backend that lets ChatGPT (or GPT-powered tools) access private or organizational documents in natural language by combining vector search, embedding models, and plugin infrastructure. It can serve as a custom GPT plugin or function-calling backend so that a chat session can “look up” relevant documents based on user queries, inject those results into context, and respond more knowledgeably about a private knowledge...
    Downloads: 0 This Week
  • 22
    Canopy

    Retrieval Augmented Generation (RAG) framework

    Canopy is an open-source retrieval-augmented generation (RAG) framework developed by Pinecone to simplify the process of building applications that combine large language models with external knowledge sources. The system provides a complete pipeline for transforming raw text data into searchable embeddings, storing them in a vector database, and retrieving relevant context for language model responses.
    Downloads: 2 This Week
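
The pipeline shape described above (turn text into embeddings, store them, retrieve relevant context for a model) can be sketched with toy components. Nothing here is Canopy's API: the word-count "embedding" replaces a real embedding model, `InMemoryStore` replaces a vector database, and every name in the block is a hypothetical placeholder.

```python
import math
import re
from collections import Counter
from typing import Iterable, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Stand-in for a vector database: keeps (text, vector) rows in a list."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, Counter]] = []

    def upsert(self, chunks: Iterable[str]) -> None:
        for chunk in chunks:
            self._rows.append((chunk, embed(chunk)))

    def query(self, question: str, top_k: int = 2) -> List[str]:
        q = embed(question)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


def build_prompt(question: str, context: List[str]) -> str:
    """Assemble the retrieved chunks and the question into a prompt for a language model."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

In a real deployment the last step would send the prompt to a language model; the sketch stops at prompt assembly so it stays runnable without any external service.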
  • 23
    LangChain-ChatGLM-Webui

    Automatic question answering for local knowledge bases based on LLM

    LangChain-ChatGLM-Webui is an open-source web interface that integrates the ChatGLM large language model with the LangChain framework to create an interactive conversational AI platform. The project provides a graphical interface that allows users to interact with language models through chat sessions while also connecting those models to external knowledge sources.
    Downloads: 0 This Week
  • 24
    RAG-Retrieval

    Unify Efficient Fine-tuning of RAG Retrieval, including Embedding

    RAG-Retrieval is an open-source framework for building and training retrieval systems used in retrieval-augmented generation pipelines. Retrieval-augmented generation combines large language models with external knowledge retrieval to improve factual accuracy and domain-specific reasoning. This repository provides end-to-end infrastructure for training retrieval models, performing inference, and distilling embedding models for improved performance. It includes implementations of modern...
    Downloads: 0 This Week
  • 25
    AI File Sorter

    Local AI file organization with categorization and rename suggestions

    AI File Sorter is a cross-platform desktop application that uses AI (local LLMs run on your computer) to organize files and suggest meaningful file names based on real content, not just filenames or extensions. The app can analyze images locally and propose descriptive rename suggestions (for example, IMG_2048.jpg → clouds_over_lake.jpg). It can also analyze document text to improve categorization and renaming. Supported formats include PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, and common...
    Downloads: 328 This Week