Showing 14 open source projects for "document code"

View related business solutions
  • Compliant and Reliable File Transfers Backed by Top Security Certifications Icon
    Compliant and Reliable File Transfers Backed by Top Security Certifications

    Cerberus FTP Server delivers SOC 2 Type II certified security and FIPS 140-2 validated encryption.

    Stop relying on non-certified, legacy file transfer tools that creak under the weight of modern security demands. Get full audit trails, advanced access controls and more supported by an award-winning team of experts. Start your free 25-day trial today.
    Start Free Trial
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 1
    DocETL

    DocETL

    A system for agentic LLM-powered data processing and ETL

    ...The platform allows developers and researchers to construct structured workflows that extract, transform, and organize information from sources such as reports, transcripts, legal documents, and other text-heavy data. Instead of relying on single prompts or ad-hoc scripts, DocETL provides a declarative pipeline framework that breaks complex document analysis tasks into manageable operations that can be optimized and orchestrated automatically. Pipelines are typically defined using a low-code YAML interface, giving users full control over prompts and processing steps while still simplifying workflow creation.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 2
    LongBench

    LongBench

    LongBench v2 and LongBench (ACL 25'&24')

    ...Traditional language model benchmarks typically evaluate tasks involving relatively short inputs, which does not reflect many real-world applications such as analyzing large documents or entire code repositories. LongBench addresses this gap by providing datasets that require models to process and reason over long sequences of text across multiple tasks. The benchmark includes multiple categories such as single-document question answering, multi-document reasoning, summarization, long dialogue understanding, and code analysis. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    AnythingLLM

    AnythingLLM

    The all-in-one Desktop & Docker AI application with full RAG and AI

    A full-stack application that enables you to turn any document, resource, or piece of content into a context that any LLM can use as references during chatting. This application allows you to pick and choose which LLM or Vector Database you want to use as well as supporting multi-user management and permissions. AnythingLLM is a full-stack application where you can use commercial off-the-shelf LLMs or popular open-source LLMs and vectorDB solutions to build a private ChatGPT with no...
    Downloads: 143 This Week
    Last Update:
    See Project
  • 4
    Lemon AI

    Lemon AI

    Full-stack Open-source Self-Evolving General AI Agent

    ...The system includes a multi-agent architecture that supports planning, action execution, reflection, and memory, allowing the agent to reason through tasks and refine results iteratively. A key component of the framework is a virtual machine sandbox environment that safely executes code generated by the agent without affecting the host system.
    Downloads: 3 This Week
    Last Update:
    See Project
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • 5
    Generative AI Use Cases (GenU)

    Generative AI Use Cases (GenU)

    Application implementation with business use cases

    ...The project collects a wide range of real-world scenarios that demonstrate how organizations can use large language models and generative AI services within cloud-based architectures. Each example typically includes infrastructure templates, backend services, and application code that show how to integrate generative AI capabilities with other AWS services. These examples cover tasks such as document analysis, conversational assistants, content generation, and knowledge retrieval systems. The repository is intended to serve as both a learning resource and a starting point for developers who want to deploy generative AI solutions using AWS infrastructure.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 6
    gptcommit

    gptcommit

    A git prepare-commit-msg hook for authoring commit messages with GPT-3

    ...Commit messages are a key channel for developers to communicate their work with others, especially in code reviews. When making complex code changes, it can be tedious to thoroughly document the contents of each change. I often felt the impulse to just title my commit “fix bug” and move on. Surfacing these changes with gptcommit helps the author and reviewer by bringing attention to these additional changes.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 7
    repo2txt

    repo2txt

    Web-based tool converts GitHub repository contents

    repo2txt is an open-source developer tool that converts the contents of a code repository into a single structured text file that can be easily consumed by large language models. The tool is designed to address the challenge of analyzing entire codebases with AI assistants, where code is normally distributed across many files and directories. By collecting repository contents and formatting them into a single text document, repo2txt allows developers to feed complete projects into AI systems for analysis, documentation, or code explanation tasks. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    LangChain for Java

    LangChain for Java

    LangChain4j is an open-source Java library

    ...The library provides a unified API that allows developers to connect Java applications to multiple AI providers and embedding databases without having to implement separate integrations for each service. Its architecture includes abstractions for prompts, chat interactions, document processing, embeddings, and vector storage, enabling developers to build complex AI workflows with minimal boilerplate code. LangChain4j also implements common design patterns used in generative AI systems, such as retrieval-augmented generation pipelines, tool calling, and intelligent agent frameworks. These abstractions allow developers to orchestrate interactions between language models, external tools, and knowledge bases in a structured and scalable way.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 9
    Qwen3-VL

    Qwen3-VL

    Qwen3-VL, the multimodal large language model series by Alibaba Cloud

    Qwen3-VL is the latest multimodal large language model series from Alibaba Cloud’s Qwen team, designed to integrate advanced vision and language understanding. It represents a major upgrade in the Qwen lineup, with stronger text generation, deeper visual reasoning, and expanded multimodal comprehension. The model supports dense and Mixture-of-Experts (MoE) architectures, making it scalable from edge devices to cloud deployments, and is available in both instruction-tuned and...
    Downloads: 4 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 10
    ModernBERT

    ModernBERT

    Bringing BERT into modernity via both architecture changes and scaling

    ModernBERT is an open-source research project that modernizes the classic BERT encoder architecture by incorporating recent advances in transformer design, training techniques, and efficiency improvements. The goal of the project is to bring BERT-style models up to date with the capabilities of modern large language models while preserving the strengths of bidirectional encoder architectures used for tasks such as classification, retrieval, and semantic search. ModernBERT introduces...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    RAG from Scratch

    RAG from Scratch

    Demystify RAG by building it from scratch

    RAG From Scratch is an educational open-source project designed to teach developers how retrieval-augmented generation systems work by building them step by step. Instead of relying on complex frameworks or cloud services, the repository demonstrates the entire RAG pipeline using transparent and minimal implementations. The project walks through key concepts such as generating embeddings, building vector databases, retrieving relevant documents, and integrating the retrieved context into...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    LLM TLDR

    LLM TLDR

    95% token savings. 155x faster queries. 16 languages

    LLM TLDR is a tool that leverages large language models (LLMs) to generate concise, coherent summaries (TL;DRs) of long documents, articles, or text files, helping users quickly understand large amounts of content without reading every word. It integrates with LLM APIs to handle input texts of varying lengths and complexity, applying techniques like chunking, context management, and multi-pass summarization to preserve accuracy even when the source is very large. The system supports both...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Superagent

    Superagent

    Superagent protects your AI applications

    Superagent is an open-source AI safety platform built to protect applications from prompt injections, data leaks, and harmful outputs. It embeds real-time safety directly into AI workflows, helping teams secure models before threats cause damage. Superagent provides guardrails that block jailbreaks, prompt manipulation, and sensitive data exfiltration. It includes redaction tools to remove PII, PHI, and secrets automatically from text. The platform also scans code repositories to detect...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    EvaDB

    EvaDB

    Database system for building simpler and faster AI-powered application

    Over the last decade, AI models have radically changed the world of natural language processing and computer vision. They are accurate on various tasks ranging from question answering to object tracking in videos. To use an AI model, the user needs to program against multiple low-level libraries, like PyTorch, Hugging Face, Open AI, etc. This tedious process often leads to a complex AI app that glues together these libraries to accomplish the given task. This programming complexity prevents...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo