17 projects for "open document" with 2 filters applied:

  • 1
    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. ...
    Downloads: 13 This Week
  • 2
    GLM-4.6V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...
    Downloads: 2 This Week
  • 3
    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...
    Downloads: 9 This Week
  • 4
    GLM-4.5V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    ...It embodies the design philosophy of mixing visual and textual modalities into a unified model capable of general-purpose reasoning, content understanding, and generation, while already supporting a wide variety of tasks: from image captioning and visual question answering to content recognition, GUI-based agents, video understanding, and long-document interpretation. GLM-4.5V emerged from a training framework that leverages scalable reinforcement learning (with curriculum sampling) to boost performance across tasks ranging from STEM problem solving to long-context reasoning, giving it broad applicability beyond narrow benchmarks. When it was released, it achieved state-of-the-art results on a large collection of public multimodal benchmarks for open-source models.
    Downloads: 0 This Week
  • 5
    FinGPT

    Open-Source Financial Large Language Models

    ...The platform typically includes tools for fine-tuning, context engineering, and prompt templating, enabling users to build specialized assistants for tasks like sentiment analysis, earnings summary generation, risk profiling, trading signal interpretation, and document extraction from financial reports.
    Downloads: 4 This Week
  • 6
    GLM-4.1V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.1V, often described as a smaller, lighter member of the GLM-V family, offers a more resource-efficient option for users who want multimodal capabilities without large compute resources. Though smaller in scale, it remains competitive for its size: on a number of multimodal reasoning and vision-language tasks it outperforms some much larger models from other families. It represents a...
    Downloads: 0 This Week
  • 7
    ChatGPT Retrieval Plugin

    The ChatGPT Retrieval Plugin lets you easily find personal documents

    The chatgpt-retrieval-plugin repository implements a semantic retrieval backend that lets ChatGPT (or GPT-powered tools) access private or organizational documents in natural language by combining vector search, embedding models, and plugin infrastructure. It can serve as a custom GPT plugin or function-calling backend so that a chat session can “look up” relevant documents based on user queries, inject those results into context, and respond more knowledgeably about a private knowledge...
    Downloads: 0 This Week
  • 8
    Nemotron 3

    Large language model developed and released by NVIDIA

    ...This configuration supports a massive context length of up to 1 million tokens, making it suitable for long-context reasoning, agentic tasks, extended dialogues, and applications like code generation or document summarization.
    Downloads: 0 This Week
  • 9
    Nemotron 3 Nano

    LLM providing reasoning and conversational capabilities

    ...This architecture allows the system to maintain strong reasoning capabilities while improving throughput and reducing the computational cost associated with large context processing. The model is designed as a general-purpose language system capable of handling tasks such as chat interaction, coding assistance, document analysis, and instruction following.
    Downloads: 0 This Week
  • 10
    Mistral Large 3 675B Instruct 2512

    Frontier-scale 675B multimodal instruct MoE model for enterprise AI

    Mistral Large 3 675B Instruct 2512 is a state-of-the-art multimodal granular Mixture-of-Experts model featuring 675B total parameters and 41B active parameters, trained from scratch on 3,000 H200 GPUs. As the instruct-tuned FP8 variant, it is optimized for reliable instruction following, agentic workflows, production-grade assistants, and long-context enterprise tasks. It incorporates a massive 673B-parameter language MoE backbone and a 2.5B-parameter vision encoder, enabling rich multimodal...
    Downloads: 0 This Week
  • 11
    Qwen3.6-35B-A3B

    Open multimodal model for coding, agents, and long-context tasks

    ...Architecturally, it uses a Mixture-of-Experts design with 35B total parameters and 3B active, supports a native 262K-token context window, and can be extended to about 1M tokens with YaRN. It also performs strongly across coding, agent, vision, reasoning, and document-understanding benchmarks.
    Downloads: 0 This Week
  • 12
    DeepSeek-V4-Pro

    Flagship MoE model for advanced reasoning, coding, and agents

    DeepSeek-V4-Pro is a flagship open-weight Mixture-of-Experts language model designed for high-performance reasoning, coding, and agent-based workflows at scale. It features approximately 1.6 trillion total parameters with around 49B activated during inference, enabling strong efficiency while maintaining frontier-level capability. The model supports an ultra-long context window of up to 1 million tokens, making it highly suitable for long-document reasoning, large codebases, and complex multi-step tasks. ...
    Downloads: 0 This Week
  • 13
    MiniMax-M2.7

    Self-evolving AI model for agents, coding, and complex workflows

    MiniMax-M2.7 is a large-scale open-weight language model designed for advanced agent-based workflows, professional software engineering, and complex productivity tasks. With 229B parameters, it introduces a self-evolution framework in which the model actively improves its own capabilities by updating memory, generating skills, and iterating through reinforcement learning experiments.
    Downloads: 0 This Week
  • 14
    DeepSeek-V4-Flash

    Efficient MoE model for million-token reasoning and coding

    DeepSeek-V4-Flash is a preview Mixture-of-Experts language model built for efficient million-token context intelligence. It has 284B total parameters with 13B activated and supports a 1M-token context window, making it suitable for long-document reasoning, complex coding, agentic workflows, and large-scale information processing. The model uses a hybrid attention architecture that combines Compressed Sparse Attention and Heavily Compressed Attention to improve long-context efficiency, while...
    Downloads: 0 This Week
  • 15
    Ministral 3 3B Base 2512

    Small 3B-base multimodal model ideal for custom AI on edge hardware

    Ministral 3 3B Base 2512 is the smallest model in the Ministral 3 family, offering a compact yet capable multimodal architecture suited for lightweight AI applications. It combines a 3.4B-parameter language model with a 0.4B vision encoder, enabling both text and image understanding in a tiny footprint. As the base pretrained model, it is not fine-tuned for instructions or reasoning, making it the ideal foundation for custom post-training, domain adaptation, or specialized downstream tasks....
    Downloads: 0 This Week
  • 16
    Ministral 3 3B Reasoning 2512

    Compact 3B-param multimodal model for efficient on-device reasoning

    Ministral 3 3B Reasoning 2512 is the smallest reasoning-capable model in the Ministral 3 family, yet delivers a surprisingly capable multimodal and multilingual base for lightweight AI applications. It pairs a 3.4B-parameter language model with a 0.4B-parameter vision encoder, enabling it to understand both text and image inputs. This reasoning-tuned variant is optimized for tasks like math, coding, and other STEM-related problem solving, making it suitable for applications that require...
    Downloads: 0 This Week
  • 17
    Ministral 3 14B Base 2512

    Powerful 14B-base multimodal model — flexible base for fine-tuning

    Ministral 3 14B Base 2512 is the largest model in the Ministral 3 line, offering state-of-the-art language and vision capabilities in a dense, base-pretrained form. It combines a 13.5B-parameter language model with a 0.4B-parameter vision encoder, enabling both high-quality text understanding/generation and image-aware tasks. As a “base” model (i.e. not fine-tuned for instruction or reasoning), it provides a flexible foundation ideal for custom fine-tuning or downstream specialization. The...
    Downloads: 0 This Week