• MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Let your crypto work for you

    Put idle assets to work with competitive interest rates, borrow without selling, and trade with precision. All in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 1
    MarkPDFDown

    MarkPDFDown

    A high-quality PDF to Markdown tool based on large language model

    ...The project focuses on extracting text, formatting, and structural information from complex PDF documents and transforming that information into clean Markdown that preserves the original hierarchy of headings, paragraphs, tables, and lists. By producing Markdown rather than raw text, the tool makes it easier to integrate documents into knowledge bases, documentation systems, or language model pipelines that rely on structured input. The software is particularly useful for developers working with technical documents, academic papers, or reports that need to be indexed, summarized, or processed by downstream AI systems.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 2
    Instructor Python

    Instructor Python

    Structured outputs for llms

    Instructor is a Python library that bridges OpenAI responses with structured data validation using Pydantic models. It lets developers specify expected output schemas and ensures that the responses from OpenAI APIs are automatically parsed and validated against those models. This makes integrating LLMs into structured workflows safer and more predictable, especially in production applications.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    kg-gen

    kg-gen

    Knowledge Graph Generation from Any Text

    ...Instead of relying on traditional rule-based extraction techniques, KG-Gen uses language models to identify entities and their relationships, producing higher-quality graph structures from raw text. The framework addresses common problems in automatic knowledge graph construction, particularly sparsity and duplication of entities, by applying a clustering and entity-resolution process that merges semantically similar nodes. This allows the generated graphs to be denser, more coherent, and easier to use for downstream tasks such as retrieval-augmented generation, semantic search, and reasoning systems.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Engram

    Engram

    A New Axis of Sparsity for Large Language Models

    ...Engineered with speed and memory efficiency in mind, Engram supports batched indexing, incremental updates, and custom distance metrics so developers can tailor search behaviors to their domain’s needs. In addition to raw similarity search, the project includes tools for clustering, ranking, and filtering results, enabling richer user experiences like “related content”, semantic auto-completion, and contextual filtering.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 5
    llm.c

    llm.c

    LLM training in simple, raw C/CUDA

    llm.c is a minimalist, systems-level implementation of a small transformer-based language model in C that prioritizes clarity and educational value. By stripping away heavy frameworks, it exposes the core math and memory flows of embeddings, attention, and feed-forward layers. The code illustrates how to wire forward passes, losses, and simple training or inference loops with direct control over arrays and buffers. Its compact design makes it easy to trace execution, profile hotspots, and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    LLM-Aided OCR Project

    LLM-Aided OCR Project

    Enhances Tesseract OCR output using LLMs (local or API)

    ...The project addresses common OCR challenges such as distorted text, unusual fonts, historical documents, and complex layouts that often produce inaccurate results with standard OCR pipelines. The system first extracts raw text using OCR engines and then applies language models to analyze and correct recognition errors based on context. This AI-assisted correction process helps reconstruct missing characters, fix formatting mistakes, and produce more coherent text outputs. The project is particularly useful for digitizing historical documents, research papers, and scanned materials where traditional OCR often struggles. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Deep Lake

    Deep Lake

    Data Lake for Deep Learning. Build, manage, and query datasets

    ...Use one API to upload, download, and stream datasets to/from AWS S3/S3-compatible storage, GCP, Activeloop cloud, or local storage. Store images, audios and videos in their native compression. Deeplake automatically decompresses them to raw data only when needed, e.g., when training a model. Treat your cloud datasets as if they are a collection of NumPy arrays in your system's memory. Slice them, index them, or iterate through them.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Canopy

    Canopy

    Retrieval Augmented Generation (RAG) framework

    Canopy is an open-source retrieval-augmented generation (RAG) framework developed by Pinecone to simplify the process of building applications that combine large language models with external knowledge sources. The system provides a complete pipeline for transforming raw text data into searchable embeddings, storing them in a vector database, and retrieving relevant context for language model responses. It is designed to handle many of the complex components required for a RAG workflow, including document chunking, embedding generation, prompt construction, and chat history management. Developers can use Canopy to quickly build chat systems that answer questions using their own data instead of relying solely on the pretrained knowledge of the language model. ...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 9
    autollm

    autollm

    Ship RAG based LLM web apps in seconds

    ...The framework also includes built-in readers for multiple content sources such as PDFs, DOCX files, notebooks, websites, and other document types, which helps shorten the time between raw data and a working knowledge application.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 10
    BIG-bench

    BIG-bench

    Beyond the Imitation Game collaborative benchmark for measuring

    ...The suite provides a common JSON task format and an evaluation harness so research groups can contribute new tasks and reproduce results consistently. It emphasizes robustness analysis—looking at scale trends, calibration, and areas where models systematically fail—to guide model development beyond raw accuracy. BIG-bench is as much a community process as a dataset, encouraging open sharing of tasks and findings to keep evaluations fresh and comprehensive.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB