7 projects for "llama.cpp python" with 1 filter applied:

  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 1
    llama.cpp

    llama.cpp

    Port of Facebook's LLaMA model in C/C++

    The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.
    Downloads: 123 This Week
    Last Update:
    See Project
  • 2
    GPT4All

    GPT4All

    Run Local LLMs on Any Device. Open-source

    ...The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This project also supports Python integrations for easy automation and customization. GPT4All is ideal for individuals and businesses seeking private, offline access to powerful LLMs.
    Downloads: 145 This Week
    Last Update:
    See Project
  • 3
    OuteTTS

    OuteTTS

    Interface for OuteTTS models

    ...It provides a high-level Interface API that wraps model configuration, speaker handling, and audio generation so you can focus on integrating speech into your application rather than wiring up low-level engines. The project supports multiple backends including llama.cpp (Python bindings and server), Hugging Face Transformers, ExLlamaV2, VLLM and a JavaScript interface via Transformers.js, allowing it to run on CPUs, NVIDIA CUDA GPUs, AMD ROCm, Vulkan-capable GPUs, and Apple Metal. It also includes a notion of speaker profiles: you can create a speaker from a short audio sample, save it as JSON, and reuse it for consistent voice identity across generations and sessions. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Harbor LLM

    Harbor LLM

    Run a full local LLM stack with one command using Docker

    Harbor is an open source, containerized toolkit designed to simplify running local large language model (LLM) environments. It combines a CLI and companion app to launch backends, frontends, and supporting services with minimal setup. With a single command, users can start preconfigured tools like Ollama and Open WebUI, enabling chat, workflows, and integrations immediately. Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user...
    Downloads: 10 This Week
    Last Update:
    See Project
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 5
    Backtrack Sampler

    Backtrack Sampler

    An easy-to-understand framework for LLM samplers

    Backtrack Sampler is a framework designed for experimenting with custom sampling strategies for language models (LLMs), enabling the ability to rewind and revise generated tokens. It allows developers to create and test their own token generation strategies by providing a base structure for manipulating logits and probabilities, making it a flexible tool for those interested in fine-tuning the behavior of LLMs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Orpheus TTS

    Orpheus TTS

    Towards Human-Sounding Speech

    Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3B backbone, treating speech synthesis as a large language model problem instead of a traditional TTS pipeline. It is designed to produce human-like speech with natural intonation, emotion, and rhythm, targeting quality comparable to or better than many closed-source systems. The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 7
    Mellum-4b-base

    Mellum-4b-base

    JetBrains’ 4B parameter code model for completions

    ...With a context window of 8,192 tokens, it excels at code completion, fill-in-the-middle tasks, and intelligent code suggestions for professional developer tools and IDEs. The model is efficient for both cloud inference with vLLM and local deployment using llama.cpp or Ollama, thanks to its bf16 precision and AMP training. While the base model is not fine-tuned for downstream tasks, it is designed to be easily adapted through supervised fine-tuning (SFT) or reinforcement learning (RL). Benchmarks on RepoBench, SAFIM, and HumanEval demonstrate its competitive performance, with specialized fine-tuned versions for Python already showing strong improvements.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB