Showing 76 open source projects for "llama-cpp-python.whl"

View related business solutions
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    bert4torch

    bert4torch

    An elegent pytorch implement of transformers

    An elegant PyTorch implement of transformers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Jan.ai

    Jan.ai

    Open source alternative to ChatGPT that runs 100% offline

    ...It allows you to download and run LLMs (local language models) offline while also offering optional integration with cloud-based model providers—giving you full control over your data and AI interactions. Download and run LLMs (Llama, Gemma, Qwen, GPT-oss etc.) from HuggingFace. Connect to GPT models via OpenAI, Claude models via Anthropic, Mistral, Groq, and others. Create specialized AI assistants for your tasks. MCP integration for agentic capabilities.
    Downloads: 32 This Week
    Last Update:
    See Project
  • 3
    Eliza

    Eliza

    Autonomous agents for everyone

    ...Full support for voice, text, and media interactions. Built-in RAG memory system, document processing, media analysis, and autonomous trading capabilities. Supports multiple AI models including Llama, GPT-4, and Claude. Create custom actions, add new platform integrations, and extend functionality through a modular plugin system. Full TypeScript support.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    DeepSeek R1

    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    ...This approach has resulted in performance comparable to leading models like OpenAI's o1, while maintaining cost-efficiency. To further support the research community, DeepSeek has released distilled versions of the model based on architectures such as LLaMA and Qwen.
    Downloads: 70 This Week
    Last Update:
    See Project
  • Keep company data safe with Chrome Enterprise Icon
    Keep company data safe with Chrome Enterprise

    Protect your business with AI policies and data loss prevention in the browser

    Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.
    Download Chrome
  • 5
    Orpheus TTS

    Orpheus TTS

    Towards Human-Sounding Speech

    Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3B backbone, treating speech synthesis as a large language model problem instead of a traditional TTS pipeline. It is designed to produce human-like speech with natural intonation, emotion, and rhythm, targeting quality comparable to or better than many closed-source systems. The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research preview, and includes data-processing scripts so users can train or finetune their own variants. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    GPT4All

    GPT4All

    Run Local LLMs on Any Device. Open-source

    GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This...
    Downloads: 112 This Week
    Last Update:
    See Project
  • 7
    llamafile

    llamafile

    Distribute and run LLMs with a single file

    ...We're doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation. The easiest way to try it for yourself is to download our example llamafile for the LLaVA model (license: LLaMA 2, OpenAI). LLaVA is a new LLM that can do more than just chat; you can also upload images and ask it questions about them. With llamafile, this all happens locally; no data ever leaves your computer.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 8
    LocalAI

    LocalAI

    Self-hosted, community-driven, local OpenAI compatible API

    ...Drop-in replacement for OpenAI running LLMs on consumer-grade hardware. Free Open Source OpenAI alternative. No GPU is required. Runs ggml, GPTQ, onnx, TF compatible models: llama, gpt4all, rwkv, whisper, vicuna, koala, gpt4all-j, cerebras, falcon, dolly, starcoder, and many others. LocalAI is a drop-in replacement REST API that’s compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs (and not only) locally or on-prem with consumer-grade hardware, supporting multiple model families that are compatible with the ggml format. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 9
    Axolotl

    Axolotl

    Go ahead and axolotl questions

    Axolotl is a powerful and flexible framework for fine-tuning large language models on custom datasets. Built for researchers and developers, Axolotl simplifies the process of adapting LLMs for specific tasks, including chat, code generation, and instruction following. It supports a wide variety of model architectures and offers out-of-the-box optimization strategies for efficient training.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Cloud-based help desk software with ServoDesk Icon
    Cloud-based help desk software with ServoDesk

    Full access to Enterprise features. No credit card required.

    What if You Could Automate 90% of Your Repetitive Tasks in Under 30 Days? At ServoDesk, we help businesses like yours automate operations with AI, allowing you to cut service times in half and increase productivity by 25% - without hiring more staff.
    Try ServoDesk for free
  • 10
    SillyTavern

    SillyTavern

    LLM Frontend for Power Users

    Mobile-friendly, Multi-API (KoboldAI/CPP, Horde, NovelAI, Ooba, OpenAI, OpenRouter, Claude, Scale), VN-like Waifu Mode, Horde SD, System TTS, WorldInfo (lorebooks), customizable UI, auto-translate, and more prompt options than you'd ever want or need. Optional Extras server for more SD/TTS options + ChromaDB/Summarize. SillyTavern is a user interface you can install on your computer (and Android phones) that allows you to interact with text generation AIs and chat/roleplay with characters you or the community create. ...
    Downloads: 137 This Week
    Last Update:
    See Project
  • 11
    Lepton AI

    Lepton AI

    A Pythonic framework to simplify AI service building

    A Pythonic framework to simplify AI service building. Cutting-edge AI inference and training, unmatched cloud-native experience, and top-tier GPU infrastructure. Ensure 99.9% uptime with comprehensive health checks and automatic repairs.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    OpenLLM

    OpenLLM

    Operating LLMs in production

    ...With OpenLLM, you can run inference with any open-source large-language models, deploy to the cloud or on-premises, and build powerful AI apps. Built-in supports a wide range of open-source LLMs and model runtime, including Llama 2, StableLM, Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more. Serve LLMs over RESTful API or gRPC with one command, query via WebUI, CLI, our Python/Javascript client, or any HTTP client.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    LazyLLM

    LazyLLM

    Easiest and laziest way for building multi-agent LLMs applications

    LazyLLM is an optimized, lightweight LLM server designed for easy and fast deployment of large language models. It is fully compatible with the OpenAI API specification, enabling developers to integrate their own models into applications that normally rely on OpenAI’s endpoints. LazyLLM emphasizes low resource usage and fast inference while supporting multiple models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    llama2-webui

    llama2-webui

    Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere

    Running Llama 2 with gradio web UI on GPU or CPU from anywhere (Linux/Windows/Mac).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    BrowserAI

    BrowserAI

    Run local LLMs like llama, deepseek, kokoro etc. inside your browser

    BrowserAI is a cutting-edge platform that allows users to run large language models (LLMs) directly in their web browser without the need for a server. It leverages WebGPU for accelerated performance and supports offline functionality, making it a highly efficient and privacy-conscious solution. The platform provides a developer-friendly SDK with pre-configured popular models, and it allows for seamless switching between MLC and Transformer engines. Additionally, it supports features such as...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    promptfoo

    promptfoo

    Evaluate and compare LLM outputs, catch regressions, improve prompts

    ...Use built-in metrics, LLM-graded evals, or define your own custom metrics. Compare prompts and model outputs side-by-side, or integrate the library into your existing test/CI workflow. Use OpenAI, Anthropic, and open-source models like Llama and Vicuna, or integrate custom API providers for any LLM API.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    SGLang

    SGLang

    SGLang is a fast serving framework for large language models

    SGLang is a fast serving framework for large language models and vision language models. It makes your interaction with models faster and more controllable by co-designing the backend runtime and frontend language.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    h2oGPT

    h2oGPT

    Private chat with local GPT with document, images, video, etc.

    h2oGPT is an open-source platform that allows users to interact with local GPT models in a completely private environment. It supports a variety of document types, including PDFs, Word files, images, video frames, and even audio, enabling users to query and analyze their documents or engage in a private chat with AI. The platform is designed to be secure and offline, ensuring that all data remains private and under the user's control. h2oGPT supports several AI models, including oLLaMa and...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    Pruna AI

    Pruna AI

    Pruna is a model optimization framework built for developers

    Pruna is an open-source, self-hostable AI inference engine designed to help teams deploy and manage large language models (LLMs) efficiently across private or hybrid infrastructures. Built with performance and developer ergonomics in mind, Pruna simplifies inference workflows by enabling multi-model orchestration, autoscaling, GPU resource allocation, and compatibility with popular open-source models. It is ideal for companies or teams looking to reduce reliance on external APIs while...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Curated Transformers

    Curated Transformers

    PyTorch library of curated Transformer models and their components

    ...It provides state-of-the-art models that are composed of a set of reusable components. Supports state-of-the-art transformer models, including LLMs such as Falcon, Llama, and Dolly v2. Implementing a feature or bugfix benefits all models. For example, all models support 4/8-bit inference through the bitsandbytes library and each model can use the PyTorch meta device to avoid unnecessary allocations and initialization.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Agents-Flex

    Agents-Flex

    Agents-Flex is an elegant LLM Application Framework like LangChain

    Agents-Flex includes a variety of network protocols for connecting LLMs, such as HTTP, SSE and WS. Its simple and flexible design allows developers to easily connect to various LLMs, including OpenAI, LLama, and other AI. Agents-Flex provides a rich set of development templates and Prompt Frameworks, including FEW-SHOT, CRISPE, BROKE, and ICIO. Developers can also customize their own unique prompt templates. Agents-Flex has a very flexible Function Calling component. It supports local method definitions, parsing, callbacks through LLMs, and executing local methods to obtain results. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    Tribe AI

    Tribe AI

    Low code tool to rapidly build and coordinate multi-agent teams

    Low code tool to rapidly build and coordinate multi-agent teams. Have you heard the saying, 'Two minds are better than one'? That's true for agents too. Tribe leverages on the langgraph framework to let you customize and coordinate teams of agents easily. By splitting up tough tasks among agents who are good at different things, each one can focus on what it does best. This makes solving problems faster and better.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Speech-AI-Forge

    Speech-AI-Forge

    Speech-AI-Forge is a project developed around TTS generation model

    ...It is model-agnostic and advertises support for a variety of TTS and speech models such as ChatTTS, CosyVoice, Fish-Speech, FireredTTS and others, as well as Whisper-based ASR, giving you a flexible playground for experimenting with different speech stacks. The project also integrates with general-purpose LLMs (for example GPT- or LLaMA-style models), which can be used to pre-process text, manage conversations.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    MCP Hub

    MCP Hub

    An MCP client for Neovim that seamlessly integrates MCP servers

    mcphub.nvim is an MCP (Model Context Protocol) client plugin for Neovim that seamlessly integrates MCP servers into your editing workflow with an intuitive interface for managing, testing, and using MCP servers with your favorite chat plugins. Create your first MCP capable agent you need only 6 lines of code. Works with any langchain-supported LLM that supports tool calling (OpenAI, Anthropic, Groq, LLama etc.) Explore MCP capabilities and generate starter code with the interactive code builder. An MCP client for Neovim that seamlessly integrates MCP servers into your editing workflow with an intuitive interface for managing, testing, and using MCP servers with your favorite chat plugins.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    CSM (Conversational Speech Model)

    CSM (Conversational Speech Model)

    A Conversational Speech Generation Model

    The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.
    Downloads: 4 This Week
    Last Update:
    See Project