Page 2 | llama-cpp-python.whl free download

bert4torch

An elegent pytorch implement of transformers

An elegant PyTorch implement of transformers.

Downloads: 0 This Week

Last Update: 2025-09-25

See Project

Jan.ai

Open source alternative to ChatGPT that runs 100% offline

...It allows you to download and run LLMs (local language models) offline while also offering optional integration with cloud-based model providers—giving you full control over your data and AI interactions. Download and run LLMs (Llama, Gemma, Qwen, GPT-oss etc.) from HuggingFace. Connect to GPT models via OpenAI, Claude models via Anthropic, Mistral, Groq, and others. Create specialized AI assistants for your tasks. MCP integration for agentic capabilities.

Downloads: 30 This Week

Last Update: 15 hours ago

See Project

DeepSeek R1

Open-source, high-performance AI model with advanced reasoning

...This approach has resulted in performance comparable to leading models like OpenAI's o1, while maintaining cost-efficiency. To further support the research community, DeepSeek has released distilled versions of the model based on architectures such as LLaMA and Qwen.

1 Review

Downloads: 68 This Week

Last Update: 2025-07-09

See Project

Orpheus TTS

Towards Human-Sounding Speech

Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3B backbone, treating speech synthesis as a large language model problem instead of a traditional TTS pipeline. It is designed to produce human-like speech with natural intonation, emotion, and rhythm, targeting quality comparable to or better than many closed-source systems. The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research preview, and includes data-processing scripts so users can train or finetune their own variants. ...

Downloads: 2 This Week

Last Update: 2025-11-28

See Project

GPT4All

Run Local LLMs on Any Device. Open-source

GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This...

1 Review

Downloads: 107 This Week

Last Update: 2025-03-17

See Project

llamafile

Distribute and run LLMs with a single file

...We're doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation. The easiest way to try it for yourself is to download our example llamafile for the LLaVA model (license: LLaMA 2, OpenAI). LLaVA is a new LLM that can do more than just chat; you can also upload images and ask it questions about them. With llamafile, this all happens locally; no data ever leaves your computer.

Downloads: 12 This Week

Last Update: 2025-05-14

See Project

LocalAI

Self-hosted, community-driven, local OpenAI compatible API

...Drop-in replacement for OpenAI running LLMs on consumer-grade hardware. Free Open Source OpenAI alternative. No GPU is required. Runs ggml, GPTQ, onnx, TF compatible models: llama, gpt4all, rwkv, whisper, vicuna, koala, gpt4all-j, cerebras, falcon, dolly, starcoder, and many others. LocalAI is a drop-in replacement REST API that’s compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs (and not only) locally or on-prem with consumer-grade hardware, supporting multiple model families that are compatible with the ggml format. ...

Downloads: 11 This Week

Last Update: 2025-11-26

See Project

SillyTavern

LLM Frontend for Power Users

Mobile-friendly, Multi-API (KoboldAI/CPP, Horde, NovelAI, Ooba, OpenAI, OpenRouter, Claude, Scale), VN-like Waifu Mode, Horde SD, System TTS, WorldInfo (lorebooks), customizable UI, auto-translate, and more prompt options than you'd ever want or need. Optional Extras server for more SD/TTS options + ChromaDB/Summarize. SillyTavern is a user interface you can install on your computer (and Android phones) that allows you to interact with text generation AIs and chat/roleplay with characters you or the community create. ...

Downloads: 152 This Week

Last Update: 2025-11-22

See Project

Axolotl

Go ahead and axolotl questions

Axolotl is a powerful and flexible framework for fine-tuning large language models on custom datasets. Built for researchers and developers, Axolotl simplifies the process of adapting LLMs for specific tasks, including chat, code generation, and instruction following. It supports a wide variety of model architectures and offers out-of-the-box optimization strategies for efficient training.

Downloads: 1 This Week

Last Update: 2025-11-18

See Project

LazyLLM

Easiest and laziest way for building multi-agent LLMs applications

LazyLLM is an optimized, lightweight LLM server designed for easy and fast deployment of large language models. It is fully compatible with the OpenAI API specification, enabling developers to integrate their own models into applications that normally rely on OpenAI’s endpoints. LazyLLM emphasizes low resource usage and fast inference while supporting multiple models.

Downloads: 0 This Week

Last Update: 2025-11-01

See Project

Lepton AI

A Pythonic framework to simplify AI service building

A Pythonic framework to simplify AI service building. Cutting-edge AI inference and training, unmatched cloud-native experience, and top-tier GPU infrastructure. Ensure 99.9% uptime with comprehensive health checks and automatic repairs.

Downloads: 1 This Week

Last Update: 2025-11-07

See Project

llama2-webui

Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere

Running Llama 2 with gradio web UI on GPU or CPU from anywhere (Linux/Windows/Mac).

Downloads: 0 This Week

Last Update: 2023-10-04

See Project

SGLang

SGLang is a fast serving framework for large language models

SGLang is a fast serving framework for large language models and vision language models. It makes your interaction with models faster and more controllable by co-designing the backend runtime and frontend language.

Downloads: 1 This Week

Last Update: 3 days ago

See Project

OpenLLM

Operating LLMs in production

...With OpenLLM, you can run inference with any open-source large-language models, deploy to the cloud or on-premises, and build powerful AI apps. Built-in supports a wide range of open-source LLMs and model runtime, including Llama 2， StableLM, Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more. Serve LLMs over RESTful API or gRPC with one command, query via WebUI, CLI, our Python/Javascript client, or any HTTP client.

Downloads: 1 This Week

Last Update: 2025-04-21

See Project

BrowserAI

Run local LLMs like llama, deepseek, kokoro etc. inside your browser

BrowserAI is a cutting-edge platform that allows users to run large language models (LLMs) directly in their web browser without the need for a server. It leverages WebGPU for accelerated performance and supports offline functionality, making it a highly efficient and privacy-conscious solution. The platform provides a developer-friendly SDK with pre-configured popular models, and it allows for seamless switching between MLC and Transformer engines. Additionally, it supports features such as...

Downloads: 1 This Week

Last Update: 2025-05-21

See Project

promptfoo

Evaluate and compare LLM outputs, catch regressions, improve prompts

...Use built-in metrics, LLM-graded evals, or define your own custom metrics. Compare prompts and model outputs side-by-side, or integrate the library into your existing test/CI workflow. Use OpenAI, Anthropic, and open-source models like Llama and Vicuna, or integrate custom API providers for any LLM API.

Downloads: 1 This Week

Last Update: 4 days ago

See Project

h2oGPT

Private chat with local GPT with document, images, video, etc.

h2oGPT is an open-source platform that allows users to interact with local GPT models in a completely private environment. It supports a variety of document types, including PDFs, Word files, images, video frames, and even audio, enabling users to query and analyze their documents or engage in a private chat with AI. The platform is designed to be secure and offline, ensuring that all data remains private and under the user's control. h2oGPT supports several AI models, including oLLaMa and...

Downloads: 2 This Week

Last Update: 2025-02-22

See Project

Pruna AI

Pruna is a model optimization framework built for developers

Pruna is an open-source, self-hostable AI inference engine designed to help teams deploy and manage large language models (LLMs) efficiently across private or hybrid infrastructures. Built with performance and developer ergonomics in mind, Pruna simplifies inference workflows by enabling multi-model orchestration, autoscaling, GPU resource allocation, and compatibility with popular open-source models. It is ideal for companies or teams looking to reduce reliance on external APIs while...

Downloads: 0 This Week

Last Update: 2025-11-10

See Project

Curated Transformers

PyTorch library of curated Transformer models and their components

...It provides state-of-the-art models that are composed of a set of reusable components. Supports state-of-the-art transformer models, including LLMs such as Falcon, Llama, and Dolly v2. Implementing a feature or bugfix benefits all models. For example, all models support 4/8-bit inference through the bitsandbytes library and each model can use the PyTorch meta device to avoid unnecessary allocations and initialization.

Downloads: 0 This Week

Last Update: 2024-04-17

See Project

Agents-Flex

Agents-Flex is an elegant LLM Application Framework like LangChain

Agents-Flex includes a variety of network protocols for connecting LLMs, such as HTTP, SSE and WS. Its simple and flexible design allows developers to easily connect to various LLMs, including OpenAI, LLama, and other AI. Agents-Flex provides a rich set of development templates and Prompt Frameworks, including FEW-SHOT, CRISPE, BROKE, and ICIO. Developers can also customize their own unique prompt templates. Agents-Flex has a very flexible Function Calling component. It supports local method definitions, parsing, callbacks through LLMs, and executing local methods to obtain results. ...

Downloads: 1 This Week

Last Update: 2025-11-17

See Project

Speech-AI-Forge

Speech-AI-Forge is a project developed around TTS generation model

...It is model-agnostic and advertises support for a variety of TTS and speech models such as ChatTTS, CosyVoice, Fish-Speech, FireredTTS and others, as well as Whisper-based ASR, giving you a flexible playground for experimenting with different speech stacks. The project also integrates with general-purpose LLMs (for example GPT- or LLaMA-style models), which can be used to pre-process text, manage conversations.

Downloads: 2 This Week

Last Update: 2025-11-28

See Project

MCP Hub

An MCP client for Neovim that seamlessly integrates MCP servers

mcphub.nvim is an MCP (Model Context Protocol) client plugin for Neovim that seamlessly integrates MCP servers into your editing workflow with an intuitive interface for managing, testing, and using MCP servers with your favorite chat plugins. Create your first MCP capable agent you need only 6 lines of code. Works with any langchain-supported LLM that supports tool calling (OpenAI, Anthropic, Groq, LLama etc.) Explore MCP capabilities and generate starter code with the interactive code builder. An MCP client for Neovim that seamlessly integrates MCP servers into your editing workflow with an intuitive interface for managing, testing, and using MCP servers with your favorite chat plugins.

Downloads: 0 This Week

Last Update: 2025-08-15

See Project

CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

CogVLM2 is the second generation of the CogVLM vision-language model series, developed by ZhipuAI and released in 2024. Built on Meta-Llama-3-8B-Instruct, CogVLM2 significantly improves over its predecessor by providing stronger performance across multimodal benchmarks such as TextVQA, DocVQA, and ChartQA, while introducing extended context length support of up to 8K tokens and high-resolution image input up to 1344×1344. The series includes models for both image understanding and video understanding, with CogVLM2-Video supporting up to 1-minute videos by analyzing keyframes. ...

Downloads: 0 This Week

Last Update: 3 days ago

See Project

Groq AppGen

Project showcasing Llama 3.3 70B HTML codegen abilities

Groq AppGen is an interactive web application (built with Next.js and TypeScript) that uses Groq’s LLM API to generate or modify web application code based on natural-language prompts. Essentially, you tell the app what kind of web app or page you want (in plain English), and groq-appgen will produce HTML/JSX code scaffolding, layout, and optionally application logic accordingly. It supports iterative feedback: you can refine your prompt, adjust parameters or requirements, and have the app...

Downloads: 0 This Week

Last Update: 1 day ago

See Project

CSM (Conversational Speech Model)

A Conversational Speech Generation Model

The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.

Downloads: 8 This Week

Last Update: 2025-03-19

See Project

Search Results for "llama-cpp-python.whl" - Page 2

Showing 75 open source projects for "llama-cpp-python.whl"

bert4torch

Jan.ai

DeepSeek R1

Orpheus TTS

GPT4All

llamafile

LocalAI

SillyTavern

Axolotl

LazyLLM

Lepton AI

llama2-webui

SGLang

OpenLLM

BrowserAI

promptfoo

h2oGPT

Pruna AI

Curated Transformers

Agents-Flex

Speech-AI-Forge

MCP Hub

CogVLM2

Groq AppGen

CSM (Conversational Speech Model)

Search Results for "llama-cpp-python.whl" - Page 2

Showing 75 open source projects for "llama-cpp-python.whl"

bert4torch

Jan.ai

DeepSeek R1

Orpheus TTS

GPT4All

llamafile

LocalAI

SillyTavern

Axolotl

LazyLLM

Lepton AI

llama2-webui

SGLang

OpenLLM

BrowserAI

promptfoo

h2oGPT

Pruna AI

Curated Transformers

Agents-Flex

Speech-AI-Forge

MCP Hub

CogVLM2

Groq AppGen

CSM (Conversational Speech Model)

Related Searches

Related Categories