Dependency free download

llama.cpp

Port of Facebook's LLaMA model in C/C++

The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.

1 Review

Downloads: 249 This Week

Last Update: 6 hours ago


LLMCompiler

An LLM Compiler for Parallel Function Calling

...LLMCompiler addresses this limitation by applying principles from classical compilers to analyze a task and construct an execution plan that allows multiple functions to run in parallel whenever possible. The framework builds a dependency graph of required operations, identifying which tasks must run sequentially and which can be executed simultaneously. Its architecture includes components such as a planning module that constructs the task graph, a task dispatcher that manages dependencies, and an executor that performs parallel calls.
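The planner, dispatcher, and executor roles described above can be illustrated with a small sketch. This is not LLMCompiler's actual API: `run_task_graph`, the graph layout, and the `search`/`combine` tools are hypothetical names chosen for illustration, and the graph is assumed to be acyclic.

```python
import asyncio


async def run_task_graph(tasks):
    """Run an acyclic task graph, overlapping tasks that do not depend on each other.

    `tasks` maps a task name to (async function, list of dependency names).
    """
    scheduled = {}

    async def run(name):
        func, deps = tasks[name]
        # Block only on this task's own dependencies.
        args = [await scheduled[dep] for dep in deps]
        return await func(*args)

    # Schedule everything up front; each task waits for its inputs on its own.
    for name in tasks:
        scheduled[name] = asyncio.create_task(run(name))
    values = await asyncio.gather(*scheduled.values())
    return dict(zip(scheduled, values))


async def search(query):
    await asyncio.sleep(0.01)  # stand-in for a slow tool call
    return f"result for {query}"


async def fetch_a():
    return await search("a")


async def fetch_b():
    return await search("b")


async def combine(a, b):
    return f"{a} + {b}"


def demo():
    # "a" and "b" are independent and overlap; "combined" waits for both.
    graph = {
        "a": (fetch_a, []),
        "b": (fetch_b, []),
        "combined": (combine, ["a", "b"]),
    }
    return asyncio.run(run_task_graph(graph))
```

Calling `demo()` returns a dictionary with one result per task. The two fetches sleep concurrently, so the whole graph takes roughly one sleep interval rather than two.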

Downloads: 0 This Week

Last Update: 2026-03-06


Torch Pruning

DepGraph: Towards Any Structural Pruning

...It introduces a graph-based algorithm called DepGraph that automatically identifies dependencies between layers, allowing parameters to be pruned safely across complex architectures. This dependency analysis makes it possible to prune large networks such as transformers, convolutional networks, and diffusion models without breaking the computational graph. Torch-Pruning physically removes parameters rather than masking them, which results in smaller and faster models during both training and inference. The toolkit supports a wide variety of architectures used in computer vision and large language models, making it a flexible solution for model compression tasks.
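To see why layer dependencies matter for structural pruning, consider two stacked linear layers stored as plain nested lists. The sketch below does not use the Torch-Pruning API; `prune_hidden_channel` and `mask_hidden_channel` are hypothetical helpers that only contrast physical removal with masking.

```python
def prune_hidden_channel(w1, w2, channel):
    """Physically remove one hidden channel from two stacked linear layers.

    w1 has shape [hidden][inputs]; w2 has shape [outputs][hidden].
    Dropping row `channel` of w1 forces dropping column `channel` of w2,
    otherwise the shapes of the two layers would no longer match.
    """
    new_w1 = [list(row) for i, row in enumerate(w1) if i != channel]
    new_w2 = [[v for j, v in enumerate(row) if j != channel] for row in w2]
    return new_w1, new_w2


def mask_hidden_channel(w1, channel):
    """Masking alternative: zero the row but keep the original shape."""
    return [
        [0.0] * len(row) if i == channel else list(row)
        for i, row in enumerate(w1)
    ]


w1 = [[1, 2], [3, 4], [5, 6]]   # 3 hidden units, 2 inputs
w2 = [[1, 2, 3], [4, 5, 6]]     # 2 outputs, 3 hidden units

pruned_w1, pruned_w2 = prune_hidden_channel(w1, w2, channel=1)
masked_w1 = mask_hidden_channel(w1, channel=1)
```

After pruning, both matrices are smaller (`pruned_w1` has two rows, `pruned_w2` has two columns), whereas the masked matrix keeps all three rows and therefore saves no memory or compute.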

Downloads: 0 This Week

Last Update: 2026-03-05


RWKV Runner

A RWKV management and startup tool, full automation, only 8MB

RWKV (pronounced "RwaKuv") is an RNN with GPT-level LLM performance that can also be trained directly like a GPT transformer (parallelizable). It combines the best of RNNs and transformers: strong performance, fast inference, fast training, low VRAM use, "infinite" context length, and free text embedding. It is also 100% attention-free. The default configs enable custom CUDA kernel acceleration, which is much faster and consumes much less VRAM. If you encounter possible compatibility...

Downloads: 8 This Week

Last Update: 2026-05-08


OpenAI API client for Kotlin

OpenAI API client for Kotlin with multiplatform capabilities

OpenAI API client for Kotlin with multiplatform and coroutines capabilities.

Downloads: 0 This Week

Last Update: 2026-02-07


OneFileLLM

Specify a github or local repo, github pull request

OneFileLLM is an open-source command-line tool that gathers content from sources such as a GitHub repository, a local folder, or a GitHub pull request and combines it into a single text file that is easy to paste into a large language model prompt. The goal is to remove the manual work of collecting and concatenating many files before asking an LLM about a codebase or document set.

Downloads: 4 This Week

Last Update: 2026-06-12


NativeMind Extension

Your fully private, open-source, on-device AI assistant

NativeMindExtension is an open-source browser extension that provides a private, on-device AI assistant designed to run without cloud dependencies. The project is built around a privacy-first model in which conversations, document analysis, translations, and writing assistance stay on the user’s device rather than being sent to external servers. It integrates with local model back ends such as Ollama and also supports WebLLM for quick in-browser trials, giving users a choice between stronger...

Downloads: 2 This Week

Last Update: 2026-03-23


opensrc

Fetch source code for npm packages

OpenSrc is an open-source utility developed by Vercel Labs that retrieves and exposes the source code of npm packages so that AI coding agents can better understand how external libraries work. When large language models generate code, they often rely only on type definitions or documentation, which can limit their understanding of how a library actually behaves. OpenSrc addresses this limitation by allowing agents to fetch the underlying source code of dependencies and analyze their...

Downloads: 0 This Week

Last Update: 2026-04-18


SeaGOAT

Local-first semantic code search engine

SeaGOAT is an open-source semantic code search engine designed to help developers explore and understand large codebases more efficiently. Instead of relying solely on traditional keyword search, it uses vector embeddings to represent the meaning of code and queries, allowing users to perform semantic searches that find relevant code even when the exact keywords are not present. The tool runs locally on a developer’s machine and processes repositories using a combination of embedding models...

Downloads: 0 This Week

Last Update: 2026-03-09


Cake

Distributed LLM and StableDiffusion inference

Cake is a Rust framework for distributed inference of large models such as LLMs and Stable Diffusion. It splits a model across multiple devices so that hardware that could not hold the full model on its own can serve it together over the network.

Downloads: 0 This Week

Last Update: 2026-04-24


llama2.c

Inference Llama 2 in one file of pure C

llama2.c is a minimalist implementation of the Llama 2 language model architecture designed to run entirely in pure C. Created by Andrej Karpathy, this project offers an educational and lightweight framework for performing inference on small Llama 2 models without external dependencies. It provides a full training and inference pipeline: models can be trained in PyTorch and later executed using a concise 700-line C program (run.c). While it can technically load Meta’s official Llama 2...

Downloads: 1 This Week

Last Update: 4 days ago


Secret Llama

Fully private LLM chatbot that runs entirely in the browser

Secret Llama is a privacy-first large-language-model chatbot that runs entirely inside your web browser, meaning no server is required and your conversation data never leaves your device. It focuses on open-source model support, letting you load families like Llama and Mistral directly in the client for fully local inference. Because everything happens in-browser, it can work offline once models are cached, which is helpful for air-gapped environments or travel. The interface mirrors the...

Downloads: 3 This Week

Last Update: 2025-11-07


local-llm

Run LLMs locally on Cloud Workstations

local-llm is a development framework that enables developers to run large language models locally within Google Cloud Workstations or standard environments without requiring GPU hardware. It focuses on making generative AI development more accessible by leveraging quantized models and CPU-based execution, eliminating the dependency on expensive GPU infrastructure. The repository includes tools, Docker configurations, and command-line utilities that simplify the process of downloading, running, and interacting with language models directly on local or cloud-based workstations. This approach improves data privacy and control, as all inference can be performed locally without sending sensitive information to external APIs. ...

Downloads: 5 This Week

Last Update: 2026-03-17


Search Results for "Dependency"

Showing 13 open source projects for "Dependency"

llama.cpp

LLMCompiler

Torch Pruning

RWKV Runner

OpenAI API client for Kotlin

OneFileLLM

NativeMind Extension

opensrc

SeaGOAT

Cake

llama2.c

Secret Llama

local-llm

