local free download - SourceForge

llmfit

157 models, 30 providers, one command to find what runs on your hardware

...By presenting clear performance estimates and compatibility guidance, the project reduces the trial-and-error typically involved in local LLM experimentation. Overall, llmfit serves as a practical decision assistant for developers who want to run language models efficiently on their own machines.

Downloads: 19 This Week

Last Update: 18 hours ago


Extractous

Fast and efficient unstructured data extraction

Extractous is a Rust-based unstructured data extraction library focused on fast local parsing of documents and other content-heavy files. Its purpose is to extract text and metadata efficiently from formats such as PDF, Word, HTML, email archives, images, and more, without depending on external APIs or separate parsing servers. The project emphasizes performance and low memory usage, and its maintainers describe it as a local-first alternative to heavier extraction stacks. ...

Downloads: 0 This Week

Last Update: 2026-03-06


Floneum

Instant, controllable, local pre-trained AI models in Rust

...Many plugins can be written in different programming languages and compiled to WebAssembly modules, allowing them to run safely within the system. The platform is implemented primarily in Rust and emphasizes performance, modularity, and local execution.

Downloads: 0 This Week

Last Update: 2026-03-06


webclaw

Fast, local-first web content extraction for LLMs

...Webclaw also provides advanced capabilities such as recursive crawling, structured JSON extraction, summarization, and content comparison, making it suitable for research and data pipelines. Its local-first architecture ensures privacy and eliminates the need for API keys.

Downloads: 0 This Week

Last Update: 7 days ago


rtk

CLI proxy that reduces LLM token consumption

rtk is an open-source command-line proxy designed to optimize interactions between AI coding agents and the terminal by reducing unnecessary token consumption. When AI assistants execute shell commands during software development tasks, the resulting terminal output often contains large amounts of repetitive or irrelevant information that can overwhelm the model’s context window. rtk intercepts these command outputs and compresses them into concise summaries before sending them to the...

Downloads: 30 This Week

Last Update: 6 days ago


mistral.rs

Fast, flexible LLM inference

mistral.rs is a fast and flexible LLM inference engine implemented in Rust, designed to run and serve modern language models with an emphasis on performance and practical deployment. It provides multiple entry points for developers, including a CLI for running models locally and an HTTP server that exposes an OpenAI-compatible API surface for easy integration with existing clients. The project includes hardware-aware tooling that can benchmark a system and choose sensible quantization and...

Downloads: 2 This Week

Last Update: 2026-06-25


Paddler

Open-source LLM load balancer and serving platform for hosting LLMs

Paddler is an open-source LLM infrastructure platform designed to deploy, manage, and scale large language models on private infrastructure. The system acts as a specialized load balancer and serving layer for language models, enabling organizations to run inference workloads without relying on external API providers. It supports running models locally through engines such as llama.cpp while distributing requests across multiple compute nodes to improve performance and reliability. The...

Downloads: 1 This Week

Last Update: 2026-06-11


uzu

A high-performance inference engine for AI models

...The engine implements a hybrid architecture in which model layers can be executed either as custom GPU kernels or through Apple’s MPSGraph API, allowing it to balance performance and compatibility depending on the workload. By utilizing Apple’s unified memory architecture, uzu reduces memory copying overhead and improves inference throughput for local AI workloads. The system includes a simple high-level API that enables developers to run models, create inference sessions, and generate outputs with minimal configuration.

Downloads: 0 This Week

Last Update: 2026-06-08


Search Results for "local"

Showing 8 open source projects for "local"

