local server free download

Showing 44 open source projects for "local server"

Filters: Artificial Intelligence, Python

1

Local-NotebookLM

Google's NotebookLM, but local

...It supports multiple LLM providers, including local and hosted options, and can be used through a command-line workflow, API server, Docker setup, or web UI. Overall, it is useful for students, researchers, and knowledge workers who want private, customizable audio summaries from documents.

Downloads: 8 This Week

Last Update: 5 days ago
See Project
2

MCP Server DuckDB

A Model Context Protocol (MCP) server implementation for DuckDB

An MCP server implementation for DuckDB, providing database interaction capabilities through MCP tools, allowing operations like querying, table creation, and schema inspection.

Downloads: 0 This Week

Last Update: 2025-05-05
See Project
3

WhatsApp MCP Server

WhatsApp MCP server enabling AI access to chats and messaging

whatsapp-mcp is an open source Model Context Protocol (MCP) server that enables AI agents to interact directly with a user’s WhatsApp account through a structured interface. It acts as a bridge between WhatsApp and large language models, allowing controlled access to messages, chats, and contacts. whatsapp-mcp is composed of two main components: a Go-based bridge that connects to the WhatsApp Web API and stores data locally, and a Python-based MCP server that exposes tools for AI interaction. ...

Downloads: 0 This Week

Last Update: 2026-03-17
See Project
4

Chatterbox TTS Server

Self-host the powerful Chatterbox TTS model

Chatterbox-TTS-Server is a self-hosted server for running the Chatterbox text-to-speech model through both a web interface and API endpoints. It is designed for users who want local or private speech generation without depending entirely on a hosted voice platform. The project supports predefined voices, voice cloning, and longer text workflows, making it useful for audiobooks, narration, content tools, and assistant-style applications.

Downloads: 7 This Week

Last Update: 2026-06-08
See Project
5

Lemonade

Lemonade helps users run local LLMs with the highest performance

Lemonade is a local LLM runtime that aims to deliver the highest possible performance on your own hardware by auto-configuring state-of-the-art inference engines for both NPUs and GPUs. The project positions itself as a “local LLM server” you can run on laptops and workstations, abstracting away backend differences while giving you a single place to serve and manage models.

Downloads: 6 This Week

Last Update: 3 days ago
See Project
6

OpenJarvis

Personal AI, On Personal Devices

OpenJarvis is an open-source framework designed to build personal AI agents that run primarily on local devices rather than relying on cloud infrastructure. Developed as part of the Intelligence Per Watt research initiative, it focuses on improving the efficiency and practicality of on-device AI systems. The framework provides shared primitives for building local-first agents, along with evaluation tools that measure performance using metrics such as energy consumption, latency, cost, and...

Downloads: 48 This Week

Last Update: 2026-05-25
See Project
7

web-eval-agent MCP Server

An MCP server that autonomously evaluates web applications

...Marketing and README material emphasize supercharging local debugging loops by combining live browser execution with LLM-driven hypotheses and fixes. Activity on the repo shows steady iteration, with issues and PRs centered on reliability and developer experience. In short, it wraps autonomous, in-editor web testing and diagnosis behind a predictable MCP interface.

Downloads: 0 This Week

Last Update: 2025-11-22
See Project
8

mcpo

A simple, secure MCP-to-OpenAPI proxy server

mcpo is a minimal bridge that exposes any MCP tool as an OpenAPI-compatible HTTP server. Instead of writing glue code, you point mcpo at an MCP server command and it generates REST endpoints and an OpenAPI spec that other systems (or LLM agent frameworks) can call immediately. This design lets you reuse a growing library of MCP servers with platforms that only understand HTTP+OpenAPI, unifying tool access across ecosystems.

Downloads: 0 This Week

Last Update: 2026-02-27
See Project
9

Colab-MCP

An MCP server for interacting with Google Colab

...This approach bridges the gap between local AI agents and remote high-performance compute environments, allowing users to offload heavy workloads such as machine learning training, data analysis, and dependency-heavy tasks to Colab’s GPU and TPU resources. By exposing Colab as an MCP server, the tool enables seamless integration with a wide range of AI assistants and agent frameworks, creating a standardized interface for tool use and execution.

Downloads: 0 This Week

Last Update: 2026-03-27
See Project
10

WhisperLive

A nearly-live implementation of OpenAI's Whisper

WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...

Downloads: 31 This Week

Last Update: 2026-06-02
See Project
11

MiniMax-MCP

Official MiniMax Model Context Protocol (MCP) server

...The server is written in Python and distributed under the MIT license, with a pyproject.toml and uv-based workflow that makes installation and execution reproducible. Configuration is handled through JSON files that tell MCP clients how to launch the server (typically via uvx minimax-mcp) and which environment variables to use for the API key, host, and output directory.

Downloads: 1 This Week

Last Update: 2026-05-21
See Project
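
The MiniMax-MCP entry above describes JSON configuration that tells an MCP client how to launch the server. A minimal sketch of such a file, built in Python, might look like the following. The `mcpServers` layout follows the convention used by several MCP clients; the environment variable names, the `MiniMax` label, and the placeholder values are assumptions for illustration and should be checked against the project's own README.

```python
import json


def build_mcp_client_config(api_key: str, api_host: str, output_dir: str) -> dict:
    """Return a client-side MCP config that launches the server via uvx.

    The env variable names below are assumed, not confirmed by the listing.
    """
    return {
        "mcpServers": {
            "MiniMax": {  # hypothetical label shown in the client UI
                "command": "uvx",
                "args": ["minimax-mcp"],
                "env": {
                    "MINIMAX_API_KEY": api_key,          # assumed name
                    "MINIMAX_API_HOST": api_host,        # assumed name
                    "MINIMAX_MCP_BASE_PATH": output_dir,  # assumed name
                },
            }
        }
    }


config = build_mcp_client_config(
    api_key="<your-api-key>",
    api_host="<api-host-url>",
    output_dir="/tmp/minimax-output",
)
print(json.dumps(config, indent=2))
```

The printed JSON would typically be pasted into the MCP client's configuration file; where that file lives depends on the client.
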
12

Reader 3

Quick illustration of how one can easily read books together with LLMs

This project is a minimalist, self-hosted EPUB reader designed to help users browse and read EPUB books one chapter at a time through a lightweight local server, making it especially easy to extract or work with chapters in external tools like large language models. It was created primarily as a simple demonstration of how to combine local book reading with LLM workflows without heavy dependencies or complicated setup, and it runs with just a small Python script and a basic HTTP server.

Downloads: 1 This Week

Last Update: 2026-02-05
See Project
13

ChatGLM3

ChatGLM3 series: Open Bilingual Chat LLMs

...It keeps the series’ smooth dialog and low deployment cost while adding native tool use (function calling), a built-in code interpreter, and agent-style workflows. The family includes base and long-context variants (8K/32K/128K). The repo ships Python APIs, CLI and web demos (Gradio/Streamlit), an OpenAI-format API server, and a compact fine-tuning kit. Quantization (4/8-bit), CPU/MPS support, and accelerator backends (TensorRT-LLM, OpenVINO, chatglm.cpp) enable lightweight local or edge deployment.

Downloads: 2 This Week

Last Update: 6 days ago
See Project
14

Pocket TTS

A TTS that fits in your CPU (and pocket)

...Because it is CPU-oriented, it fits well in server environments where GPU access is limited, in desktop apps, or in edge deployments where simplicity matters more than maximum throughput. It also emphasizes developer ergonomics, providing a straightforward API surface that can be integrated into pipelines, assistants, accessibility tools, or batch generation scripts.

Downloads: 9 This Week

Last Update: 2026-05-04
See Project
15

ChatTTS webUI & API

A simple native web interface that uses ChatTTS to synthesize text

ChatTTS-ui is a local web interface and API wrapper around the ChatTTS speech synthesis system, designed to make advanced TTS models easy to use from a browser. It runs a small backend server (Python + Torch + ffmpeg) and exposes a simple webpage where you can type text, adjust parameters, and generate audio. The project supports Chinese, English, and mixed text with digits and control symbols, making it suitable for bilingual content and numerically heavy text like announcements or prompts. ...

Downloads: 21 This Week

Last Update: 2026-06-14
See Project
16

LMCache

Supercharge Your LLM with the Fastest KV Cache Layer

LMCache is an extension layer for LLM serving engines that accelerates inference, especially with long contexts, by storing and reusing key-value (KV) attention caches across requests. Instead of rebuilding KV states for repeated or shared text segments, LMCache persists and retrieves them from multiple tiers—GPU memory, CPU DRAM, and local disk—then injects them into subsequent requests to reduce TTFT and increase throughput. Its design supports reuse beyond strict prefix matching and enables sharing across serving instances, improving efficiency under real multi-tenant traffic. The broader project includes examples, tests, a server component, and public posts describing cross-engine sharing and inter-GPU KV transfers. ...

Downloads: 0 This Week

Last Update: 2 days ago
See Project
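
The LMCache entry above describes looking up cached KV states across several storage tiers instead of recomputing them. The toy sketch below illustrates only that lookup idea; it is not LMCache's API. The class and function names, the chunk size, and the string placeholders standing in for KV tensors are all hypothetical, and the sketch ignores the fact that real KV states depend on position and preceding context.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

CHUNK = 4  # tokens per cached chunk (illustrative value)


def chunk_key(tokens: List[int]) -> str:
    """Content hash used as the cache key for one chunk of token ids."""
    return hashlib.sha256(repr(tokens).encode()).hexdigest()


class TieredKVStore:
    """Toy store with tiers searched from fastest to slowest."""

    def __init__(self) -> None:
        self.tiers: Dict[str, Dict[str, str]] = {"gpu": {}, "cpu": {}, "disk": {}}

    def put(self, tier: str, tokens: List[int], kv: str) -> None:
        self.tiers[tier][chunk_key(tokens)] = kv

    def get(self, tokens: List[int]) -> Optional[Tuple[str, str]]:
        key = chunk_key(tokens)
        for name, store in self.tiers.items():  # dicts keep insertion order
            if key in store:
                return name, store[key]
        return None


def plan_prefill(store: TieredKVStore, prompt: List[int]) -> List[Tuple[List[int], Optional[str]]]:
    """For each chunk, report the tier holding its KV state, or None if it must be computed."""
    plan = []
    for i in range(0, len(prompt), CHUNK):
        chunk = prompt[i:i + CHUNK]
        hit = store.get(chunk)
        plan.append((chunk, hit[0] if hit else None))
    return plan


store = TieredKVStore()
store.put("cpu", [1, 2, 3, 4], "kv-placeholder")
plan = plan_prefill(store, [9, 9, 9, 9, 1, 2, 3, 4])
print(plan)
```

In this run the first chunk misses every tier and would need a normal prefill, while the second chunk is found in the CPU tier even though it is not at the start of the prompt.
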
17

LazyLLM

The easiest and laziest way to build multi-agent LLM applications

LazyLLM is an optimized, lightweight LLM server designed for easy and fast deployment of large language models. It is fully compatible with the OpenAI API specification, enabling developers to integrate their own models into applications that normally rely on OpenAI’s endpoints. LazyLLM emphasizes low resource usage and fast inference while supporting multiple models.

Downloads: 0 This Week

Last Update: 1 day ago
See Project
18

gemini-web2api

Convert Google Gemini web into OpenAI-compatible API

...It is designed to let OpenAI-style clients connect to Gemini-like models through routes such as chat completions, models, responses, and native Gemini-compatible endpoints. The project can run as a simple local server and uses a mostly single-file design with an optional dependency for streaming. It supports model aliases for Flash, Thinking, Pro-style routing, Auto, and Lite variants. The tool also includes optional API keys, function calling, SSE streaming, web search access, Docker deployment, and client examples for OpenAI SDK-style usage. ...

Downloads: 1 This Week

Last Update: 2026-06-27
See Project
19

stt

Voice Recognition to Text Tool

...It leverages open-source speech models such as Faster-Whisper to recognize and transcribe human speech into plain text, structured JSON objects, or subtitle files with time codes, making it suitable for both personal and professional transcription tasks. The project is designed to be easy to deploy: you can run a local Python server that exposes an HTTP API for uploading audio/video files and retrieving transcriptions in different formats. It supports GPU acceleration if available, enabling faster processing on compatible hardware but still offers reliable performance on CPUs alone.

Downloads: 0 This Week

Last Update: 2026-02-17
See Project
20

OpenVoice

Instant voice cloning by MIT and MyShell. Audio foundation model

OpenVoice is a versatile instant voice cloning system that can replicate a speaker’s tone color from just a short audio clip and then generate speech in multiple languages. It is designed not only to match the timbre of the reference voice, but also to give granular control over style parameters such as emotion, accent, rhythm, pauses, and intonation. The model supports cross-lingual and even zero-shot cross-lingual voice cloning, so a speaker recorded in one language can be made to speak...

Downloads: 29 This Week

Last Update: 2025-11-28
See Project
21

OpenAI-Compatible Edge-TTS API

Free, high-quality text-to-speech API endpoint to replace OpenAI

...Because it relies on Edge’s TTS, the audio generation itself is free, and the project essentially acts as a smart proxy that handles formatting and streaming. The server supports Server-Sent Events (SSE) for streaming audio, enabling low-latency playback in chat UIs and other interactive tools. A Docker image is provided for one-command deployment, and environment variables can be used to configure default voice, language, response format, authentication, and logging options.

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
22

Preswald

Python tool for browser-based interactive data apps in one file

Preswald is an open source Python-based framework and static-site generator designed for building interactive data applications that run entirely in the browser. It packages application logic, data processing, and user interface components into a single self-contained output, enabling easy sharing and deployment without requiring local dependencies. Preswald leverages a WebAssembly runtime along with technologies like Pyodide and DuckDB to execute Python code directly in the browser...

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
23

SeaGOAT

local-first semantic code search engine

SeaGOAT is an open-source semantic code search engine designed to help developers explore and understand large codebases more efficiently. Instead of relying solely on traditional keyword search, it uses vector embeddings to represent the meaning of code and queries, allowing users to perform semantic searches that find relevant code even when the exact keywords are not present. The tool runs locally on a developer’s machine and processes repositories using a combination of embedding models...

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
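
The SeaGOAT entry above describes ranking code by embedding similarity rather than keyword overlap. A minimal sketch of that ranking step, assuming cosine similarity as the metric, is shown below. The three-dimensional vectors are hand-written stand-ins for real embeddings, and the snippet names are hypothetical; SeaGOAT's actual models and index format are not shown in the listing.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec: List[float], index: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Return (snippet, score) pairs sorted from most to least similar."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical snippets with made-up embeddings.
INDEX = {
    "auth/login.py:check_password": [0.9, 0.1, 0.0],
    "db/pool.py:acquire": [0.1, 0.9, 0.1],
    "ui/theme.py:load_colors": [0.0, 0.1, 0.9],
}

# Pretend embedding of the query "verify user credentials".
query = [0.8, 0.2, 0.1]
results = rank(query, INDEX)
print(results[0][0])
```

The query shares no keywords with `check_password`, yet that snippet ranks first because its vector points in nearly the same direction as the query's.
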
24

FastKoko

Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple...

Downloads: 0 This Week

Last Update: 2026-06-06
See Project
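
The FastKoko entry above says that code written for the OpenAI audio API can often be pointed at a self-hosted instance. The sketch below builds such a request with the standard library, without sending it. The `/v1/audio/speech` path and the `model`, `input`, `voice`, and `response_format` fields follow the OpenAI speech API; the base URL, port, model name, and voice name are assumptions and depend on how the server is deployed.

```python
import json
import urllib.request

# Assumed local address; adjust to wherever the server is listening.
BASE_URL = "http://localhost:8880"


def build_speech_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request in the OpenAI speech format."""
    payload = {
        "model": "kokoro",       # assumed model identifier
        "input": text,
        "voice": "af_bella",     # assumed voice name
        "response_format": "mp3",
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_speech_request("hello")
print(req.get_method(), req.full_url)
```

With a server running, passing `req` to `urllib.request.urlopen` and writing the response bytes to a file would be the usual next step; only the base URL changes when switching between a hosted and a self-hosted endpoint.
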
25

TokenSpeed

TokenSpeed is a speed-of-light LLM inference engine

...It builds on ideas and components from the broader open-source inference ecosystem while presenting its own execution stack. TokenSpeed is useful for developers building local or server-side LLM infrastructure for agents, coding systems, and high-volume AI applications. Its main value is providing an inference layer optimized for fast token generation under practical agent workloads.

Downloads: 5 This Week

Last Update: 2 days ago
See Project