The Triton Inference Server provides an optimized cloud and edge inferencing solution
OpenAI-style API for open large language models
The easiest and laziest way to build multi-agent LLM applications
Deep Learning API and server in C++14 with support for Caffe and PyTorch
Low-latency REST API for serving text embeddings
Run local LLMs such as Llama, DeepSeek, and Kokoro inside your browser
Large Language Model Text Generation Inference
Open-Source and Lightweight Local LLM Platform
LLM Chatbot Assistant for the Openfire server
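Several of the servers listed above expose an OpenAI-compatible chat completions endpoint, so a single client snippet works across them. The sketch below builds such a request with only the standard library; the base URL, port, and model id are illustrative assumptions, not values from any specific project above.

```python
import json
import urllib.request

# Hypothetical local endpoint; the real host, port, and path prefix
# depend on which server you run (many default to /v1/chat/completions).
URL = "http://localhost:8000/v1/chat/completions"

# Minimal OpenAI-style chat completion payload. "llama-3" is an
# illustrative model id; use whatever model your server has loaded.
payload = {
    "model": "llama-3",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(req) would send the request; it is omitted here
# because no server is assumed to be running.
print(req.get_method(), req.get_full_url())
```

Because the wire format is shared, swapping between these backends is usually just a change of base URL and model name.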