OpenAI-style API for open large language models
Run local LLMs on any device. Open source
Port of OpenAI's Whisper model in C/C++
Low-latency REST API for serving text-embeddings
The Triton Inference Server provides an optimized cloud and edge inferencing solution
Unofficial Go (Golang) bindings for the Hugging Face Inference API
The easiest and laziest way to build multi-agent LLM applications
Optimizing inference proxy for LLMs
The free, Open Source alternative to OpenAI, Claude and others
User-friendly AI Interface
A library for accelerating Transformer models on NVIDIA GPUs
Private OpenAI on Kubernetes
Unofficial Python package that returns the response of Google Bard
Large Language Model Text Generation Inference
Replace OpenAI GPT with another LLM in your app
Deep Learning API and Server in C++14, with support for Caffe and PyTorch
Operating LLMs in production
Simplifies the local serving of AI models from any source
Bring the notion of Model-as-a-Service to life
An RWKV management and startup tool; fully automated, only 8 MB
Unified Model Serving Framework
Run local LLMs such as Llama, DeepSeek, and Kokoro inside your browser
Data manipulation and transformation for audio signal processing
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
Library for OCR-related tasks powered by Deep Learning