Operating LLMs in production
Run Local LLMs on Any Device. Open-source
A RWKV management and startup tool, fully automated, only 8 MB
Serving system for machine learning models
A scalable inference server for models optimized with OpenVINO
The official Python client for the Hugging Face Hub
A Pythonic framework to simplify AI service building
Private Open AI on Kubernetes
The Triton Inference Server provides an optimized cloud
Prem provides a unified environment to develop AI applications