Deep Learning API and Server in C++14 with support for Caffe and PyTorch
Run local LLMs like Llama, DeepSeek, Kokoro, etc. inside your browser
Standardized Serverless ML Inference Platform on Kubernetes
OpenAI-style API for open large language models
Visual Instruction Tuning: Large Language-and-Vision Assistant
CPU/GPU inference server for Hugging Face transformer models
Deploy an ML inference service on a budget in 10 lines of code