OpenAI-style API for open large language models
Self-hosted, community-driven, local OpenAI-compatible API
Low-latency REST API for serving text embeddings
Port of OpenAI's Whisper model in C/C++
Simplifies the local serving of AI models from any source
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
Unified Model Serving Framework