The Triton Inference Server provides an optimized cloud and edge inferencing solution
OpenAI-style API for open large language models
Low-latency REST API for serving text embeddings
A client implementation for ChatGPT and Bing AI
Deep learning API and server in C++14, with support for Caffe and PyTorch
Python binding to the Apache Tika™ REST services
Leading free and open-source face recognition system
Framework for intelligent service-based networks. Mobile-compatible.
Deploy an ML inference service on a budget in 10 lines of code
Fast Coreference Resolution in spaCy with Neural Networks