Unofficial Go (Golang) bindings for the Hugging Face Inference API
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
FlashInfer: Kernel Library for LLM Serving
Everything you need to build state-of-the-art foundation models
Private Open AI on Kubernetes
An RWKV management and startup tool, fully automated, only 8 MB
Uncover insights, surface problems, monitor, and fine-tune your LLM
Unified Model Serving Framework
A set of Docker images for training and serving models in TensorFlow
Build Production-ready Agentic Workflows with Natural Language
A lightweight vision library for performing large object detection
Single-cell analysis in Python
Bayesian inference with probabilistic programming
Trainable models and NN optimization tools
A unified framework for scalable computing
Easiest and laziest way to build multi-agent LLM applications
Bolt is a high-performance deep learning library
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
Replace OpenAI GPT with another LLM in your app
Framework that is dedicated to making neural data processing
MII makes low-latency and high-throughput inference possible
Serving system for machine learning models
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Optimizing inference proxy for LLMs
Neural Network Compression Framework for enhanced OpenVINO