A Pythonic framework to simplify AI service building
An open-source tool designed to improve workload efficiency
Open-source tool to run local LLMs on any device
Go from images to inference with no manual labeling
The Triton Inference Server provides an optimized cloud and edge inferencing solution
The easiest and laziest way to build multi-agent LLM applications
Unified Model Serving Framework
Operating LLMs in production
Sparsity-aware deep learning inference runtime for CPUs
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
An MLOps framework to package, deploy, monitor and manage models
GPU environment management and cluster orchestration
Standardized Serverless ML Inference Platform on Kubernetes
A high-performance ML model serving framework that offers dynamic batching (see the batching sketch after this list)
A toolkit to optimize Keras & TensorFlow ML models for deployment
Replace OpenAI GPT with another LLM in your app (see the client sketch after this list)
A computer vision framework to create and deploy apps in minutes
Training and implementation of chatbots built on a GPT-like architecture
Deploy an ML inference service on a budget in 10 lines of code (a minimal sketch follows below)
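Dynamic batching, mentioned in the serving-framework entry above, groups requests that arrive within a short time window into a single model call to improve GPU/CPU throughput. Below is a minimal sketch of the idea in plain Python; the `Batcher` class, `model_fn`, and the timing parameters are illustrative stand-ins, not any listed framework's actual API:

```python
import queue
import threading
import time

def model_fn(batch):
    # Stand-in for a real model: squares each input.
    return [x * x for x in batch]

class Batcher:
    """Illustrative dynamic batcher: collect requests for up to
    max_wait seconds (or max_batch items), then run one model call."""

    def __init__(self, max_batch=8, max_wait=0.01):
        self.max_batch = max_batch
        self.max_wait = max_wait
        self.requests = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def submit(self, x):
        # Each request carries an Event so the caller can block on its result.
        slot = {"input": x, "output": None, "done": threading.Event()}
        self.requests.put(slot)
        slot["done"].wait()
        return slot["output"]

    def _loop(self):
        while True:
            batch = [self.requests.get()]  # block until a request arrives
            deadline = time.monotonic() + self.max_wait
            while len(batch) < self.max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self.requests.get(timeout=remaining))
                except queue.Empty:
                    break
            outputs = model_fn([s["input"] for s in batch])
            for s, out in zip(batch, outputs):
                s["output"] = out
                s["done"].set()

if __name__ == "__main__":
    batcher = Batcher()
    workers = [threading.Thread(target=lambda i=i: print(batcher.submit(i)))
               for i in range(5)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

Real serving frameworks layer scheduling, padding, and GPU pipelines on top of this core loop, but the latency/throughput trade-off (`max_wait` vs. `max_batch`) is the same.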
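Tools that "replace OpenAI GPT with another LLM" typically do so by exposing an OpenAI-compatible HTTP API, so swapping backends reduces to changing the client's base URL. A sketch using the official `openai` Python client; the URL and model name are placeholders for whatever local server you run:

```python
from openai import OpenAI

# Point the standard OpenAI client at a local OpenAI-compatible server
# instead of api.openai.com. URL, key, and model name are placeholders.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="my-local-llm",  # whichever model the local server exposes
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```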
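To make the "10 lines of code" claim concrete, here is roughly what a minimal inference service looks like. FastAPI and the toy `predict` function are illustrative choices, not the tool's actual API:

```python
from fastapi import FastAPI

app = FastAPI()

def predict(x: float) -> float:
    return 2 * x + 1  # stand-in for a real model

@app.get("/predict")
def serve(x: float):
    return {"prediction": predict(x)}

# Run with: uvicorn main:app  (assuming this file is named main.py)
```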