Open-source tool designed to enhance the efficiency of workloads
A Pythonic framework to simplify AI service building
The Triton Inference Server provides an optimized cloud and edge inferencing solution (see the client sketch after this list)
Run local LLMs on any device; open-source and available for commercial use
GPU environment management and cluster orchestration
The easiest and laziest way to build multi-agent LLM applications
Operating LLMs in production
Sparsity-aware deep learning inference runtime for CPUs
A high-performance ML model serving framework that offers dynamic batching
Standardized Serverless ML Inference Platform on Kubernetes
Unified Model Serving Framework
Replace OpenAI GPT with another LLM in your app
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods (see the LoRA sketch after this list)
An MLOps framework to package, deploy, monitor and manage models
Images to inference with no labeling (use foundation models to train supervised models)
A toolkit to optimize Keras & TensorFlow models for deployment, including quantization and pruning
A computer vision framework to create and deploy apps in minutes
Training & implementation of chatbots leveraging a GPT-like architecture
Deploy an ML inference service on a budget in 10 lines of code (see the sketch below)
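
The Triton entry above refers to a server with an HTTP/gRPC inference API. As a minimal sketch, here is how a client might query it over HTTP with the `tritonclient` package; the model name and tensor names (`my_model`, `input__0`, `output__0`) are placeholders for whatever the server actually hosts, not values from the list above.

```python
# Minimal sketch: query a Triton Inference Server over HTTP.
# Assumes a server at localhost:8000 hosting a hypothetical model
# "my_model" with one FP32 input "input__0" and one output "output__0".
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the request: a single batch of four FP32 features.
data = np.random.rand(1, 4).astype(np.float32)
inputs = [httpclient.InferInput("input__0", data.shape, "FP32")]
inputs[0].set_data_from_numpy(data)
outputs = [httpclient.InferRequestedOutput("output__0")]

# Run inference and read the result back as a NumPy array.
result = client.infer(model_name="my_model", inputs=inputs, outputs=outputs)
print(result.as_numpy("output__0"))
```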
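For the fine-tuning entry, a minimal sketch of the PEFT side using Hugging Face `transformers` and `peft` with LoRA adapters; the model id, rank, and target modules below are illustrative choices, not values taken from those scripts (which additionally compose FSDP for multi-GPU training).

```python
# Minimal sketch: attach LoRA adapters to a causal LM for fine-tuning.
# Model id and hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed; gated, requires access approval
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Wrap the base model with low-rank adapters on the attention projections.
lora = LoraConfig(r=8, lora_alpha=16,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```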
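And for the budget-deployment entry, a sketch of what an inference service in roughly ten lines can look like; FastAPI and a scikit-learn model serialized to `model.joblib` are assumptions for illustration, not that entry's actual stack.

```python
# Minimal sketch: an ML inference service in ~10 lines.
# Run with: uvicorn service:app
import joblib
from fastapi import FastAPI

app = FastAPI()
model = joblib.load("model.joblib")  # assumed artifact; loaded once at startup

@app.post("/predict")
def predict(features: list[float]):
    # Run a single prediction and return it as JSON.
    return {"prediction": model.predict([features]).tolist()}
```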