A scalable inference server for models optimized with OpenVINO
Port of OpenAI's Whisper model in C/C++
Deep Learning API and server in C++14 with support for Caffe and PyTorch
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Run local LLMs such as Llama, DeepSeek, and Kokoro inside your browser
A Pythonic framework to simplify AI service building
Library for serving Transformers models on Amazon SageMaker
Open-source AI camera that empowers any camera/CCTV
The easiest and laziest way to build multi-agent LLM applications
State-of-the-art diffusion models for image and audio generation
LLM training code for MosaicML foundation models
A general-purpose probabilistic programming system
A high-performance ML model-serving framework offering dynamic batching
GPU environment management and cluster orchestration
Protect and discover secrets using Gitleaks
A unified framework for scalable computing
MNN is a blazing fast, lightweight deep learning framework
Large Language Model Text Generation Inference
Fast inference engine for Transformer models
Powering Amazon custom machine learning chips
OpenAI-style API for open large language models
Standardized Serverless ML Inference Platform on Kubernetes
A library for accelerating Transformer models on NVIDIA GPUs
Lightweight Python library for adding real-time multi-object tracking
PyTorch library of curated Transformer models and their components
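Several of the servers listed above expose an OpenAI-compatible completions endpoint. A minimal sketch of building such a request body follows; the endpoint path and model name are assumptions for illustration, not tied to any one project:

```python
import json

def build_chat_request(model: str, prompt: str, temperature: float = 0.7) -> str:
    """Return the JSON body for a POST to an OpenAI-style
    /v1/chat/completions endpoint (path is an assumption)."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    })

# Build a request for a hypothetical locally served model.
body = build_chat_request("local-llm", "Hello!")
print(json.loads(body)["model"])
```

Sending the body (e.g. with `urllib.request` or `requests`) is left out, since the host and port depend on which of the servers above is running.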