PArallel Distributed Deep LEarning: Machine Learning Framework
Uncover insights, surface problems, monitor, and fine-tune your LLM
Training and deploying machine learning models on Amazon SageMaker
Fast inference engine for Transformer models
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
Replace OpenAI GPT with another LLM in your app
LLM.swift is a simple and readable library
Connect home devices into a powerful cluster to accelerate LLM inference
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
Visual Instruction Tuning: Large Language-and-Vision Assistant
FlashInfer: Kernel Library for LLM Serving
Optimizing inference proxy for LLMs
Easiest and laziest way to build multi-agent LLM applications
Official inference library for Mistral models
On-device AI across mobile, embedded and edge for PyTorch
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
Operating LLMs in production
Uplift modeling and causal inference with machine learning algorithms
Unified Model Serving Framework
Standardized Serverless ML Inference Platform on Kubernetes
A lightweight vision library for large-scale object detection
Deep Learning API and Server in C++14 with support for Caffe and PyTorch
Deep learning optimization library: makes distributed training easy
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
AI interface for tinkerers (Ollama, Haystack RAG, Python)