Replace OpenAI GPT with another LLM in your app
Single-cell analysis in Python
Operating LLMs in production
A library for accelerating Transformer models on NVIDIA GPUs
C#/.NET bindings for llama.cpp, including LLaMA/GPT model inference
A set of Docker images for training and serving models in TensorFlow
Create HTML profiling reports from pandas DataFrame objects
A Pythonic framework to simplify AI service building
Optimizing inference proxy for LLMs
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
MNN is a blazing fast, lightweight deep learning framework
Fast inference engine for Transformer models
An Open-Source Programming Framework for Agentic AI
20+ high-performance LLMs with recipes to pretrain and finetune at scale
State-of-the-art Parameter-Efficient Fine-Tuning
Official inference library for Mistral models
Superduper: Integrate AI models and machine learning workflows
PArallel Distributed Deep LEarning: Machine Learning Framework
A scalable inference server for models optimized with OpenVINO
AICI: Prompts as (Wasm) Programs
An RWKV management and startup tool: fully automated, only 8 MB
State-of-the-art diffusion models for image and audio generation
Private Open AI on Kubernetes
Phi-3.5 for Mac: Locally-run Vision and Language Models
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction