Port of OpenAI's Whisper model in C/C++
Training and deploying machine learning models on Amazon SageMaker
Port of Facebook's LLaMA model in C/C++
Run Local LLMs on Any Device. Open-source and available for commercial use
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
A high-throughput and memory-efficient inference and serving engine for LLMs
Phi-3.5 for Mac: Locally-run Vision and Language Models
Easy-to-use deep learning framework with 3 key features
An MLOps framework to package, deploy, monitor and manage models
Sparsity-aware deep learning inference runtime for CPUs
OpenMLDB is an open-source machine learning database
FlashInfer: Kernel Library for LLM Serving
Single-cell analysis in Python
Superduper: Integrate AI models and machine learning workflows with your database
Operating LLMs in production
PyTorch library of curated Transformer models and their components
Uncover insights, surface problems, monitor, and fine-tune your LLM
OpenMMLab Model Deployment Framework
DoWhy is a Python library for causal inference
Large Language Model Text Generation Inference
Unified Model Serving Framework
A Pythonic framework to simplify AI service building
Integrate, train and manage any AI models and APIs with your database