Run local LLMs on any device. Open source
Port of Facebook's LLaMA model in C/C++
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
An MLOps framework to package, deploy, monitor, and manage models
Fast inference engine for Transformer models
AI interface for tinkerers (Ollama, Haystack RAG, Python)
Optimizing inference proxy for LLMs
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
Trainable models and neural-network optimization tools
OpenMMLab Model Deployment Framework
A set of Docker images for training and serving models in TensorFlow
OpenMMLab Video Perception Toolbox