Port of Facebook's LLaMA model in C/C++
Run local LLMs on any device; open source
Fast inference engine for Transformer models
A set of Docker images for training and serving models in TensorFlow
An MLOps framework to package, deploy, monitor, and manage models
Optimizing inference proxy for LLMs
OpenMMLab Model Deployment Framework
Trainable models and neural-network optimization tools
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
Create HTML profiling reports from pandas DataFrame objects
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
AI interface for tinkerers (Ollama, Haystack RAG, Python)
OpenMMLab Video Perception Toolbox
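The entries above are one-line summaries, but the pandas profiling-report entry hints at a concrete pattern: computing per-column summary statistics over a tabular dataset. Below is a minimal, stdlib-only sketch of that idea; it is not the actual pandas-profiling API, and the `profile` helper, its stat names, and the sample rows are all illustrative assumptions.

```python
from collections import Counter

def profile(rows):
    """Toy per-column summary for a list-of-dicts table: count of
    present values, missing count, distinct count, and most common
    value. A real profiling report computes far more (histograms,
    correlations, an HTML rendering); this only shows the core idea."""
    columns = {key for row in rows for key in row}
    report = {}
    for col in sorted(columns):
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        report[col] = {
            "count": len(present),
            "missing": len(values) - len(present),
            "unique": len(set(present)),
            "top": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report

rows = [
    {"city": "Oslo", "temp": 3},
    {"city": "Oslo", "temp": 5},
    {"city": "Bergen"},  # temp missing in this row
]
print(profile(rows))
```

A real library would render these aggregates as an interactive HTML report; the sketch stops at the aggregation step, which is the part the tagline describes.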