Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
A set of Docker images for training and serving models in TensorFlow
Fast inference engine for Transformer models
Operating LLMs in production
Trainable models and NN optimization tools
Official inference library for Mistral models
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
Swift async text-to-image for SwiftUI apps using OpenAI
AIMET is a library that provides advanced quantization and compression
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Deep learning optimization library: makes distributed training easy
DoWhy is a Python library for causal inference
Superduper: Integrate AI models and machine learning workflows
MII makes low-latency and high-throughput inference possible
Unofficial Python package that returns responses from Google Bard
Lightweight inference library for ONNX files, written in C++
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere
Implementation of "Tree of Thoughts"
Sequence-to-sequence framework, focused on Neural Machine Translation
Guide to deploying deep-learning inference networks
Toolkit for allowing inference and serving with MXNet in SageMaker