Port of Facebook's LLaMA model in C/C++
A scalable inference server for models optimized with OpenVINO
The free, Open Source alternative to OpenAI, Claude and others
Simplifies the local serving of AI models from any source
Unofficial Go bindings for the Hugging Face Inference API
AIMET is a library that provides advanced quantization and compression techniques
Integrate, train and manage any AI models and APIs with your database
C#/.NET binding of llama.cpp, including LLaMa/GPT model inference
Superduper: Integrate AI models and machine learning workflows
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
User-friendly AI Interface
Open standard for machine learning interoperability
Phi-3.5 for Mac: Locally-run Vision and Language Models
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Neural Network Compression Framework for enhanced OpenVINO
Private Open AI on Kubernetes
Swift async text-to-image for SwiftUI apps using OpenAI
State-of-the-art diffusion models for image and audio generation
Run Local LLMs on Any Device. Open-source
Official inference library for Mistral models
OpenVINO™ Toolkit repository
An innovative library for efficient LLM inference
Sparsity-aware deep learning inference runtime for CPUs
Operating LLMs in production
On-device AI across mobile, embedded and edge for PyTorch