Port of OpenAI's Whisper model in C/C++
Port of Facebook's LLaMA model in C/C++
Run Local LLMs on Any Device. Open-source
User-friendly AI Interface
ONNX Runtime: cross-platform, high performance ML inferencing
OpenVINO™ Toolkit repository
High-performance neural network inference framework for mobile
Everything you need to build state-of-the-art foundation models
The free, Open Source alternative to OpenAI, Claude and others
A high-throughput and memory-efficient inference and serving engine
Open-Source AI Camera. Empower any camera/CCTV
Protect and discover secrets using Gitleaks
Uncover insights, surface problems, monitor, and fine-tune your LLM
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
An RWKV management and startup tool, full automation, only 8 MB
PArallel Distributed Deep LEarning: Machine Learning Framework
A toolkit to optimize ML models for deployment for Keras & TensorFlow
A library for accelerating Transformer models on NVIDIA GPUs
Fast inference engine for Transformer models
Training and deploying machine learning models on Amazon SageMaker
GPU environment management and cluster orchestration
LLM.swift is a simple and readable library
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Run serverless GPU workloads with fast cold starts on bare-metal
Operating LLMs in production