A Pythonic framework to simplify AI service building
Port of Facebook's LLaMA model in C/C++
Run Local LLMs on Any Device. Open-source
Open-Source AI Camera. Empower any camera/CCTV
State-of-the-art diffusion models for image and audio generation
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Bring the notion of Model-as-a-Service to life
A library to communicate with ChatGPT, Claude, Copilot, Gemini
Superduper: Integrate AI models and machine learning workflows
Integrate, train and manage any AI models and APIs with your database
The Triton Inference Server provides an optimized cloud and edge inferencing solution
Official inference library for Mistral models
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Replace OpenAI GPT with another LLM in your app
Tensor search for humans
AI interface for tinkerers (Ollama, Haystack RAG, Python)
LLM training code for MosaicML foundation models
Database system for building simpler and faster AI-powered applications
Phi-3.5 for Mac: Locally-run Vision and Language Models
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Sparsity-aware deep learning inference runtime for CPUs
Operating LLMs in production
Build your chatbot within minutes on your favorite device
20+ high-performance LLMs with recipes to pretrain, finetune at scale