Self-contained Machine Learning and Natural Language Processing library
PyTorch extensions for fast R&D prototyping and Kaggle farming
Probabilistic reasoning and statistical analysis in TensorFlow
A GPU-accelerated library containing highly optimized building blocks
Powering Amazon's custom machine learning chips
Set of comprehensive computer vision & machine intelligence libraries
An innovative library for efficient LLM inference
LLM.swift is a simple and readable library for interacting with large language models locally
Build your chatbot within minutes on your favorite device
Connect home devices into a powerful cluster to accelerate LLM inference
20+ high-performance LLMs with recipes to pretrain and finetune at scale
Low-latency REST API for serving text-embeddings
A library for accelerating Transformer models on NVIDIA GPUs
Turn your existing data infrastructure into a feature store
A high-performance ML model serving framework that offers dynamic batching
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
Bring the notion of Model-as-a-Service to life
PyTorch domain library for recommendation systems
A toolkit for optimizing Keras & TensorFlow ML models for deployment
Serve machine learning models within a Docker container
Run local LLMs like Llama, DeepSeek, Kokoro, etc. inside your browser
Standardized Serverless ML Inference Platform on Kubernetes
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere
Framework for Accelerating LLM Generation with Multiple Decoding Heads
Run 100B+ language models at home, BitTorrent-style