The easiest and laziest way to build multi-agent LLM applications
Port of Facebook's LLaMA model in C/C++
A high-throughput and memory-efficient inference and serving engine
Run serverless GPU workloads with fast cold starts on bare-metal
Fast inference engine for Transformer models
An RWKV management and startup tool: fully automated, only 8 MB
Protect and discover secrets using Gitleaks
Easy-to-use deep learning framework with 3 key features
Lightweight, standalone C++ inference engine for Google's Gemma models
A general-purpose probabilistic programming system
Run local LLMs like LLaMA, DeepSeek, Kokoro, etc. inside your browser
Build production-ready agentic workflows with natural language
PyTorch extensions for fast R&D prototyping and Kaggle farming
Unified Model Serving Framework
MNN is a blazing fast, lightweight deep learning framework
Gaussian processes in TensorFlow
Low-latency REST API for serving text embeddings
LLM training code for MosaicML foundation models
Tensor search for humans
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Set of comprehensive computer vision & machine intelligence libraries
Deep learning API and server in C++14 with support for Caffe and PyTorch
Images to inference with no labeling
A real time inference engine for temporal logical specifications
High quality, fast, modular reference implementation of SSD in PyTorch