Port of Facebook's LLaMA model in C/C++
Run Local LLMs on Any Device. Open-source
A high-throughput and memory-efficient inference and serving engine
The easiest and laziest way to build multi-agent LLM applications
Deep learning optimization library: makes distributed training easy
Bayesian inference with probabilistic programming
Easy-to-use deep learning framework with 3 key features
Unified Model Serving Framework
Data manipulation and transformation for audio signal processing
Uncover insights, surface problems, monitor, and fine-tune your LLM
An easy-to-use LLM quantization package with user-friendly APIs
An RWKV management and startup tool; fully automated, only 8 MB
Large Language Model Text Generation Inference
Open-Source AI Camera. Empower any camera/CCTV
Lightweight, standalone C++ inference engine for Google's Gemma models
Probabilistic reasoning and statistical analysis in TensorFlow
An Open-Source Programming Framework for Agentic AI
Run local LLMs like Llama, DeepSeek, and Kokoro inside your browser
PyTorch library of curated Transformer models and their components
Low-latency REST API for serving text-embeddings
A library for accelerating Transformer models on NVIDIA GPUs
A GPU-accelerated library containing highly optimized building blocks
State-of-the-art diffusion models for image and audio generation
Pytorch domain library for recommendation systems
PyTorch extensions for fast R&D prototyping and Kaggle farming