TensorRT-LLM provides users with an easy-to-use Python API
Lemonade helps users run local LLMs with the highest performance
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
Bringing large language models and chat to web browsers
OpenVINO™ Toolkit repository
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Clone a voice in 5 seconds to generate arbitrary speech in real-time
GPU stress test and OpenGL/Vulkan graphics benchmark for Windows/Linux
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
Bringing the Unsloth experience to Mac users via Apple's MLX framework
Public CI, Docker images for popular JAX libraries
Numerical differential equation solvers in JAX
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
Expert Parallelism Load Balancer
Open-Source Low-Latency Accelerated Linux WebRTC HTML5 Remote Desktop
GPU benchmark testing graphics performance with realistic 3D scenes
powerMAX is a CPU and GPU burn-in test
Easy-to-use deep learning framework with 3 key features
A simple, performant, and scalable JAX LLM
Probabilistic reasoning and statistical analysis in TensorFlow
Multi-lingual large voice generation model, providing inference capabilities
Software that uses AI to perform real-time voice conversion
An engine-agnostic deep learning framework in Java
Run LLMs locally on Cloud Workstations
Hardware info for Linux: portable AppImage + benchmark