Integrate cutting-edge LLM technology quickly and easily into your app
Efficient Triton Kernels for LLM Training
Research project. A memory solution for users, teams, and applications
Real-time face swap and one-click video deepfake
FlashInfer: Kernel Library for LLM Serving
TT-NN operator library, and TT-Metalium low-level kernel programming
Secure, kernel-enforced sandbox CLI and SDKs for AI agents
Burn is a new comprehensive dynamic Deep Learning Framework
An RWKV management and startup tool, fully automated, only 8 MB
Open source solution that can meet the requirements of workloads
Clean and efficient FP8 GEMM kernels with fine-grained scaling
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
Training neural networks on Apple Neural Engine via APIs
An experimental version of DeepSeek model
The Compute Library is a set of computer vision and machine learning
Open Source OCR Engine
FlashMLA: Efficient Multi-head Latent Attention Kernels
Tool that provides interactive visualizations for large embeddings
C++ library for high-performance inference on NVIDIA GPUs
A Powerful Native Multimodal Model for Image Generation
A Simple and Universal Swarm Intelligence Engine
Deep and Machine Learning for Microscopy
GUI for a Vocal Remover that uses Deep Neural Networks
Library for efficiently connecting and optimizing teams of AI agents
How to optimize some algorithms in CUDA