Research project. A memory solution for users, teams, and applications
Efficient Triton Kernels for LLM Training
Integrate cutting-edge LLM technology quickly and easily into your app
Burn is a new comprehensive dynamic deep learning framework
TT-NN operator library, and TT-Metalium low-level kernel programming
An RWKV management and startup tool, fully automated, only 8MB
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
Training neural networks on Apple Neural Engine via APIs
Open source solution that can meet the requirements of workloads
An experimental version of DeepSeek model
FlashMLA: Efficient Multi-head Latent Attention Kernels
C++ library for high-performance inference on NVIDIA GPUs
A Powerful Native Multimodal Model for Image Generation
The Compute Library is a set of computer vision and machine learning
Toolkit for making machine learning and data analysis applications
Library for efficiently connecting and optimizing teams of AI agents
Deepnote is a drop-in replacement for Jupyter
AI memory OS for LLM and Agent systems
Tool that provides interactive visualizations for large embeddings
oneAPI Deep Neural Network Library (oneDNN)
How to optimize some algorithms in CUDA
Deep and Machine Learning for Microscopy
The easiest way to use Ollama in .NET
Geometric deep learning extension library for PyTorch
Clean and efficient FP8 GEMM kernels with fine-grained scaling