TT-NN operator library, and TT-Metalium low-level kernel programming
Towards self-verifiable mathematical reasoning
Easily compute CLIP embeddings and build a CLIP retrieval system
Making RAG Simpler with Small and Open-Sourced Language Models
"Big Model" trains a visual multimodal VLM with 26M parameters
Netflix’s Workflow Orchestrator
A theoretical reconstruction of the Claude Mythos architecture
FlashMLA: Efficient Multi-head Latent Attention Kernels
The open source AI research agent
From-scratch PyTorch implementation of Google's TurboQuant
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Open-weight, large-scale hybrid-attention reasoning model
The open-source managed agents platform
DeepSeek Coder: Let the Code Write Itself
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Open-source LLM load balancer and serving platform for hosting LLMs
Autonomous AI agent that you can configure and build
Drop-in replacement for standard residual connections in Transformers
Unsupervised Learning for Image Registration
Training neural networks on Apple Neural Engine via APIs
Official inference framework for 1-bit LLMs
PyTorch3D is FAIR's library of reusable components for deep learning
Multi-user UI tool for managing and running Stable Diffusion workflows
DeepSeek 4 Flash local inference engine for Metal
Confidential Compute Open Network, Decentralized AI Inference on TON