Making large AI models cheaper, faster and more accessible
The official repo of Qwen chat & pretrained large language model
A high-throughput and memory-efficient inference and serving engine
The leading agent orchestration platform for Claude
AI-powered penetration testing assistant using local LLM on Linux
ChatGPT interface with better UI
Framework for building neural networks
Let's make video diffusion practical
Text and image to video generation: CogVideoX and CogVideo
Deep learning optimization library making distributed training easy
Python package built to ease deep learning on graphs
A Next-Generation Training Engine Built for Ultra-Large MoE Models
An implementation of a deep learning recommendation model (DLRM)
Supercharge Your LLM with the Fastest KV Cache Layer
Low-code framework for building custom LLMs, neural networks
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Advancing Open-source World Models
Omnilingual ASR: Open-Source Multilingual Speech Recognition
Scalable and user-friendly neural forecasting algorithms
MobileLLM: Optimizing Sub-billion Parameter Language Models
TensorRT LLM provides users with an easy-to-use Python API
Enterprise multi-agent orchestration framework for scalable AI apps
High-performance inference framework for large language models
Deep learning optimization library: makes distributed training easy
Open platform for training, serving, and evaluating language models