TokenSpeed is a speed-of-light LLM inference engine
A nearly live implementation of OpenAI's Whisper
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
Ultralytics YOLO
Mooncake is the serving platform for Kimi
CS2, Valorant, Fortnite, APEX, every game
A computer vision framework to create and deploy apps in minutes
Transformer-related optimization, including BERT and GPT
Efficient 13B MoE language model with long context and reasoning modes
High-performance MoE model with MLA, MTP, and multilingual reasoning