Python inference and LoRA trainer package for the LTX-2 audio–video model
Pure C inference for the Flux 2 image generation model
FlashMLA: Efficient Multi-head Latent Attention Kernels
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Reference PyTorch implementation and models for DINOv3
Hackable and optimized Transformers building blocks
RGBD video generation model conditioned on camera input
Capable of understanding text, audio, vision, and video
MiniMax-M2, a model built for Max coding & agentic workflows
PyTorch implementation of JiT
Chat & pretrained large vision language model
Analyze computation-communication overlap in V3/R1
AI Suite for upscaling, interpolating & restoring images/videos
Open-source, high-performance Mixture-of-Experts large language model
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Chinese LLaMA & Alpaca large language model + local CPU/GPU training
llama.go is like llama.cpp in pure Golang
Lightweight multimodal translation model for 55 languages
Efficient MoE model for million-token reasoning and coding
ClinicalBERT model trained on MIMIC notes for clinical NLP tasks
Efficient 14B multimodal instruct model with edge deployment and FP8
Grok-2.5: large-scale xAI model for local inference with SGLang
Tiny pre-trained IBM model for multivariate time series forecasting
Quantized 675B multimodal instruct model optimized for NVFP4
Ultra-efficient 3B multimodal instruct model built for edge deployment