Python inference and LoRA trainer package for the LTX-2 audio–video model
Pure C inference for the Flux 2 image generation model
FlashMLA: Efficient Multi-head Latent Attention Kernels
ChatGLM-6B: An Open Bilingual Dialogue Language Model
The official repo of Qwen chat & pretrained large language model
Reference PyTorch implementation and models for DINOv3
RGBD video generation model conditioned on camera input
Capable of understanding text, audio, vision, and video
MiniMax-M2, a model built for Max coding & agentic workflows
PyTorch implementation of JiT
Chat & pretrained large vision language model
Analyze computation-communication overlap in V3/R1
Open-source, high-performance Mixture-of-Experts large language model
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Chinese LLaMA & Alpaca large language model + local CPU/GPU training
llama.go is like llama.cpp in pure Golang
Lightweight multimodal translation model for 55 languages
Efficient MoE model for million-token reasoning and coding
ClinicalBERT model trained on MIMIC notes for clinical NLP tasks
Efficient 14B multimodal instruct model with edge deployment and FP8
Grok-2.5, a large-scale xAI model for local inference with SGLang
Tiny pre-trained IBM model for multivariate time series forecasting
Quantized 675B multimodal instruct model optimized for NVFP4
Ultra-efficient 3B multimodal instruct model built for edge deployment